The BigDataStack Solution
The BigDataStack Software Component Catalog
BigDataStack aims to provide a complete infrastructure management system that bases its management and deployment decisions on data from current and past application and infrastructure deployments. The system is delivered as a full “stack” that serves the needs of both data operations and data-intensive applications.
BigDataStack introduces a data-driven paradigm in which computing, storage and networking resources are managed efficiently and optimized for data operations and data-intensive applications.
The system will base all infrastructure management decisions on the data aspects and the data operations governing and affecting the interdependencies between storage, compute and network resources.
Data as a Service promotes automation and quality and ensures that the provided data are meaningful, of value and fit-for-purpose through approaches for data cleaning, modelling, interoperability, and efficient storage.
Seamless data analytics will be realized across multiple data stores and locations, together with advanced modelling techniques that define flexible schemas usable across processing frameworks.
The workbench supports data-focused application analysis and dimensioning by predicting the required data services, their interdependencies with the application micro-services and the required underlying resources.
By identifying an application's data-related properties and data needs, it enables the provision of specific performance and quality guarantees.
The process modelling framework allows the flexible, functionality-based modelling of processes, which are mapped automatically to concrete technical-level process mining analytics.
The analytics outcomes provide feedback to business analysts, with specific recommendations for overall process optimization and adaptation.
The data toolkit enables openness and extensibility. It gives data scientists and practitioners an environment in which to ingest their data analytics functions using a declarative paradigm, and to specify preferences and constraints that the infrastructure management system then uses for resource and data management.
Each block is in turn made up of a cluster of software components. Below you can find the technical description, license analysis, main competitors in the field and the open-source code listed on GitHub and GitLab.
Data as a Service
The Process Mapping component provides automatic algorithm selection for Meta Learning (ML) tasks. The component follows an ML approach, so its performance improves as it is applied to a growing number of datasets. Reduces time and effort on fin
GPL v3 License
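The idea of selecting an algorithm from experience with earlier datasets can be pictured with a small nearest-neighbour sketch. This is only an illustration under stated assumptions, not the Process Mapping component's actual method: the meta-features, the algorithm names and the `recommend` and `record` helpers are all hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two meta-feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recommend(meta_features, past_runs):
    """Return the algorithm that worked best on the most similar past dataset."""
    _, algorithm = min(past_runs, key=lambda run: distance(run[0], meta_features))
    return algorithm

def record(meta_features, best_algorithm, past_runs):
    """Store a new observation so that later recommendations can use it."""
    past_runs.append((tuple(meta_features), best_algorithm))

# Hypothetical knowledge base. Each entry pairs meta-features
# (log10 of row count, number of features, minority class ratio)
# with the algorithm that performed best on that dataset.
# Features are left unscaled for brevity; a real system would normalise them.
past_runs = [
    ((3.0, 10, 0.5), "logistic_regression"),
    ((5.0, 200, 0.5), "gradient_boosting"),
    ((4.0, 20, 0.1), "random_forest"),
]

suggestion = recommend((3.1, 12, 0.5), past_runs)
```

Every call to `record` enlarges the knowledge base, which is the sense in which such an approach can improve as it sees more datasets.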
Data-driven Infrastructure Management
The Dynamic Orchestrator triggers the redeployment of applications at runtime to ensure they comply with their Service Level Objectives (SLOs). It uses a Reinforcement Learning-based approach which can operate efficiently, with a light overhead for the s
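To make the idea of learning redeployment decisions against an SLO more concrete, here is a minimal tabular Q-learning sketch. It is an assumption-laden toy, not the Dynamic Orchestrator's algorithm: the latency model, the reward shape, the 100 ms SLO and every function name are hypothetical, and Q-learning is simply one well-known technique from this family.

```python
import random

ACTIONS = (-1, 0, 1)  # remove a replica, keep the deployment, add a replica

def simulated_latency_ms(replicas, load):
    """Hypothetical environment: latency grows with load per replica."""
    return 50.0 * load / replicas

def reward(latency_ms, replicas, slo_ms=100.0):
    """Penalise SLO violations heavily and resource usage lightly."""
    return (-10.0 if latency_ms > slo_ms else 1.0) - 0.1 * replicas

def train(episodes=2000, load=6, max_replicas=8,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn Q-values for (replica count, action) pairs by trial and error."""
    rng = random.Random(seed)
    q = {(r, a): 0.0 for r in range(1, max_replicas + 1) for a in ACTIONS}
    replicas = 1
    for _ in range(episodes):
        # Epsilon-greedy choice: mostly exploit, sometimes explore.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(replicas, a)])
        # Apply the action, clamped to the allowed replica range.
        nxt = min(max_replicas, max(1, replicas + action))
        r = reward(simulated_latency_ms(nxt, load), nxt)
        # Standard Q-learning update towards the bootstrapped target.
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        q[(replicas, action)] += alpha * (r + gamma * best_next - q[(replicas, action)])
        replicas = nxt
    return q

def greedy_policy(q, replicas):
    """Pick the action with the highest learned value for this state."""
    return max(ACTIONS, key=lambda a: q[(replicas, a)])
```

In this toy setting the reward trades SLO compliance against the number of replicas, so the learned policy is pushed towards the smallest deployment that still meets the latency target.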
Network Policy Support at Kuryr
By default, all Kubernetes pods accept traffic from any source. Network Policy defines how groups of pods are allowed to communicate with each other and other network endpoints. It also suggests a design for supporting Kubernetes “Network policy” in Kuryr
Apache License 2.0
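The default-allow behaviour and the effect of a Network Policy can be sketched as follows. The manifest uses the standard `networking.k8s.io/v1` NetworkPolicy fields, written as a Python dict instead of YAML. The policy name, the `app` labels and the `ingress_allowed` evaluator are hypothetical, and the evaluator is deliberately simplified: it assumes a single namespace and ignores ports, `namespaceSelector`, `ipBlock` and `matchExpressions`. It says nothing about how Kuryr implements enforcement.

```python
# A NetworkPolicy manifest expressed as a Python dict (same fields as the YAML form).
POLICY = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "db-allow-backend"},  # hypothetical name
    "spec": {
        "podSelector": {"matchLabels": {"app": "db"}},
        "policyTypes": ["Ingress"],
        "ingress": [
            {"from": [{"podSelector": {"matchLabels": {"app": "backend"}}}]}
        ],
    },
}

def matches(selector, labels):
    """True if every matchLabels entry is present in the pod's labels.

    An empty selector matches every pod, as in Kubernetes.
    """
    return all(labels.get(k) == v
               for k, v in selector.get("matchLabels", {}).items())

def ingress_allowed(policies, src_labels, dst_labels):
    """Simplified ingress check between two pods in the same namespace."""
    selecting = [p for p in policies
                 if matches(p["spec"]["podSelector"], dst_labels)]
    if not selecting:
        return True  # no policy selects the pod: all traffic is accepted
    for policy in selecting:
        for rule in policy["spec"].get("ingress", []):
            peers = rule.get("from")
            if not peers:
                return True  # a rule without "from" allows every source
            if any("podSelector" in peer
                   and matches(peer["podSelector"], src_labels)
                   for peer in peers):
                return True
    return False  # the pod is selected, but no rule admits this source
```

With this policy in place, a pod labelled `app: backend` can reach the `app: db` pod, a pod labelled `app: frontend` cannot, and pods that no policy selects keep accepting traffic from any source.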
Process modelling and optimisation framework for business analysts
Application Dimensioning Workbench
The Dimensioning Workbench will benchmark the target service via easily configured and automated parameter sweep tests, gather the necessary performance data, and train prediction models that are able to regress for cases that have not been encountered before
Apache License 2.0
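The sweep-then-predict workflow can be illustrated with a short sketch. Everything in it is an assumption made for illustration: `benchmark` is a stand-in for a real load test, the single swept parameter (number of concurrent clients) is hypothetical, and a least-squares line is just the simplest possible regression model, not necessarily what the workbench trains.

```python
def benchmark(clients):
    """Hypothetical stand-in for a real load test: returns latency in ms."""
    return 20.0 + 1.5 * clients

def sweep(values):
    """Run the benchmark once per parameter value and collect the samples."""
    return [(v, benchmark(v)) for v in values]

def fit_line(samples):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Sweep a few configurations, then predict one that was never measured.
samples = sweep([10, 20, 40, 80])
slope, intercept = fit_line(samples)
predicted = slope * 60 + intercept  # estimated latency for 60 clients
```

The value of the approach lies in the last line: the model estimates performance for a configuration that the sweep did not cover, which avoids benchmarking every possible setting.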