This blog was initially published in Forbes Tech Council.
Data-centric AI is the new frontier in AI: the models themselves stay fixed while tools, techniques and engineering practices improve data quality. As Andrew Ng puts it, “Data-centric AI is the discipline of systematically engineering data to build an AI system.”
AIOps, on the other hand, is the applied AI and machine learning discipline that tackles the challenges of operationalizing, and gleaning insights from, the distributed hybrid applications, platforms and IT stacks that businesses run. AIOps data types (metrics, events, logs and traces) are streaming-, event- and alert-based, so standard ELT (extract, load, transform) or ETL (extract, transform, load) techniques cannot be used for data preparation in these use cases. AIOps needs to process all of these data types to improve data quality and derive unique insights, and that requires real-time observability pipelines.
How can these pipelines integrate, ingest and route data across edge, data center and multicloud data sources? How can you enable the automation of streaming these data types? How can you glean insights from and democratize access to data dashboards, APIs and AI/ML pipelines? How can you infer the root cause of and manage incidents and anomalies using these pipelines?
Data-centric AIOps can solve these challenges. Let’s take a look at a few ways it does so.
- Data Integration – Data-centric AIOps can leverage low-code/no-code bots to connect to disparate monitoring, observability and APM (application performance management) sources and ITSM (IT service management) sinks, pulling insights from these datasets. There is a growing user community that develops these bots for new sources and sinks as needed.
- Data Ingestion, Routing and Compliance – Streaming data is then normalized, deduplicated, redacted and ingested into the platform over a low-latency data fabric that spans the edge, the core and the multicloud. A full-fidelity copy is routed to low-cost archival storage and retained for compliance, while other processed copies are routed to the right stakeholders.
- Enrichment and Contextualization – The enrichment stage is crucial. Automated pipelines use real-time topology discovery and attributes from element managers to build data models called full-stack dependency maps, which provide the context for correlation. External feeds such as geo-IP lookups, common vulnerabilities and exposures (CVEs) or threat intelligence platforms (TIPs) can also enrich the data.
- Correlation and Continuous AI/ML – Several correlation pipelines reduce and suppress noise and infer the root cause through an incident recommendation engine. They draw on techniques such as continuous ML; unsupervised, federated and reinforcement learning; clustering; time series-based regression; and full-stack topology and attribute-based metadata. The recommendation engine uses all data types, including alerts, metrics and traces, to auto-remediate problems, create an incident, prioritize actions and route to the right L0 automation stakeholders.
- Anomaly Detection at Scale – Several regression-based anomaly detection pipelines are used for business, application and IT metrics and logs, supporting dynamic baselining, anomaly detection, forecasting and prediction.
- Observability Dashboards – Self-service, dynamic dashboards provide insights, business value and economic impact across observability, AIOps and automation domains.
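To make a couple of these stages more concrete, here is a minimal Python sketch of the ingestion steps (normalize, redact, deduplicate) and of dynamic baselining for a single metric. It is an illustration under stated assumptions, not how any particular AIOps platform implements them: the event schema, the field names, the email-only redaction rule and the `RollingBaseline` class are all hypothetical, and a simple rolling z-score stands in for the regression-based pipelines described above.

```python
import hashlib
import re
from collections import deque
from statistics import mean, stdev

# Hypothetical redaction rule: mask anything that looks like an email address.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
SCHEMA = ("source", "host", "severity", "message")  # assumed common schema


def normalize(raw):
    """Map a source-specific event onto the assumed common schema."""
    return {
        "source": str(raw.get("source", "unknown")).lower(),
        "host": str(raw.get("host") or raw.get("hostname") or "").lower(),
        "severity": str(raw.get("severity", "info")).lower(),
        "message": str(raw.get("message", "")).strip(),
    }


def redact(event):
    """Return a copy of the event with email addresses masked."""
    out = dict(event)
    out["message"] = EMAIL_RE.sub("[REDACTED]", event["message"])
    return out


def fingerprint(event):
    """Stable hash of the normalized fields, used to spot duplicates."""
    key = "|".join(event[field] for field in SCHEMA)
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def ingest(raw_events):
    """Normalize, redact and deduplicate a stream of raw events."""
    seen = set()
    for raw in raw_events:
        event = redact(normalize(raw))
        fp = fingerprint(event)
        if fp in seen:
            continue  # drop exact duplicates after normalization
        seen.add(fp)
        yield event


class RollingBaseline:
    """Dynamic baseline: flag values far from the recent rolling mean."""

    def __init__(self, window=30, threshold=3.0, min_points=5):
        self.values = deque(maxlen=window)
        self.threshold = threshold    # allowed deviation, in standard deviations
        self.min_points = min_points  # history needed before flagging (>= 2)

    def is_anomaly(self, value):
        anomalous = False
        if len(self.values) >= self.min_points:
            mu = mean(self.values)
            sigma = stdev(self.values)
            # A perfectly flat history (sigma == 0) never flags in this sketch.
            anomalous = sigma > 0 and abs(value - mu) > self.threshold * sigma
        self.values.append(value)
        return anomalous
```

In use, `ingest` would wrap whatever iterator delivers raw events from a monitoring source, and one `RollingBaseline` would be kept per metric series. A production pipeline would also need bounded deduplication state, richer redaction rules and seasonality-aware baselines, all of which this sketch leaves out.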
Use Cases
By using the capabilities above, data-centric AIOps makes it possible to operationalize diverse use cases and provides analytics and insights for several operational personas, including the following.
- BusOps: Glean economic impact insights with business value dashboards across full-stack environments.
- ITOps: Monitor KPIs and their impact on business functions.
- FinOps: Optimize resource utilization and economic benefits for private and public clouds.
- DevOps, DevSecOps and GitOps: Operationalize CI/CD pipelines and infrastructure-as-code deployments.
- CloudOps: Operationalize hybrid cloud management across service meshes.
- ServiceOps: Manage incidents by adding context, prioritizing and routing based on NLP.
Transforming AIOps
Data-centric AIOps is an innovative approach to AIOps. How can you start applying this approach to processes in your business? An RDAF (robotic data automation fabric) platform is one example of a way to accomplish data-centric AIOps. Businesses should also evaluate how they can automate their data streams, use no-code/low-code pipelines, and connect to and perform in-place analytics across disparate data sources, no matter where they reside. Unless these challenges are resolved, data-centric AIOps will remain a distant goal.