Root-cause analysis for time-series anomalies via spatiotemporal causal graphical modeling

Modern distributed cyber-physical systems encounter a large variety of anomalies and in many cases, they are vulnerable to catastrophic fault propagation scenarios due to strong connectivity among the sub-systems. In this regard, root-cause analysis becomes highly intractable due to complex fault propagation mechanisms in combination with diverse operating modes. This paper presents a new data-driven framework for root-cause analysis for addressing such issues. The framework is based on a spatiotemporal feature extraction scheme for multivariate time series built on the concept of symbolic dynamics for discovering and representing causal interactions among subsystems of a complex system. We propose sequential state switching () and artificial anomaly association () methods to implement root-cause analysis in an unsupervised and semi-supervised manner respectively. Synthetic data from cases with failed pattern(s) and anomalous node are simulated to validate the proposed approaches, then compared with the performance of vector autoregressive (VAR) model-based root-cause analysis. The results show that: (1) and approaches can obtain high accuracy in root-cause analysis and successfully handle multiple nominal operation modes, and (2) the proposed tool-chain is shown to be scalable while maintaining high accuracy.
View on arXiv