Causal Discovery over High-Dimensional Structured Hypothesis Spaces with Causal Graph Partitioning

10 June 2024

Abstract

The aim in many sciences is to understand the mechanisms that underlie the observed distribution of variables, starting from a set of initial hypotheses. Causal discovery allows us to infer mechanisms as sets of cause-and-effect relationships in a generalized way -- without necessarily tailoring to a specific domain. Causal discovery algorithms search over a structured hypothesis space, defined by the set of directed acyclic graphs, to find the graph that best explains the data. For high-dimensional problems, however, this search becomes intractable, and scalable algorithms for causal discovery are needed to bridge the gap. In this paper, we define a novel causal graph partition that allows for divide-and-conquer causal discovery with theoretical guarantees. We leverage the idea of a superstructure -- a set of learned or existing candidate hypotheses -- to partition the search space. We prove under certain assumptions that learning with a causal graph partition always yields the Markov Equivalence Class of the true causal graph. We show our algorithm achieves comparable accuracy and a faster time to solution for biologically tuned synthetic networks and networks with up to $10^4$ variables. This makes our method applicable to gene regulatory network inference and other domains with high-dimensional structured hypothesis spaces.
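
As a rough illustration of why a divide-and-conquer scheme over a superstructure may need overlapping subsets, the sketch below treats the superstructure as an undirected graph of candidate edges and expands each part of a disjoint partition by its neighbors. This is a minimal sketch under stated assumptions, not the partition defined in the paper: the function names `expand_partition` and `covers_all_edges`, the adjacency-set representation, and the five-variable chain are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# Assumption: the superstructure is an undirected graph stored as adjacency sets.
Graph = Dict[str, Set[str]]


def expand_partition(superstructure: Graph, parts: List[Set[str]]) -> List[Set[str]]:
    """Add each part's superstructure neighbors, producing overlapping subsets."""
    expanded = []
    for part in parts:
        boundary: Set[str] = set()
        for node in part:
            # Neighbors of this node that lie outside the current part.
            boundary |= superstructure[node] - part
        expanded.append(part | boundary)
    return expanded


def covers_all_edges(superstructure: Graph, subsets: List[Set[str]]) -> bool:
    """Check that every candidate edge lies inside at least one subset."""
    for u, neighbors in superstructure.items():
        for v in neighbors:
            if not any(u in s and v in s for s in subsets):
                return False
    return True


# Hypothetical five-variable superstructure: the chain A - B - C - D - E.
superstructure: Graph = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
disjoint = [{"A", "B", "C"}, {"D", "E"}]
overlapping = expand_partition(superstructure, disjoint)
```

In this toy case the disjoint partition leaves the candidate edge between C and D outside every subset, so no local learner would ever examine it, whereas the expanded subsets contain every candidate edge. Learning on each subset and merging the results is the part the paper addresses and is not shown here.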

@article{shah2025_2406.06348,
  title={Causal Discovery over High-Dimensional Structured Hypothesis Spaces with Causal Graph Partitioning},
  author={Ashka Shah and Adela DePavia and Nathaniel Hudson and Ian Foster and Rick Stevens},
  journal={arXiv preprint arXiv:2406.06348},
  year={2025}
}
