Masked Gradient-Based Causal Structure Learning

SDM (SDM), 2019

18 October 2019

Abstract

This paper studies the problem of learning causal structures from observational data. We reformulate the Structural Equation Model (SEM) in an augmented form with a binary graph adjacency matrix and show that, if the original SEM is identifiable, then this augmented form can be identified up to super-graphs of the true causal graph under mild conditions. We further provide three methods for removing spurious edges to recover the true graph. We then use the augmented form to develop a masked structure learning method that can be trained efficiently with gradient-based optimization, by leveraging a smooth characterization of acyclicity and the Gumbel-Softmax approach to approximate the binary adjacency matrix. The learned entries are typically close to zero or one and can be easily thresholded to identify the edges. We conduct experiments on synthetic and real datasets to validate the effectiveness of the proposed method and show that it can readily incorporate different smooth functions to model causal relationships.
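The two ingredients named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact formulas, so everything below is an assumption: the acyclicity measure is taken to be a truncated trace of the matrix exponential, one commonly used smooth characterization, and the binary relaxation is the two-class form of Gumbel-Softmax (often called Gumbel-sigmoid). The names `acyclicity_penalty`, `gumbel_sigmoid`, the logits, and the temperature `tau` are hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    d = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


def acyclicity_penalty(adj):
    """Assumed smooth acyclicity measure: sum of tr(A^k) / k! for k = 1..d.

    This is tr(exp(A)) - d truncated after d terms. For a nonnegative
    matrix A it is zero exactly when the weighted graph has no cycle.
    """
    d = len(adj)
    power = [row[:] for row in adj]  # holds A^k, starting at k = 1
    total = 0.0
    for k in range(1, d + 1):
        total += sum(power[i][i] for i in range(d)) / math.factorial(k)
        power = matmul(power, adj)
    return total


def gumbel_sigmoid(logit, tau, rng):
    """Relaxed Bernoulli sample in [0, 1] (binary Gumbel-Softmax).

    The difference of two Gumbel variables is logistic noise, so one
    uniform draw per entry is enough. Smaller tau pushes samples
    toward 0 or 1.
    """
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    z = (logit + math.log(u) - math.log(1.0 - u)) / tau
    if z >= 0:  # numerically stable sigmoid
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


# Hypothetical edge logits for a 3-node chain 0 -> 1 -> 2.
rng = random.Random(0)
logits = [[-6.0, 6.0, -6.0],
          [-6.0, -6.0, 6.0],
          [-6.0, -6.0, -6.0]]
soft_mask = [[0.0 if i == j else gumbel_sigmoid(logits[i][j], 0.2, rng)
              for j in range(3)] for i in range(3)]
hard_mask = [[1 if v > 0.5 else 0 for v in row] for row in soft_mask]
print(acyclicity_penalty(soft_mask), hard_mask)
```

In a gradient-based setup, a penalty of this kind would be added to the data-fitting loss so that optimization is steered toward acyclic masks, and the soft mask would be thresholded only after training.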

