
Skewness-Robust Causal Discovery in Location-Scale Noise Models

Main:7 Pages
11 Figures
Bibliography:4 Pages
8 Tables
Appendix:6 Pages
Abstract

To distinguish Markov equivalent graphs in causal discovery, it is necessary to restrict the structural causal model. Crucially, we need to be able to distinguish cause $X$ from effect $Y$ in bivariate models, that is, distinguish the two graphs $X \to Y$ and $Y \to X$. Location-scale noise models (LSNMs), in which the effect $Y$ is modeled based on the cause $X$ as $Y = f(X) + g(X)N$, form a flexible class of models that is general and identifiable in most cases. Estimating these models for arbitrary noise terms $N$, however, is challenging. Therefore, practical estimators are typically restricted to symmetric distributions, such as the normal distribution. As we showcase in this paper, when $N$ is a skewed random variable, which is likely in real-world domains, the reliability of these approaches decreases. To address this limitation, we propose SkewD, a likelihood-based algorithm for bivariate causal discovery under LSNMs with skewed noise distributions. SkewD extends the usual normal-distribution framework to the skew-normal setting, enabling reliable inference under both symmetric and skewed noise. For parameter estimation, we employ a combination of a heuristic search and an expectation conditional maximization algorithm. We evaluate SkewD on novel synthetically generated datasets with skewed noise as well as on established benchmark datasets. Throughout our experiments, SkewD performs strongly and, in comparison to prior work, remains robust under high skewness.
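To make the model class concrete, the following is a minimal sketch of sampling from an LSNM $Y = f(X) + g(X)N$ with skew-normal noise. It is not the SkewD algorithm and does not reproduce the paper's data generation. The function names, the choices $f(x) = \tanh(x)$ and $g(x) = 0.5 + 0.25x^2$, and the standard normal cause are all assumptions made for illustration. The noise is drawn with the standard stochastic representation of a skew-normal variable, $N = \delta |U_0| + \sqrt{1 - \delta^2}\, U_1$ with $\delta = \alpha / \sqrt{1 + \alpha^2}$ and independent standard normal $U_0, U_1$.

```python
import math
import random


def sample_skew_normal(alpha, rng):
    """Draw one standard skew-normal variate with shape parameter alpha.

    Uses the representation N = delta*|U0| + sqrt(1 - delta^2)*U1,
    where U0 and U1 are independent standard normal draws.
    alpha = 0 gives the ordinary standard normal distribution.
    """
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    u0 = rng.gauss(0.0, 1.0)
    u1 = rng.gauss(0.0, 1.0)
    return delta * abs(u0) + math.sqrt(1.0 - delta * delta) * u1


def sample_lsnm(n, alpha, seed=0):
    """Sample n pairs (x, y) from Y = f(X) + g(X) * N with skew-normal N.

    f and g below are hypothetical choices for illustration only.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)      # cause X (assumed standard normal)
        f = math.tanh(x)             # assumed location function f(X)
        g = 0.5 + 0.25 * x * x       # assumed strictly positive scale g(X)
        noise = sample_skew_normal(alpha, rng)
        xs.append(x)
        ys.append(f + g * noise)
    return xs, ys


def sample_skewness(values):
    """Moment-based sample skewness: m3 / m2^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5
```

Calling `sample_lsnm(1000, alpha=5.0)` yields data whose noise is clearly right-skewed, while `alpha=0.0` recovers the symmetric Gaussian case that estimators restricted to normal noise assume. Comparing `sample_skewness` on noise drawn with different `alpha` values gives a quick sense of how far a dataset departs from that assumption.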
