Density Ratio-based Causal Discovery from Bivariate Continuous-Discrete Data
We address the problem of inferring the causal direction between a continuous variable and a discrete variable from observational data. For the continuous-to-discrete model, we adopt the threshold model used in prior work. For the discrete-to-continuous model, we consider two cases: (1) the conditional distributions of the continuous variable given different values of the discrete variable form a location-shift family, and (2) they are mixtures of generalized normal distributions with independently parameterized components. We establish identifiability of the causal direction through three theoretical results. First, we prove that under the continuous-to-discrete model, the density ratio of the continuous variable conditioned on different values of the discrete variable is monotonic. Second, we establish that under the discrete-to-continuous model with non-location-shift conditionals, monotonicity of the density ratio holds only on a set of Lebesgue measure zero in the parameter space. Third, we show that under the continuous-to-discrete model, conditional distributions that form a location-shift family require a precise coordination between the causal mechanism and the input distribution, which is non-generic under the principle of independent mechanisms. Together, these results imply that monotonicity of the density ratio characterizes the continuous-to-discrete direction, whereas non-monotonicity or location-shift conditionals characterize the discrete-to-continuous direction. Based on this, we propose Density Ratio-based Causal Discovery (DRCD), a method that determines the causal direction by testing for location-shift conditionals and for monotonicity of the estimated density ratio. Experiments on synthetic and real-world datasets demonstrate that DRCD outperforms existing methods.
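To make the monotonicity criterion concrete, the following is a minimal sketch of how one might check whether an estimated density ratio is monotonic for a binary discrete variable. It is not the authors' DRCD implementation: the Gaussian kernel density estimator, the rule-of-thumb bandwidth, the evaluation grid, the toy data-generating process, and the helper names (`gaussian_kde`, `rule_of_thumb_bandwidth`, `monotone_fraction`) are all assumptions made for illustration, and the location-shift test is omitted.

```python
import math
import random
import statistics


def gaussian_kde(sample, bandwidth):
    """Return a function that evaluates a Gaussian KDE of `sample` at a point."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth: 1.06 * std * n^(-1/5) (an assumed choice)."""
    return 1.06 * statistics.stdev(sample) * len(sample) ** (-0.2)


def monotone_fraction(x_given_0, x_given_1, grid):
    """Fraction of grid steps on which the estimated log density ratio
    log p(x | y=1) - log p(x | y=0) moves in its majority direction.
    A value of 1.0 means the estimate is monotonic on the grid."""
    kde0 = gaussian_kde(x_given_0, rule_of_thumb_bandwidth(x_given_0))
    kde1 = gaussian_kde(x_given_1, rule_of_thumb_bandwidth(x_given_1))
    log_ratio = [math.log(kde1(x)) - math.log(kde0(x)) for x in grid]
    steps = [b - a for a, b in zip(log_ratio, log_ratio[1:])]
    increasing = sum(1 for d in steps if d > 0)
    return max(increasing, len(steps) - increasing) / len(steps)


# Toy data (assumed for illustration): a continuous cause is thresholded,
# after adding noise, to produce a binary effect.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(4000)]
ys = [1 if x + rng.gauss(0.0, 1.0) > 0.0 else 0 for x in xs]

x_given_0 = [x for x, y in zip(xs, ys) if y == 0]
x_given_1 = [x for x, y in zip(xs, ys) if y == 1]

# Evaluate only where both conditional samples have reasonable support.
grid = [-0.8 + 0.08 * i for i in range(21)]
frac = monotone_fraction(x_given_0, x_given_1, grid)
print(f"fraction of steps in the majority direction: {frac:.2f}")
```

A fraction close to 1 is consistent with a monotonic density ratio, which the abstract associates with the continuous-to-discrete direction. A practical procedure would replace this heuristic fraction with a proper statistical test and would also test for location-shift conditionals.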