Optimal transport and Wasserstein distances for causal models

In this paper, we introduce a variant of optimal transport adapted to the causal structure given by an underlying directed graph $G$. Different graph structures lead to different specifications of the optimal transport problem. For instance, a fully connected graph yields standard optimal transport, a linear graph structure corresponds to causal optimal transport between the distributions of two discrete-time stochastic processes, and an empty graph leads to a notion of optimal transport related to CO-OT, Gromov-Wasserstein distances and factored OT. We derive different characterizations of $G$-causal transport plans and introduce Wasserstein distances between causal models that respect the underlying graph structure. We show that average treatment effects are continuous with respect to $G$-causal Wasserstein distances and small perturbations of structural causal models lead to small deviations in $G$-causal Wasserstein distance. We also introduce an interpolation between causal models based on $G$-causal Wasserstein distance and compare it to standard Wasserstein interpolation.