
Joint Distribution-Informed Shapley Values for Sparse Counterfactual Explanations

7 October 2024

Lei You

Yijun Bian

Lele Cao


Main: 10 Pages

10 Figures

5 Tables

Appendix: 13 Pages

Abstract

Counterfactual explanations (CE) aim to reveal how small input changes flip a model's prediction, yet many methods modify more features than necessary, reducing clarity and actionability. We introduce \emph{COLA}, a model- and generator-agnostic post-hoc framework that refines any given CE: it computes a coupling between the factual and counterfactual sets via optimal transport (OT) and uses that coupling to drive a Shapley-based attribution (\emph{$p$-SHAP}) that selects a minimal set of edits while preserving the target effect. Theoretically, we show that OT minimizes an upper bound on the $W_1$ divergence between factual and counterfactual outcomes and that, under mild conditions, refined counterfactuals are guaranteed not to move farther from the factuals than the originals. Empirically, across four datasets, twelve models, and five CE generators, COLA achieves the same target effects with only 26--45\% of the original feature edits. On a small-scale benchmark, COLA shows near-optimality.
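
The pipeline described in the abstract (couple factuals with counterfactuals, attribute the outcome change to individual features, then keep only the edits that matter) can be illustrated with a toy sketch. This is not the paper's COLA or $p$-SHAP implementation. It makes several simplifying assumptions: the two sets have equal size and uniform weights, so an optimal coupling can be found by brute-force search over permutations; Shapley values are computed exactly by enumerating feature subsets; and edits are applied greedily in order of attribution until a decision threshold is crossed. All function names, the linear `model`, and the data are hypothetical.

```python
from itertools import combinations, permutations
from math import factorial


def sq_dist(a, b):
    """Squared Euclidean distance, used here as the transport cost."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def optimal_matching(factuals, counterfactuals):
    """Brute-force OT for equal-size sets with uniform weights.

    In this special case an optimal coupling is a permutation, so we
    return match[i] = index of the counterfactual paired with factual i.
    """
    n = len(factuals)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(
            sq_dist(factuals[i], counterfactuals[p[i]]) for i in range(n)
        ),
    )
    return list(best)


def shapley_values(model, x, y):
    """Exact Shapley value of switching each feature of x to its value in y."""
    d = len(x)

    def value(subset):
        z = [y[j] if j in subset else x[j] for j in range(d)]
        return model(z)

    phi = [0.0] * d
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for s in combinations(others, size):
                phi[j] += weight * (value(set(s) | {j}) - value(set(s)))
    return phi


def sparsify(model, x, y, threshold=0.5):
    """Copy features from y into x, highest |Shapley| first, until the
    model output reaches the threshold (the assumed target effect)."""
    phi = shapley_values(model, x, y)
    order = sorted(range(len(x)), key=lambda j: -abs(phi[j]))
    z, edited = list(x), []
    for j in order:
        if model(z) >= threshold:
            break
        z[j] = y[j]
        edited.append(j)
    return z, edited


def model(z):
    """Hypothetical linear scorer; a score >= 0.5 counts as the target class."""
    return 0.6 * z[0] + 0.3 * z[1] + 0.1 * z[2]


# Toy data: two factuals and two counterfactuals produced by some CE generator.
factuals = [[0.0, 0.0, 0.0], [0.2, 0.1, 0.0]]
counterfactuals = [[1.0, 1.0, 0.9], [0.9, 0.2, 0.1]]

match = optimal_matching(factuals, counterfactuals)
refined = [
    sparsify(model, x, counterfactuals[match[i]])
    for i, x in enumerate(factuals)
]
```

In this toy setup each original counterfactual changes all three features, while each refined one changes only the feature with the largest attribution and still reaches the threshold. Brute-force matching and exact Shapley enumeration scale factorially and exponentially, so they are suitable only for illustration on tiny inputs.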

