Generalizability vs. Counterfactual Explainability Trade-Off

In this work, we investigate the relationship between model generalization and counterfactual explainability in supervised learning. We introduce the notion of ε-valid counterfactual probability (ε-VCP) -- the probability of finding perturbations of a data point within its ε-neighborhood that result in a label change. We provide a theoretical analysis of ε-VCP in relation to the geometry of the model's decision boundary, showing that ε-VCP tends to increase with model overfitting. Our findings establish a rigorous connection between poor generalization and the ease of counterfactual generation, revealing an inherent trade-off between generalization and counterfactual explainability. Empirical results validate our theory, suggesting ε-VCP as a practical proxy for quantitatively characterizing overfitting.
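The quantity described above can be approximated empirically by sampling perturbations around a point and counting label flips. The sketch below is a minimal Monte Carlo illustration, not the paper's exact estimator: the uniform L∞-ball sampler, the `model_predict` interface, and the function name `estimate_vcp` are all assumptions made for the example.

```python
import numpy as np

def estimate_vcp(model_predict, x, eps, n_samples=1000, rng=None):
    """Monte Carlo estimate of the epsilon-VCP of a point x:
    the fraction of sampled perturbations within an epsilon-neighborhood
    of x whose predicted label differs from x's own prediction.

    model_predict: maps an (n, d) array to an (n,) array of labels.
    The L-infinity ball and uniform sampling are illustrative choices.
    """
    rng = np.random.default_rng(rng)
    base_label = model_predict(x[None, :])[0]
    # Sample perturbations uniformly in the L-inf ball of radius eps.
    noise = rng.uniform(-eps, eps, size=(n_samples, x.size))
    labels = model_predict(x[None, :] + noise)
    return float(np.mean(labels != base_label))
```

On a toy linear classifier, a point far from the decision boundary yields an estimate near 0, while a point close to the boundary yields a strictly positive estimate, matching the intuition that counterfactuals are easier to find near the boundary.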
@article{veglianti2025_2505.23225,
  title={Generalizability vs. Counterfactual Explainability Trade-Off},
  author={Fabiano Veglianti and Flavio Giorgi and Fabrizio Silvestri and Gabriele Tolomei},
  journal={arXiv preprint arXiv:2505.23225},
  year={2025}
}