Proofs as Explanations: Short Certificates for Reliable Predictions
We consider a model for explainable AI in which an explanation for a prediction h(x) consists of a subset S' of the training data S (if one exists) such that all classifiers in H that make at most b mistakes on S' predict h(x). Such a set S' serves as a proof that x indeed has label h(x) under the assumption that (1) the target function belongs to H, and (2) the set S contains at most b corrupted points. For example, if b = 0 and H is the family of linear classifiers in R^d, and if x lies inside the convex hull of the positive data points in S (and hence every consistent linear classifier labels x as positive), then Caratheodory's theorem states that x lies inside the convex hull of d+1 of those points. So, a set S' of size d+1 could be released as an explanation for a positive prediction, and would serve as a short proof of correctness of the prediction under the assumption of realizability.

In this work, we consider this problem more generally, for general hypothesis classes H and general values b >= 0. We define the notion of the robust hollow star number of H (which generalizes the standard hollow star number), show that it precisely characterizes the worst-case size of the smallest achievable certificate, and analyze its size for natural classes. We also consider worst-case distributional bounds on certificate size, as well as distribution-dependent bounds that we show tightly control the sample size needed to obtain a certificate for any given test example. In particular, we define a notion of the certificate coefficient eps_x of an example x with respect to a data distribution D and target function f, and prove matching upper and lower bounds on sample size as a function of eps_x, b, and the VC dimension of H.
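To make the Caratheodory step concrete, here is a small numpy sketch (my own illustration, not code from the paper): given a test point x expressed as a convex combination of positive training points in R^d, it prunes the combination down to at most d+1 points, which is exactly the size-(d+1) certificate described in the example above. The function name and interface are hypothetical.

```python
import numpy as np

def caratheodory_certificate(points, weights, tol=1e-9):
    """Prune a convex combination x = sum_i weights[i] * points[i]
    down to a sub-combination supported on at most d+1 points.

    points:  (n, d) array of positive training points.
    weights: (n,) nonnegative array summing to 1.
    Returns (indices, new_weights) with len(indices) <= d + 1.
    """
    weights = weights.astype(float).copy()
    d = points.shape[1]
    support = np.where(weights > tol)[0]
    while len(support) > d + 1:
        P = points[support]                       # (m, d) with m > d + 1
        # Find an affine dependence: mu != 0 with sum(mu) = 0 and mu @ P = 0.
        # M has more columns than rows, so it has a nontrivial null vector,
        # obtained as the right-singular vector of the smallest singular value.
        M = np.vstack([P.T, np.ones(len(support))])
        mu = np.linalg.svd(M)[2][-1]
        if not (mu > tol).any():
            mu = -mu                              # ensure some positive entries
        pos = mu > tol
        # Largest step t that keeps all weights nonnegative; this zeroes
        # at least one coefficient, shrinking the support each iteration.
        t = np.min(weights[support][pos] / mu[pos])
        weights[support] = np.clip(weights[support] - t * mu, 0.0, None)
        support = np.where(weights > tol)[0]
    return support, weights
```

A convex combination witnessing that x lies in the hull of the positive points could first be found by linear programming; the pruning above then shrinks it to at most d+1 points, matching Caratheodory's bound.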