Predictive Churn with the Set of Good Models

Issues can arise when research focused on fairness, transparency, or safety is conducted separately from research driven by practical deployment concerns. This separation creates a growing need for translational work that bridges the gap between independently studied concepts that may be fundamentally related. This paper explores connections between two seemingly unrelated notions of predictive inconsistency that share intriguing parallels. The first, known as predictive multiplicity, occurs when models that perform almost equally well (e.g., with nearly equivalent training loss) produce conflicting predictions for individual samples. This concept is often emphasized in algorithmic fairness research as a means of promoting transparency in ML model development. The second, predictive churn, examines differences in individual predictions before and after a model update, a key challenge in deploying ML models in consumer-facing applications. We present theoretical and empirical results that uncover links between these previously disconnected concepts.
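To make the two notions concrete, the sketch below computes a standard churn measure (the fraction of examples whose predicted label flips across a model update) and a common multiplicity measure (ambiguity: the fraction of examples on which a set of near-optimal models disagree). The function names, the synthetic data, and the specific metric definitions are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def predictive_churn(preds_before: np.ndarray, preds_after: np.ndarray) -> float:
    """Fraction of examples whose predicted label changes across a model update."""
    return float(np.mean(preds_before != preds_after))

def ambiguity(preds_by_model: np.ndarray) -> float:
    """Fraction of examples on which near-optimal models disagree
    (a common predictive-multiplicity measure).
    preds_by_model: shape (n_models, n_examples), hard labels per model."""
    disagree = np.any(preds_by_model != preds_by_model[0], axis=0)
    return float(np.mean(disagree))

# Illustrative synthetic example (assumed data, not from the paper):
rng = np.random.default_rng(0)
y_before = rng.integers(0, 2, size=1000)          # predictions of the deployed model
flip = rng.random(1000) < 0.05                    # suppose 5% of predictions change after retraining
y_after = np.where(flip, 1 - y_before, y_before)  # predictions of the updated model
print("churn:", predictive_churn(y_before, y_after))

# A small "set of good models": perturbations of the baseline predictions
good_models = np.stack(
    [y_before] + [np.where(rng.random(1000) < 0.03, 1 - y_before, y_before) for _ in range(5)]
)
print("ambiguity:", ambiguity(good_models))
```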
@article{watson-daniels2025_2402.07745,
  title   = {Predictive Churn with the Set of Good Models},
  author  = {Jamelle Watson-Daniels and Flavio du Pin Calmon and Alexander D'Amour and Carol Long and David C. Parkes and Berk Ustun},
  journal = {arXiv preprint arXiv:2402.07745},
  year    = {2025}
}