SEMF: Supervised Expectation-Maximization Framework for Predicting Intervals

This work introduces the Supervised Expectation-Maximization Framework (SEMF), a versatile and model-agnostic approach for generating prediction intervals with any ML model. SEMF extends the Expectation-Maximization algorithm, traditionally used in unsupervised learning, to a supervised context, leveraging latent variable modeling for uncertainty estimation. Through extensive empirical evaluation on diverse simulated distributions and 11 real-world tabular datasets, SEMF consistently produces narrower prediction intervals while maintaining the desired coverage probability, outperforming traditional quantile regression methods. Furthermore, without using the quantile (pinball) loss, SEMF allows point predictors, including gradient-boosted trees and neural networks, to be calibrated with conformal quantile regression. The results indicate that SEMF enhances uncertainty quantification under diverse data distributions and is particularly effective for models that otherwise struggle with inherent uncertainty representation.
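For context on the conformal calibration step mentioned above, the sketch below illustrates generic conformal quantile regression (CQR) calibration with scikit-learn quantile models. It is a minimal baseline illustration, not the SEMF procedure: in SEMF the lower and upper predictions would come from its latent-variable sampling rather than pinball-loss quantile regressors, and the toy dataset, split sizes, and miscoverage level alpha here are assumptions.

```python
# Minimal sketch of conformal quantile regression (CQR)-style calibration.
# NOT the SEMF algorithm: shown only to illustrate how prediction intervals
# from any lower/upper predictor can be conformally calibrated.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.standard_normal(2000)  # toy data (assumption)

X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.5, random_state=0)
alpha = 0.1  # target 90% coverage

# Lower / upper quantile models trained with the pinball loss;
# SEMF would instead obtain these bounds from latent-variable sampling.
q_lo = GradientBoostingRegressor(loss="quantile", alpha=alpha / 2).fit(X_train, y_train)
q_hi = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha / 2).fit(X_train, y_train)

# Conformity scores on the calibration split: how far y falls outside [q_lo, q_hi].
scores = np.maximum(q_lo.predict(X_cal) - y_cal, y_cal - q_hi.predict(X_cal))
n = len(y_cal)
q_hat = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Calibrated interval for a new point: [q_lo(x) - q_hat, q_hi(x) + q_hat].
x_new = np.array([[0.5]])
lower = q_lo.predict(x_new) - q_hat
upper = q_hi.predict(x_new) + q_hat
print(f"90% prediction interval: [{lower[0]:.3f}, {upper[0]:.3f}]")
```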
@article{azizi2025_2405.18176,
  title   = {SEMF: Supervised Expectation-Maximization Framework for Predicting Intervals},
  author  = {Ilia Azizi and Marc-Olivier Boldi and Valérie Chavez-Demoulin},
  journal = {arXiv preprint arXiv:2405.18176},
  year    = {2025}
}