
Omnipredicting Single-Index Models with Multi-Index Models

Symposium on the Theory of Computing (STOC), 2024
Main: 48 pages
1 figure
Bibliography: 4 pages
Appendix: 10 pages
Abstract

Recent work on supervised learning [GKR+22] defined the notion of omnipredictors, i.e., predictor functions $p$ over features that are simultaneously competitive for minimizing a family of loss functions $\mathcal{L}$ against a comparator class $\mathcal{C}$. Omniprediction requires approximating the Bayes-optimal predictor beyond the loss minimization paradigm, and has generated significant interest in the learning theory community. However, even for basic settings such as agnostically learning single-index models (SIMs), existing omnipredictor constructions require impractically large sample complexities and runtimes, and output complex, highly improper hypotheses.

Our main contribution is a new, simple construction of omnipredictors for SIMs. We give a learner outputting an omnipredictor that is $\varepsilon$-competitive on any matching loss induced by a monotone, Lipschitz link function, when the comparator class is bounded linear predictors. Our algorithm requires $\approx \varepsilon^{-4}$ samples and runs in nearly-linear time, and its sample complexity improves to $\approx \varepsilon^{-2}$ if link functions are bi-Lipschitz. This significantly improves upon the only prior known construction, due to [HJKRR18, GHK+23], which used $\gtrsim \varepsilon^{-10}$ samples.

We achieve our construction via a new, sharp analysis of the classical Isotron algorithm [KS09, KKKS11] in the challenging agnostic learning setting, which may be of independent interest. Previously, Isotron was known to properly learn SIMs in the realizable setting, as well as to learn constant-factor competitive hypotheses under the squared loss [ZWDD24]. As they are based on Isotron, our omnipredictors are multi-index models with $\approx \varepsilon^{-2}$ prediction heads, bringing us closer to the tantalizing goal of proper omniprediction for general loss families and comparators.
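For readers unfamiliar with Isotron, the following is a minimal illustrative sketch (our own simplification, not the paper's exact procedure or its omniprediction extension): the classical algorithm of [KS09] alternates a perceptron-style update of the weight vector with a nonparametric isotonic-regression fit of the monotone link function, here implemented via the standard Pool Adjacent Violators routine.

```python
import numpy as np

def pav(values):
    """Pool Adjacent Violators: least-squares nondecreasing fit to a sequence."""
    blocks = []  # each block: [block_mean, block_size]
    for v in values:
        blocks.append([float(v), 1])
        # Merge adjacent blocks while monotonicity is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, c2 = blocks.pop()
            m1, c1 = blocks.pop()
            blocks.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    out = []
    for m, c in blocks:
        out.extend([m] * c)
    return np.array(out)

def isotron(X, y, T=100):
    """Sketch of Isotron for a single-index model y ~ u(w . x), u monotone.

    Alternates (1) fitting a monotone link u to labels along the current
    projections via isotonic regression, and (2) a perceptron-like update
    of w using the residuals. Returns the final w and fitted values u(w . x_i).
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(T):
        z = X @ w
        order = np.argsort(z)
        u = np.empty(n)
        u[order] = pav(y[order])    # monotone fit along sorted projections
        w = w + (y - u) @ X / n     # perceptron-style residual update
    return w, u
```

Note that this sketch returns only in-sample fitted values; turning the learned (projection, isotonic fit) pairs into a predictor on fresh points, and combining several such "heads" into the paper's multi-index omnipredictor, requires the additional machinery developed in the paper.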
