
Near-Optimal Algorithms for Omniprediction

Main: 62 pages
Bibliography: 5 pages
1 table
Appendix: 16 pages
Abstract

Omnipredictors are simple prediction functions that encode loss-minimizing predictions with respect to a hypothesis class $\mathcal{H}$, simultaneously for every loss function within a class of losses $\mathcal{L}$. In this work, we give near-optimal learning algorithms for omniprediction, in both the online and offline settings. To begin, we give an oracle-efficient online learning algorithm that achieves $(\mathcal{L},\mathcal{H})$-omniprediction with $\tilde{O}(\sqrt{T \log |\mathcal{H}|})$ regret for any class of Lipschitz loss functions $\mathcal{L} \subseteq \mathcal{L}_{\mathrm{Lip}}$. Quite surprisingly, this regret bound matches the optimal regret for \emph{minimization of a single loss function} (up to a $\sqrt{\log(T)}$ factor). Given this online algorithm, we develop an online-to-offline conversion that achieves near-optimal complexity across a number of measures. In particular, for all bounded loss functions within the class of Bounded Variation losses $\mathcal{L}_{\mathrm{BV}}$ (which includes all convex, all Lipschitz, and all proper losses) and any (possibly infinite) $\mathcal{H}$, we obtain an offline learning algorithm that, leveraging an (offline) ERM oracle and $m$ samples from $\mathcal{D}$, returns an efficient $(\mathcal{L}_{\mathrm{BV}},\mathcal{H},\varepsilon(m))$-omnipredictor for $\varepsilon(m)$ scaling near-linearly in the Rademacher complexity of $\mathrm{Th} \circ \mathcal{H}$.
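For context, here is a standard formalization from the omniprediction literature (the notation below, including the predictor $p$, the post-processing map $k_\ell$, and $\mathrm{Ber}(p)$, is illustrative and not quoted from this paper): a predictor $p$ is an $(\mathcal{L},\mathcal{H},\varepsilon)$-omnipredictor if

\[
\mathbb{E}_{(x,y)\sim\mathcal{D}}\!\left[\ell\!\left(k_\ell(p(x)),\, y\right)\right]
\;\le\;
\min_{h \in \mathcal{H}} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}\!\left[\ell(h(x),\, y)\right] + \varepsilon
\quad \text{for all } \ell \in \mathcal{L},
\qquad
\text{where } k_\ell(p) \in \arg\min_{a} \; \mathbb{E}_{y \sim \mathrm{Ber}(p)}\!\left[\ell(a, y)\right].
\]

In words, post-processing the single predictor $p$ with the loss-specific optimal action $k_\ell$ competes with the best hypothesis in $\mathcal{H}$, simultaneously for every loss in $\mathcal{L}$.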
