
Fast learning rates with heavy-tailed losses

Abstract

We study fast learning rates when the losses are not necessarily bounded and may have a distribution with heavy tails. To enable such analyses, we introduce two new conditions: (i) the envelope function $\sup_{f \in \mathcal{F}}|\ell \circ f|$, where $\ell$ is the loss function and $\mathcal{F}$ is the hypothesis class, exists and is $L^r$-integrable, and (ii) $\ell$ satisfies the multi-scale Bernstein's condition on $\mathcal{F}$. Under these assumptions, we prove that a learning rate faster than $O(n^{-1/2})$ can be obtained and, depending on $r$ and the multi-scale Bernstein's powers, can be arbitrarily close to $O(n^{-1})$. We then verify these assumptions and derive fast learning rates for the problem of vector quantization by $k$-means clustering with heavy-tailed distributions. The analyses enable us to obtain novel learning rates that extend and complement existing results in the literature from both theoretical and practical viewpoints.
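To make the vector quantization setting concrete, the following is a minimal sketch of the empirical $k$-means risk (mean squared distance to the nearest center) on heavy-tailed samples. It is not the paper's method or analysis. The one-dimensional setting, the symmetrized Pareto distribution with tail index 3, the plain Lloyd iterations, and the helper names `empirical_risk` and `lloyd` are all assumptions chosen for illustration.

```python
import random


def empirical_risk(xs, centers):
    """Mean squared distance from each sample to its nearest center."""
    return sum(min((x - c) ** 2 for c in centers) for x in xs) / len(xs)


def lloyd(xs, k, n_iter=20, seed=0):
    """Plain Lloyd iterations in one dimension (illustrative helper).

    Returns the final centers and the empirical risk after each step.
    """
    rng = random.Random(seed)
    centers = rng.sample(xs, k)  # initialize centers at k distinct samples
    risks = [empirical_risk(xs, centers)]
    for _ in range(n_iter):
        # Assignment step: put each sample in the cell of its nearest center.
        cells = [[] for _ in range(k)]
        for x in xs:
            j = min(range(k), key=lambda i: (x - centers[i]) ** 2)
            cells[j].append(x)
        # Update step: move each center to its cell mean;
        # an empty cell keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(cells)]
        risks.append(empirical_risk(xs, centers))
    return centers, risks


rng = random.Random(0)
# Assumed data model: symmetrized Pareto with tail index 3, so the variance
# is finite but moments of order 3 and higher are not.
xs = [rng.choice((-1, 1)) * rng.paretovariate(3.0) for _ in range(2000)]
centers, risks = lloyd(xs, k=3)
print(centers, risks[-1])
```

The sample is unbounded, so the squared loss has no uniform bound here, which is the regime where an integrability assumption on the envelope replaces a boundedness assumption.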
