Fast rates for empirical risk minimization with cadlag losses with bounded sectional variation norm

Abstract

Empirical risk minimization over sieves of the class $\mathcal{F}$ of cadlag functions with bounded sectional variation norm has a long history, starting with Total Variation Denoising (Rudin et al., 1992), and has been considered in several recent articles, in particular Fang et al. (2019) and van der Laan (2015). In this article, we show how a certain representation of cadlag functions with bounded sectional variation, also called Hardy-Krause variation, allows one to bound the bracketing entropy of sieves of $\mathcal{F}$ and therefore to derive fast rates of convergence in nonparametric function estimation. Specifically, for any sequence $a_n$ that (slowly) diverges to $\infty$, we show that we can construct an estimator with rate of convergence $O_P(2^{d/3} n^{-1/3} (\log n)^{d/3} a_n^{2/3})$ over $\mathcal{F}$, under some fairly general assumptions. Remarkably, the dimension $d$ affects the rate in $n$ only through the logarithmic factor, making this method especially appropriate for high-dimensional problems. In particular, we show that in the case of nonparametric regression over sieves of cadlag functions with bounded sectional variation norm, this upper bound on the rate of convergence holds for least-squares estimators, under the random design, sub-exponential errors setting.
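As an illustration (not code from the paper), in dimension $d = 1$ the sieve least-squares estimator described above reduces to total-variation-constrained regression: least squares over $\{f : \|f\|_v \le a_n\}$, where the sectional variation norm of a piecewise-constant fit is the sum of absolute jumps. A minimal sketch, assuming cvxpy is available; the budget $a_n = \log n$ and all data-generating choices are purely illustrative:

```python
# Minimal 1-d sketch of sieve ERM over cadlag functions with bounded
# variation: least squares subject to TV(f) <= a_n, with f represented
# by its fitted values at the design points. Illustrative only.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n = 200
x = np.sort(rng.uniform(0, 1, n))          # random design
f0 = np.where(x < 0.5, 0.0, 1.0)           # true cadlag regression function
y = f0 + 0.2 * rng.standard_normal(n)      # Gaussian (hence sub-exponential) errors

a_n = np.log(n)                            # slowly diverging variation budget (assumed choice)
f = cp.Variable(n)                         # fitted values at the design points
tv = cp.sum(cp.abs(cp.diff(f)))            # sectional variation norm in d = 1
prob = cp.Problem(cp.Minimize(cp.sum_squares(y - f)), [tv <= a_n])
prob.solve()
print("empirical risk:", prob.value / n)
```

With the penalized (Lagrangian) form of the same constraint, this is exactly the fused-lasso/TV-denoising estimator of Rudin et al. (1992) cited in the abstract.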
