Regularization and the small-ball method II: complexity dependent error rates

For a convex class of functions $F$, a regularization function $\Psi(\cdot)$ and given the random data $(X_i,Y_i)_{i=1}^N$, we study estimation properties of regularization procedures of the form \begin{equation*} \hat f \in {\rm argmin}_{f\in F}\Big(\frac{1}{N}\sum_{i=1}^N\big(Y_i-f(X_i)\big)^2+\lambda \Psi(f)\Big) \end{equation*} for some well-chosen regularization parameter $\lambda$. We obtain bounds on the $L_2$ estimation error rate that depend on the complexity of the "true model" $F^*:=\{f\in F:\ \Psi(f)\leq \Psi(f^*)\}$, where $f^*={\rm argmin}_{f\in F}\mathbb{E}\big(Y-f(X)\big)^2$ and the $(X_i,Y_i)$'s are independent and distributed as $(X,Y)$. Our estimate holds under weak stochastic assumptions -- one of which is a small-ball condition satisfied by $F$ -- and for rather flexible choices of regularization functions $\Psi(\cdot)$. Moreover, the result holds in the learning theory framework: we do not assume any a priori connection between the output $Y$ and the input $X$. As a proof of concept, we apply our general estimation bound to various choices of $\Psi$, for example, the $\ell_p$ and $S_p$-norms (for $p\geq 1$), weak-$\ell_p$, atomic norms, the max-norm and SLOPE. In many cases, the estimation rate almost coincides with the minimax rate in the class $F^*$.
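To make the procedure concrete, here is a minimal numerical sketch (not from the paper) of the estimator above in the special case where $F$ is a class of linear functions $f(x)=\langle w,x\rangle$ and $\Psi$ is the $\ell_1$-norm, solved by proximal gradient descent (ISTA); the sample size, dimension, sparsity and value of $\lambda$ in the demo are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (coordinatewise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def regularized_least_squares(X, Y, lam, n_iter=500):
    """ISTA for  argmin_w (1/N) * ||Y - X w||_2^2 + lam * ||w||_1,
    i.e. the abstract's estimator with F a linear class and Psi the l1-norm."""
    N, d = X.shape
    # Lipschitz constant of the gradient of the quadratic term: (2/N) * ||X||_op^2.
    L = 2.0 * np.linalg.norm(X, ord=2) ** 2 / N
    step = 1.0 / L
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = 2.0 / N * X.T @ (X @ w - Y)   # gradient of the empirical squared loss
        w = soft_threshold(w - step * grad, step * lam)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, d, s = 200, 50, 5                      # hypothetical sample size / dimension / sparsity
    X = rng.standard_normal((N, d))
    w_star = np.zeros(d); w_star[:s] = 1.0    # sparse "true" parameter
    Y = X @ w_star + 0.1 * rng.standard_normal(N)
    w_hat = regularized_least_squares(X, Y, lam=0.1)
    print("estimation error:", np.linalg.norm(w_hat - w_star))
```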