
On the robustness of the minimum $\ell_2$ interpolator

Abstract

We analyse the interpolator with minimal $\ell_2$-norm $\hat{\beta}$ in a general high-dimensional linear regression framework $\mathbb Y=\mathbb X\beta^*+\xi$, where $\mathbb X$ is a random $n\times p$ matrix with independent $\mathcal N(0,\Sigma)$ rows, and without any assumption on the noise vector $\xi\in \mathbb R^n$. We prove that, with high probability, the prediction loss of this estimator is bounded from above by $(\|\beta^*\|^2_2r_{cn}(\Sigma)\vee \|\xi\|^2)/n$, where $r_{k}(\Sigma)=\sum_{i\geq k}\lambda_i(\Sigma)$ are the tail sums of the eigenvalues of $\Sigma$. These bounds show a transition in the rates. For high signal-to-noise ratios, the rates $\|\beta^*\|^2_2r_{cn}(\Sigma)/n$ broadly improve on the existing ones. For low signal-to-noise ratios, we also provide a lower bound holding with large probability. Under assumptions on the spectrum of $\Sigma$, this lower bound is of order $\|\xi\|_2^2/n$, matching the upper bound. Consequently, in the large-noise regime, we are able to precisely track the prediction error with large probability. These results give new insight into when interpolation can be harmless in high dimensions.
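As an illustration of the quantities named in the abstract, the following minimal sketch computes a minimum $\ell_2$-norm interpolator on simulated data and evaluates the scale of the upper bound. Several choices here are assumptions that do not come from the abstract: the diagonal covariance with spectrum $\lambda_i = 1/i$, the Gaussian noise (the result itself places no assumption on $\xi$), the value $c = 1$ for the unspecified constant in $r_{cn}(\Sigma)$, and the reading of "prediction loss" as $\|\Sigma^{1/2}(\hat\beta-\beta^*)\|_2^2$. The helper names are hypothetical.

```python
import numpy as np


def min_l2_interpolator(X, Y):
    """Minimum l2-norm solution of X b = Y (assumes p >= n and full row rank)."""
    return np.linalg.pinv(X) @ Y


def tail_eigen_sum(eigvals, k):
    """r_k(Sigma): sum of the eigenvalues lambda_i with i >= k (1-indexed, descending)."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    return lam[k - 1:].sum()


rng = np.random.default_rng(0)
n, p = 50, 2000

# Assumption: diagonal Sigma with a slowly decaying spectrum lambda_i = 1/i.
eigvals = 1.0 / np.arange(1, p + 1)

# Rows of X are independent N(0, Sigma) with Sigma = diag(eigvals).
X = rng.standard_normal((n, p)) * np.sqrt(eigvals)
beta_star = rng.standard_normal(p) / np.sqrt(p)

# Assumption: Gaussian noise, used here only to have a concrete vector xi.
xi = rng.standard_normal(n)
Y = X @ beta_star + xi

beta_hat = min_l2_interpolator(X, Y)

# The estimator interpolates the data, so this residual is numerically zero.
interp_residual = np.linalg.norm(X @ beta_hat - Y)

# Assumption: prediction loss taken as ||Sigma^{1/2}(beta_hat - beta_star)||_2^2.
diff = beta_hat - beta_star
pred_loss = np.sum(eigvals * diff**2)

# Assumption: c = 1; the abstract does not specify the constant in r_{cn}(Sigma).
c = 1
k = max(1, int(c * n))
bound_scale = max(np.sum(beta_star**2) * tail_eigen_sum(eigvals, k),
                  np.sum(xi**2)) / n

print(f"interpolation residual: {interp_residual:.2e}")
print(f"prediction loss:        {pred_loss:.4f}")
print(f"bound scale (c = {c}):    {bound_scale:.4f}")
```

The printed bound scale is only the quantity inside the stated upper bound, without the unspecified multiplicative constants, so a single run can indicate orders of magnitude but does not verify the theorem.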
