
On cross-validated Lasso in high dimensions

Abstract

In this paper, we derive non-asymptotic error bounds for the Lasso estimator when the penalty parameter for the estimator is chosen using $K$-fold cross-validation. Our bounds imply that the cross-validated Lasso estimator has nearly optimal rates of convergence in the prediction, $L^2$, and $L^1$ norms. For example, we show that in the model with Gaussian noise and under fairly general assumptions on the candidate set of values of the penalty parameter, the estimation error of the cross-validated Lasso estimator converges to zero in the prediction norm at the rate $\sqrt{s\log p / n}\times \sqrt{\log(p n)}$, where $n$ is the sample size, $p$ is the number of covariates, and $s$ is the number of non-zero coefficients in the model. Thus, the cross-validated Lasso estimator achieves the fastest possible rate of convergence in the prediction norm up to a small logarithmic factor $\sqrt{\log(p n)}$, and similar conclusions apply to the convergence rates in the $L^2$ and $L^1$ norms. Importantly, our results cover the case when $p$ is (potentially much) larger than $n$ and also allow for non-Gaussian noise. Our paper therefore serves as a justification for the widespread practice of using cross-validation to choose the penalty parameter for the Lasso estimator.
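
For readers who want the objects in the abstract written out, the following is a sketch of the textbook formulation of the Lasso and of $K$-fold cross-validation for its penalty parameter. The notation is an assumption and does not come from the abstract: $(X_i, Y_i)$ denotes the $i$-th observation, $I_1, \dots, I_K$ a partition of $\{1, \dots, n\}$ into folds, and $\Lambda_n$ the candidate set of penalty values. The normalization of the loss may differ from the one used in the paper.

```latex
% Lasso estimator for a fixed penalty \lambda
% (the 1/n scaling of the loss is an assumed convention)
\hat\beta(\lambda) \in \arg\min_{b \in \mathbb{R}^p}
  \frac{1}{n} \sum_{i=1}^{n} \bigl(Y_i - X_i^\top b\bigr)^2 + \lambda \|b\|_1 .

% Fold-wise Lasso: fit on all observations outside fold I_k
\hat\beta_{-k}(\lambda) \in \arg\min_{b \in \mathbb{R}^p}
  \frac{1}{n - |I_k|} \sum_{i \notin I_k} \bigl(Y_i - X_i^\top b\bigr)^2
  + \lambda \|b\|_1 , \qquad k = 1, \dots, K .

% Cross-validated penalty: minimize the total held-out squared error
% over the candidate set \Lambda_n, then refit on the full sample
\hat\lambda \in \arg\min_{\lambda \in \Lambda_n}
  \sum_{k=1}^{K} \sum_{i \in I_k}
  \bigl(Y_i - X_i^\top \hat\beta_{-k}(\lambda)\bigr)^2 ,
\qquad
\hat\beta_{\mathrm{CV}} = \hat\beta(\hat\lambda) .
```

In this notation, the abstract's claim concerns $\hat\beta_{\mathrm{CV}}$: its estimation error in the prediction norm converges to zero at the rate $\sqrt{s\log p / n}\times\sqrt{\log(pn)}$.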
