
Risk-consistency of cross-validation with lasso-type procedures

Abstract

The lasso and related sparsity inducing algorithms have been the target of substantial theoretical and applied research. Correspondingly, many results are known about their behavior for a fixed or optimally chosen tuning parameter specified up to unknown constants. In practice, however, this oracle tuning parameter is inaccessible, so one must instead use the data to choose a tuning parameter. Common statistical practice is to use one of a few variants of cross-validation for this task. However, very little is known about the theoretical properties of the resulting predictions using data-dependent methods. We consider the high-dimensional setting with random design wherein the number of predictors $p$ grows with the number of observations $n$. We show that the lasso remains risk consistent relative to its linear oracle even when the tuning parameter is chosen via cross-validation and the true model is not necessarily linear. We generalize these results to the group lasso and the square-root lasso ($\sqrt{\text{lasso}}$) and compare the performance of cross-validation to other tuning parameter selection methods via simulations.
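The cross-validation procedure studied in the abstract can be illustrated with a minimal sketch. This is not the paper's code: it uses scikit-learn's `LassoCV`, a standard implementation of K-fold cross-validation for the lasso penalty level, on simulated high-dimensional data with a sparse signal.

```python
# Sketch (assumed setup, not the paper's experiments): choose the lasso
# tuning parameter by 5-fold cross-validation in a p > n regime.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 200, 500  # high-dimensional: more predictors than observations
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0  # sparse linear component of the signal
y = X @ beta + rng.standard_normal(n)  # noisy response

# LassoCV selects the penalty level (alpha) minimizing CV prediction error,
# standing in for the inaccessible oracle tuning parameter.
model = LassoCV(cv=5, random_state=0).fit(X, y)
print("CV-selected alpha:", model.alpha_)
print("number of nonzero coefficients:", int(np.sum(model.coef_ != 0)))
```

In practice the CV-selected penalty tends to be smaller than the theoretically optimal one, yielding somewhat denser fitted models; the abstract's result is that prediction risk nonetheless remains consistent relative to the linear oracle.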
