Risk-consistency of cross-validation with lasso-type procedures

4 August 2013

Abstract

The lasso and related procedures, such as the group lasso, have been the target of a substantial amount of theoretical and applied research. Correspondingly, many results are known about their behavior for a fixed or optimally chosen tuning parameter specified up to unknown constants. In practice, however, this oracle tuning parameter is inaccessible, so one must instead use the data to choose a tuning parameter. Common statistical practice is to use one of a few variants of cross-validation for this task. However, very little is known about the theoretical properties of the linear model that results from these data-dependent methods. We consider the high-dimensional setting with random design wherein the number of predictors $p=n^\alpha$, $\alpha>0$, grows with the number of observations $n$. We show that the lasso and group lasso estimators remain risk consistent relative to their linear oracles even when the tuning parameter is chosen via cross-validation rather than optimally.
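
To make the setting concrete, the objects involved can be sketched in notation that is assumed here for illustration and is not taken from the paper. Write the lasso in penalized form with tuning parameter $\lambda$, let $\Lambda$ be a hypothetical set of candidate values, let $V_1,\dots,V_K$ partition the $n$ observations into $K$ folds, and let $\hat\beta_\lambda^{(-k)}$ denote the lasso fit computed with fold $V_k$ held out. The paper's own formulation (for example a constrained rather than penalized lasso, or a different normalization) may differ.

$$
\hat\beta_\lambda = \arg\min_{\beta\in\mathbb{R}^p} \frac{1}{2n}\|Y - X\beta\|_2^2 + \lambda\|\beta\|_1,
\qquad
\hat\lambda = \arg\min_{\lambda\in\Lambda} \frac{1}{K}\sum_{k=1}^{K} \frac{1}{|V_k|}\sum_{i\in V_k}\left(Y_i - X_i^\top\hat\beta_\lambda^{(-k)}\right)^2
$$

With $R(\beta) = \mathbb{E}\left[(Y_{\mathrm{new}} - X_{\mathrm{new}}^\top\beta)^2\right]$ denoting the prediction risk on an independent draw, and $B_n$ a hypothetical class of linear oracles (for instance an $\ell_1$ ball whose radius may grow with $n$), risk consistency of the cross-validated estimator would read, under these assumptions:

$$
R\big(\hat\beta_{\hat\lambda}\big) - \inf_{\beta\in B_n} R(\beta) \longrightarrow 0 \quad \text{in probability as } n\to\infty .
$$

In this reading, the comparison is against the best linear predictor in the oracle class, not against a true sparse coefficient vector, so no correctly specified linear model is needed for the statement to make sense.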
