Pivotal Estimation of Nonparametric Functions via Square-root Lasso
In a nonparametric linear regression model we study a variant of LASSO, called square-root LASSO, which does not require knowledge of the scaling parameter σ of the noise or bounds for it. This work derives new finite-sample upper bounds for the prediction norm rate of convergence, the ℓ1-rate of convergence, the ℓ2-rate of convergence, and the sparsity of the square-root LASSO estimator. A lower bound for the prediction norm rate of convergence is also established. In many non-Gaussian noise cases, we rely on moderate deviation theory for self-normalized sums and on new data-dependent empirical process inequalities to achieve Gaussian-like results provided log p = o(n^{1/3}), improving upon results derived in the parametric case that required log p = O(log n). In addition, we derive finite-sample bounds on the performance of ordinary least squares (OLS) applied to the model selected by square-root LASSO, accounting for possible misspecification of the selected model. In particular, we provide mild conditions under which the rate of convergence of OLS post square-root LASSO is no worse than that of square-root LASSO. We also study two extreme cases: parametric noiseless and nonparametric unbounded variance. Square-root LASSO has interesting theoretical guarantees in both. In the parametric noiseless case, unlike LASSO, square-root LASSO achieves exact recovery. In the unbounded variance case it can still be consistent, since its penalty choice does not depend on σ. Finally, we conduct Monte Carlo experiments, which show that the empirical performance of square-root LASSO is very similar to that of LASSO when σ is known. We also emphasize that square-root LASSO can be formulated as a convex programming problem, and its computational burden is similar to that of LASSO.
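For concreteness, a sketch of the estimator in its standard form (conventional notation; the penalty level shown is the usual pivotal choice from the square-root LASSO literature, assumed here rather than quoted from this abstract):

```latex
% Square-root LASSO (sketch; notation assumed): minimize the square root of
% the average squared residual plus an l1 penalty on the coefficients.
\widehat{\beta} \in \arg\min_{\beta \in \mathbb{R}^p}
    \sqrt{\widehat{Q}(\beta)} + \frac{\lambda}{n}\,\|\beta\|_1,
\qquad
\widehat{Q}(\beta) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - x_i'\beta\bigr)^2.
% A pivotal penalty level that does not involve sigma can be taken as
% \lambda = c\,\sqrt{n}\,\Phi^{-1}(1 - \alpha/(2p)), \quad c > 1,
% because the gradient of sqrt(Qhat) at the true beta is self-normalized:
% the noise scale cancels between its numerator and denominator.
```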
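The convexity remark at the end of the abstract can also be made concrete: since sqrt(Q̂(β)) = ||y − Xβ||_2 / sqrt(n), the problem is a second-order cone program. Below is a minimal illustrative sketch (not the authors' code) using cvxpy, with the conventional constants c = 1.1 and α = 0.05 assumed for the penalty; it also includes the OLS refit on the selected model ("OLS post square-root LASSO") analyzed in the paper.

```python
# Minimal sketch of square-root LASSO as a convex program, plus OLS refit
# on the selected support. Illustrative only; the constants c, alpha, and
# the support-selection tolerance are conventional choices, assumed here.
import numpy as np
import cvxpy as cp
from scipy.stats import norm

def sqrt_lasso(X, y, c=1.1, alpha=0.05):
    n, p = X.shape
    # Pivotal penalty: requires no knowledge of the noise level sigma.
    lam = c * np.sqrt(n) * norm.ppf(1 - alpha / (2 * p))
    beta = cp.Variable(p)
    # sqrt(Qhat(beta)) = ||y - X @ beta||_2 / sqrt(n), a convex function.
    obj = cp.norm(y - X @ beta, 2) / np.sqrt(n) + (lam / n) * cp.norm1(beta)
    cp.Problem(cp.Minimize(obj)).solve()
    return np.asarray(beta.value).ravel()

def ols_post_sqrt_lasso(X, y, beta_hat, tol=1e-6):
    # Refit OLS using only the regressors selected by square-root LASSO.
    support = np.flatnonzero(np.abs(beta_hat) > tol)
    refit = np.zeros(X.shape[1])
    if support.size:
        refit[support] = np.linalg.lstsq(X[:, support], y, rcond=None)[0]
    return refit

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, s = 100, 200, 5
    X = rng.standard_normal((n, p))
    beta0 = np.zeros(p)
    beta0[:s] = 1.0
    y = X @ beta0 + rng.standard_normal(n)  # sigma = 1, never passed to the estimator
    b = sqrt_lasso(X, y)
    print("selected:", np.flatnonzero(np.abs(b) > 1e-6))
    print("refit on first", s, "coefficients:", ols_post_sqrt_lasso(X, y, b)[:s])
```

The refit step illustrates the point made above: the ℓ1 penalty shrinks the square-root LASSO coefficients toward zero, and OLS on the selected model removes that shrinkage, which under the paper's mild conditions does not worsen the rate of convergence.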
View on arXiv