
Regularization with the Smooth-Lasso procedure

5 March 2008

Abstract

We consider the linear regression problem and propose the S-Lasso procedure to estimate the unknown regression parameters. This estimator yields a sparse representation while taking into account the correlation between successive covariates (or predictors). The study covers the case $p\gg n$, i.e., the number of covariates is much larger than the number of observations. From a theoretical point of view, for fixed $p$, we establish asymptotic normality and variable selection consistency for our procedure. When $p\geq n$, we provide variable selection consistency results and show that the S-Lasso achieves a Sparsity Inequality, i.e., a bound in terms of the number of non-zero components of the oracle vector. The S-Lasso appears to have good variable selection properties compared to its challengers. Furthermore, we provide an estimator of the effective degrees of freedom of the S-Lasso estimator. A simulation study shows that the S-Lasso performs better than the Lasso as far as variable selection is concerned, especially when successive covariates are highly correlated. The procedure also appears to be a good challenger to the Elastic-Net (Zou and Hastie, 2005).
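The abstract does not spell out the S-Lasso criterion. As a minimal sketch, assume it is the least-squares loss plus an $\ell_1$ penalty (for sparsity) plus a squared penalty on differences between successive coefficients (for smoothness), i.e. $\|y - X\beta\|^2 + \lambda \sum_j |\beta_j| + \mu \sum_{j\geq 2} (\beta_j - \beta_{j-1})^2$. The function names, the absence of any normalisation by $n$, and the plain coordinate-descent solver below are illustrative choices, not the paper's algorithm.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z towards zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def s_lasso_objective(X, y, beta, lam, mu):
    """Assumed criterion: RSS + lam * ||beta||_1 + mu * sum of squared successive differences."""
    n, p = len(X), len(beta)
    rss = sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2 for i in range(n))
    l1 = sum(abs(b) for b in beta)
    smooth = sum((beta[j] - beta[j - 1]) ** 2 for j in range(1, p))
    return rss + lam * l1 + mu * smooth


def s_lasso_cd(X, y, lam, mu, n_sweeps=200):
    """Cyclic coordinate descent on the assumed criterion (hypothetical helper).

    X is a list of n rows with p entries each, y a list of n responses.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X beta (beta starts at zero)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Coefficients adjacent to j in the ordering of the covariates.
            neighbours = [beta[k] for k in (j - 1, j + 1) if 0 <= k < p]
            a = col_sq[j] + mu * len(neighbours)
            if a == 0.0:
                continue  # all-zero column and no smoothness term: leave beta_j at 0
            # In beta_j the smooth part is a*beta_j^2 - 2*b*beta_j + const.
            b = (sum(X[i][j] * resid[i] for i in range(n))
                 + col_sq[j] * beta[j]
                 + mu * sum(neighbours))
            new = soft_threshold(b, lam / 2.0) / a
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta
```

With `mu = 0` the sketch reduces to a plain Lasso solver; increasing `mu` pulls neighbouring coefficients towards each other, which is the mechanism the abstract refers to when it mentions correlation between successive covariates.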
