
De-Biasing The Lasso With Degrees-of-Freedom Adjustment

Abstract

This paper studies schemes to de-bias the Lasso in sparse linear regression, where the goal is to estimate and construct confidence intervals for a low-dimensional projection of the unknown coefficient vector in a preconceived direction $a_0$. We assume that the design matrix has iid Gaussian rows with known covariance matrix $\Sigma$. Our analysis reveals that previous proposals to de-bias the Lasso require a modification in order to enjoy asymptotic efficiency over the full range of sparsity levels. This modification takes the form of a degrees-of-freedom adjustment that accounts for the dimension of the model selected by the Lasso. Let $s_0$ denote the number of nonzero coefficients of the true coefficient vector. The unadjusted de-biasing schemes proposed in previous studies enjoy efficiency if $s_0 \lll n^{2/3}$, up to logarithmic factors. However, if $s_0 \ggg n^{2/3}$, the unadjusted scheme cannot be efficient in certain directions $a_0$. In the latter regime, it is necessary to modify existing procedures by an adjustment that accounts for the degrees of freedom of the Lasso. The proposed degrees-of-freedom adjustment grants asymptotic efficiency for any direction $a_0$. This holds under a Sparse Riesz Condition on the covariance matrix $\Sigma$ and the sample size requirements $s_0/p \to 0$ and $s_0\log(p/s_0)/n \to 0$. Our analysis also highlights that the degrees-of-freedom adjustment is not necessary when the initial bias of the Lasso in the direction $a_0$ is small, which is granted under more stringent conditions on $\Sigma^{-1}$. This explains why the necessity of the degrees-of-freedom adjustment did not appear in some previous studies. The main proof argument involves a Gaussian interpolation path similar to that used to derive Slepian's lemma. It yields a sharp $\ell_\infty$ error bound for the Lasso under Gaussian design, which is of independent interest.
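
As a rough illustration of what such an adjustment can look like, the display below sketches one plausible form of the estimator. It is not quoted from the abstract: the notation $y$ (response), $X$ (design matrix), $\hat\beta$ (Lasso estimate), $\hat S$ (support of $\hat\beta$), $u_0$, $z_0$ and $\nu$ is assumed here, and the exact definitions should be checked against the full paper.

$$
\hat\theta_\nu \;=\; \langle a_0, \hat\beta\rangle \;+\; \frac{\langle z_0,\; y - X\hat\beta\rangle}{(1-\nu/n)\,\|z_0\|_2^2},
\qquad z_0 = X u_0,
\qquad u_0 = \frac{\Sigma^{-1}a_0}{\langle a_0, \Sigma^{-1}a_0\rangle}.
$$

Under this sketch, $\nu = 0$ corresponds to an unadjusted de-biasing scheme, while $\nu = |\hat S|$, the number of nonzero Lasso coefficients, rescales the correction term by $n/(n-|\hat S|)$ to account for the degrees of freedom of the Lasso.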
