De-Biasing The Lasso With Degrees-of-Freedom Adjustment

24 February 2019

Abstract

This paper studies schemes to de-bias the Lasso in sparse linear regression where the goal is to estimate and construct confidence intervals for a low-dimensional projection of the unknown coefficient vector in a preconceived direction $a_0$. We assume that the design matrix has iid Gaussian rows with known covariance matrix $\Sigma$. Our analysis reveals that previous proposals to de-bias the Lasso require a modification in order to enjoy asymptotic efficiency over the full range of sparsity levels. This modification takes the form of a degrees-of-freedom adjustment that accounts for the dimension of the model selected by the Lasso. Let $s_0$ denote the number of nonzero coefficients of the true coefficient vector. The unadjusted de-biasing schemes proposed in previous studies enjoy efficiency if $s_0\lll n^{2/3}$, up to logarithmic factors. However, if $s_0\ggg n^{2/3}$, the unadjusted scheme cannot be efficient in certain directions $a_0$. In the latter regime, it is necessary to modify existing procedures by an adjustment that accounts for the degrees-of-freedom of the Lasso. The proposed degrees-of-freedom adjustment grants asymptotic efficiency for any direction $a_0$. This holds under a Sparse Riesz Condition on the covariance matrix $\Sigma$ and the sample size requirements $s_0/p\to0$ and $s_0\log(p/s_0)/n\to0$. Our analysis also highlights that the degrees-of-freedom adjustment is not necessary when the initial bias of the Lasso in the direction $a_0$ is small, which is granted under more stringent conditions on $\Sigma^{-1}$. This explains why the necessity of degrees-of-freedom adjustment did not appear in some previous studies. The main proof argument involves a Gaussian interpolation path similar to that used to derive Slepian's lemma. It yields a sharp $\ell_\infty$ error bound for the Lasso under Gaussian design, which is of independent interest.
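
For orientation, the adjustment described above can be sketched in formulas. The abstract does not state the estimator explicitly, so the notation below is an assumption: $y = X\beta + \varepsilon$ is the linear model with $n$ observations, $\hat\beta$ is the Lasso estimate with selected support $\hat S$, and $z_0 = X u_0$ is a score vector built from the known $\Sigma$ via $u_0 = \Sigma^{-1}a_0 / \langle a_0, \Sigma^{-1}a_0\rangle$. Under these assumptions, an unadjusted de-biased estimate of $\theta = \langle a_0, \beta\rangle$ and its degrees-of-freedom adjusted counterpart would plausibly take the form

$$
\hat\theta_{\text{unadj}} = \langle a_0, \hat\beta\rangle + \frac{\langle z_0,\, y - X\hat\beta\rangle}{\|z_0\|^2},
\qquad
\hat\theta_{\text{adj}} = \langle a_0, \hat\beta\rangle + \frac{\langle z_0,\, y - X\hat\beta\rangle}{\|z_0\|^2\,\bigl(1 - |\hat S|/n\bigr)}.
$$

In this sketch the two estimates differ only by the factor $1 - |\hat S|/n$, where $|\hat S|$ is the dimension of the model selected by the Lasso. The factor is close to $1$ when the selected model is small relative to $n$, which is in line with the statement that the adjustment matters mainly in the less sparse regime.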
