We derive expressions for the finite-sample distribution of the Lasso estimator in a linear regression model with normally distributed errors, in low as well as in high dimensions, by exploiting the structure of the optimization problem defining the estimator. In low dimensions we assume full rank of the regressor matrix and present expressions for the cumulative distribution function as well as the densities of the absolutely continuous parts of the estimator. Additionally, we establish an explicit formula for the correspondence between the Lasso and the least-squares estimator. In high dimensions, where we make no assumptions on the regressor matrix at all, we derive analogous results for the distribution in less explicit form. In this setting, we also investigate the model selection properties of the Lasso and illustrate that the set of models the estimator can potentially select may be completely independent of the observed response vector.
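For orientation, the setting described above can be written out as follows. This is only a sketch in commonly used notation, which is an assumption here: the symbols $y$, $X$, $\beta$, $\varepsilon$, $\sigma^2$, $\lambda$, $n$ and $p$ do not appear in the abstract, and the scaling of the penalty term (as well as whether a single or a componentwise tuning parameter is used) may differ from the paper's own convention.

```latex
% Linear model with normally distributed errors (notation assumed)
y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n),
\qquad X \in \mathbb{R}^{n \times p}.

% Lasso estimator as a solution of a penalized least-squares problem;
% the factor 2 in front of \lambda is one of several common scalings
\hat\beta_{\mathrm{L}} \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p}
  \; \|y - X\beta\|_2^2 + 2\lambda \|\beta\|_1, \qquad \lambda \ge 0.

% Least-squares estimator, well defined when X has full column rank
% (the low-dimensional case)
\hat\beta_{\mathrm{LS}} = (X^\top X)^{-1} X^\top y.
```

When $X$ has full column rank the objective is strictly convex, so the minimizer is unique. Without any assumption on $X$, as in the high-dimensional case, the minimizer need not be unique, which is why the display uses $\in$ rather than $=$.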