Multiple Testing and Variable Selection along Least Angle Regression's path

Abstract

In this article we investigate the Least Angle Regression (LAR) algorithm in high dimensions under the Gaussian noise assumption. For the first time, we give the exact joint law of the sequence of knots conditional on the sequence of variables entering the model. Numerical experiments are provided to demonstrate the perfect fit of our findings. Based on this result, we prove an exact control of the existence of false negatives in the general design case and an exact control of the False Discovery Rate (FDR) in the orthogonal design case. Our contribution is twofold. First, we build testing procedures on variables entering the model along the LAR's path, and we introduce a new exact testing procedure on the existence of false negatives in the general design case, even when the noise level is unknown. These testing procedures are referred to as the Generalized t-Spacing Test (GtST). Second, we give an exact control of the FDR in the orthogonal design case. Monte Carlo simulations and a real data experiment are provided to illustrate our results.
