In Compressed Sensing and high-dimensional estimation, signal recovery often relies on sparsity assumptions, and estimation is performed via ℓ1-penalized least-squares optimization, a.k.a. the LASSO. The penalization is usually controlled by a weight, also called the "relaxation parameter", denoted by λ. It is commonly thought that the practical efficiency of the LASSO for prediction crucially relies on accurate selection of λ. In this short note, we propose to consider the hyper-parameter selection problem from a new perspective which combines the Hedge online learning method of Freund and Schapire with the stochastic Frank-Wolfe method for the LASSO. Using the Hedge algorithm, we show that our simple selection rule can achieve prediction results comparable to Cross Validation at a potentially much lower computational cost.
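As a rough illustration of the idea, the Hedge (exponential weights) update can maintain a weight for each candidate value of λ and down-weight candidates that incur large losses over rounds. The sketch below is illustrative only: the grid, learning rate, and quadratic placeholder loss are assumptions, and in the paper's setting the losses would instead come from prediction errors of stochastic Frank-Wolfe LASSO iterates.

```python
import numpy as np

# Hypothetical sketch of Hedge over a grid of candidate relaxation
# parameters. The loss model below is a stand-in, not the paper's method.
rng = np.random.default_rng(0)

lambdas = np.logspace(-3, 1, 10)   # candidate relaxation parameters
weights = np.ones_like(lambdas)    # uniform initial weights
eta = 0.5                          # Hedge learning rate

for t in range(100):
    # Placeholder losses: a noisy bowl with minimum near lambda = 0.1.
    # In practice these would be observed prediction errors per candidate.
    losses = (np.log10(lambdas) + 1.0) ** 2 \
        + 0.1 * rng.standard_normal(len(lambdas))
    weights *= np.exp(-eta * losses)   # multiplicative Hedge update
    weights /= weights.sum()           # renormalize for numerical stability

best = lambdas[np.argmax(weights)]
print(f"selected lambda ~ {best:.4f}")
```

The exponential weighting concentrates mass on the candidates with the smallest cumulative loss, which is what allows a single online pass to stand in for the repeated refitting that Cross Validation requires.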