Sparse Linear Regression is Easy on Random Supports
Sparse linear regression is one of the most basic questions in machine learning and statistics. Here, we are given as input a design matrix $X \in \mathbb{R}^{n \times d}$ and measurements or labels $y \in \mathbb{R}^n$, where $y = Xw^* + \xi$ and $\xi$ is the noise in the measurements. Importantly, we have the additional constraint that the unknown signal vector $w^*$ is sparse: it has $k$ non-zero entries, where $k$ is much smaller than the ambient dimension $d$. Our goal is to output a prediction vector $\hat{y}$ that has small prediction error: $\frac{1}{n}\|Xw^* - \hat{y}\|_2^2$.

Information-theoretically, we know what is best possible in terms of measurements: under most natural noise distributions, we can get prediction error at most $\varepsilon$ with roughly $\frac{k \log d}{\varepsilon}$ samples. Computationally, this currently needs $d^{O(k)}$ run-time. Alternatively, with $O(d/\varepsilon)$ samples, we can get polynomial run-time. Thus, there is an exponential gap (in the dependence on $d$) between the two, and we do not know if it is possible to get $\mathrm{poly}(d)$ run-time with $O(k \log d)$ samples.

We give the first generic positive result for worst-case design matrices $X$: for any $\varepsilon > 0$, we show that if the support of $w^*$ is chosen at random, we can get prediction error $\varepsilon$ with $\mathrm{poly}(k, \log d, 1/\varepsilon)$ samples and run-time $\mathrm{poly}(d)$. This run-time holds for any design matrix $X$ with condition number up to $2^{\mathrm{poly}(\log d)}$.

Previously, such results were known for worst-case $w^*$, but only for random design matrices from well-behaved families, matrices that have a very low condition number ($O(1)$; e.g., as studied in compressed sensing), or those with special structural properties.
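To make the setup concrete, here is a minimal Python sketch of the problem (an illustration under simplifying assumptions, not the paper's algorithm): it plants a $k$-sparse signal on a uniformly random support and compares the exhaustive best-subset estimator, whose run-time scales as $d^{O(k)}$, against ordinary least squares, which is polynomial-time but needs on the order of $d$ samples. The Gaussian design and all constants are illustrative choices, not from the paper.

```python
# Minimal sketch of the sparse linear regression setup above -- an
# illustration of the problem, not the paper's algorithm. It contrasts
# the d^O(k)-time best-subset estimator with ordinary least squares.
import itertools

import numpy as np

rng = np.random.default_rng(0)
n, d, k, sigma = 40, 12, 2, 0.1  # samples, dimension, sparsity, noise level

# Design matrix X; Gaussian here only for simplicity (the paper's point is
# that X may be worst-case, which a simple simulation cannot capture).
X = rng.standard_normal((n, d))

# k-sparse signal w* whose support is chosen uniformly at random,
# matching the random-support model described above.
w_star = np.zeros(d)
support = rng.choice(d, size=k, replace=False)
w_star[support] = rng.standard_normal(k)

# Noisy measurements y = X w* + xi.
y = X @ w_star + sigma * rng.standard_normal(n)

def pred_error(w_hat: np.ndarray) -> float:
    """Prediction error (1/n) * ||X w* - X w_hat||_2^2 from the abstract."""
    r = X @ (w_star - w_hat)
    return float(r @ r) / n

# Best-subset selection: least squares on every size-k support.
# Statistically near-optimal but enumerates C(d, k) = d^O(k) supports.
best_w, best_rss = np.zeros(d), np.inf
for S in itertools.combinations(range(d), k):
    XS = X[:, list(S)]
    coef = np.linalg.lstsq(XS, y, rcond=None)[0]
    rss = float(np.sum((y - XS @ coef) ** 2))
    if rss < best_rss:
        best_rss = rss
        best_w = np.zeros(d)
        best_w[list(S)] = coef

# Ordinary least squares: polynomial time, but needs n on the order of d.
ols_w = np.linalg.lstsq(X, y, rcond=None)[0]

print(f"best-subset prediction error: {pred_error(best_w):.4f}")
print(f"OLS prediction error:         {pred_error(ols_w):.4f}")
```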