Overfitting Can Be Harmless for Basis Pursuit: Only to a Degree

Abstract

Recently, there has been significant interest in studying the generalization power of linear regression models in the overparameterized regime, with the hope that such analysis may provide a first step toward understanding why overparameterized deep neural networks generalize well even when they overfit the training data. Studies of minimum $\ell_2$-norm solutions that overfit the training data have suggested that such solutions exhibit the "double-descent" behavior, i.e., the test error decreases with the number of features $p$ in the overparameterized regime where $p$ is larger than the number of samples $n$. However, for linear models with i.i.d. Gaussian features, for large $p$ the model error of such minimum $\ell_2$-norm solutions approaches the "null risk," i.e., the error of a trivial estimator that always outputs zero, even when the noise is very low. In contrast, we study the overfitting solution of minimum $\ell_1$-norm, known as Basis Pursuit (BP) in the compressed sensing literature. Under a sparse true linear model with i.i.d. Gaussian features, we show that for a large range of $p$, up to a limit that grows exponentially with $n$, with high probability the model error of BP is upper bounded by a value that decreases with $p$ and is proportional to the noise level. To the best of our knowledge, this is the first result in the literature showing that, without any explicit regularization, in settings where both $p$ and the dimension of the data are much larger than $n$, the test error of a practical-to-compute overfitting solution can exhibit double descent and approach the order of the noise level independently of the null risk. Our upper bound also reveals a descent floor for BP that is proportional to the noise level. Furthermore, this descent floor is independent of $n$ and the null risk, but increases with the sparsity level of the true model.
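To make the comparison in the abstract concrete, here is a minimal sketch (not the paper's code) that contrasts the minimum $\ell_1$-norm interpolator (Basis Pursuit, solved as a linear program via SciPy's `linprog`) with the minimum $\ell_2$-norm interpolator (computed via the pseudoinverse) on a sparse Gaussian linear model. The dimensions `n`, `p`, the sparsity `s`, and the noise level `sigma` are illustrative choices, not values from the paper.

```python
# Sketch: Basis Pursuit (min l1-norm interpolator) vs. min l2-norm
# interpolator on a sparse true linear model with i.i.d. Gaussian features.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, p, s, sigma = 50, 500, 3, 0.1          # n samples, p features, s-sparse truth

X = rng.standard_normal((n, p))           # i.i.d. Gaussian features
beta_true = np.zeros(p)
beta_true[rng.choice(p, s, replace=False)] = 1.0
y = X @ beta_true + sigma * rng.standard_normal(n)

# Basis Pursuit: min ||beta||_1 s.t. X beta = y, written as an LP with
# beta = u - v, u, v >= 0 (linprog's default bounds), objective sum(u + v).
res = linprog(c=np.ones(2 * p),
              A_eq=np.hstack([X, -X]), b_eq=y,
              method="highs")
assert res.success
beta_bp = res.x[:p] - res.x[p:]

# Min l2-norm interpolator via the pseudoinverse.
beta_l2 = np.linalg.pinv(X) @ y

for name, b in [("BP (min l1)", beta_bp), ("min l2", beta_l2)]:
    print(f"{name:12s} model error ||b - beta*||^2 = {np.sum((b - beta_true)**2):.4f}")
```

Both estimators interpolate the noisy training data exactly; the point of the paper's analysis is that, in this sparse setting, the BP interpolator's model error stays on the order of the noise level over a wide range of $p$, while the $\ell_2$ interpolator's error drifts toward the null risk as $p$ grows.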
