We study the generalization capability of nearly-interpolating linear regressors: $\beta$'s whose training error $\tau$ is positive but small, i.e., below the noise floor. Under a random matrix theoretic assumption on the data distribution and an eigendecay assumption on the data covariance matrix $\Sigma$, we demonstrate that any near-interpolator exhibits rapid norm growth: for $\tau$ fixed, $\beta$ has squared $\ell_2$-norm $\|\beta\|_2^2 = \Omega(n^{\alpha})$, where $n$ is the number of samples and $\alpha > 1$ is the exponent of the eigendecay, i.e., $\lambda_i(\Sigma) \propto i^{-\alpha}$. This implies that existing data-independent norm-based bounds are necessarily loose. On the other hand, in the same regime we precisely characterize the asymptotic trade-off between interpolation and generalization. Our characterization reveals that larger norm scaling exponents $\alpha$ correspond to worse trade-offs between interpolation and generalization. We verify empirically that a similar phenomenon holds for nearly-interpolating shallow neural networks.
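As an illustration of the norm-growth claim (not the paper's code), the sketch below generates synthetic Gaussian data whose covariance eigenvalues follow the power law $\lambda_i = i^{-\alpha}$, tunes a ridge penalty by bisection so that the training error sits at a fixed $\tau$ below the noise floor, and records the resulting $\|\beta\|_2^2$ as $n$ grows. Ridge regression is used here only as one convenient way to produce near-interpolators; the function name, the choice $d = 4n$, and all other parameters are illustrative assumptions.

```python
import numpy as np


def near_interpolator_norm_sq(n, d, alpha, tau, noise_std=1.0, seed=0):
    """Return ||beta||_2^2 for a ridge regressor whose training MSE is
    tuned to approximately tau (below the noise floor noise_std**2).

    Illustrative data model: Gaussian features with power-law covariance
    eigenvalues lambda_i = i^{-alpha}.
    """
    rng = np.random.default_rng(seed)
    eigs = np.arange(1, d + 1, dtype=float) ** (-alpha)
    X = rng.standard_normal((n, d)) * np.sqrt(eigs)      # rows ~ N(0, Sigma)
    beta_star = rng.standard_normal(d) / np.sqrt(d)      # ground-truth signal
    y = X @ beta_star + noise_std * rng.standard_normal(n)

    # One SVD of X gives the ridge solution for every penalty cheaply.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y

    def ridge(lam):
        shrink = s / (s**2 + lam)                        # per-direction shrinkage
        beta = Vt.T @ (shrink * Uty)
        train_mse = np.mean((y - X @ beta) ** 2)
        return beta, train_mse

    # Bisect the penalty in log-space until the training error hits tau.
    lo, hi = 1e-12, 1e8
    for _ in range(100):
        lam = np.sqrt(lo * hi)
        beta, train_mse = ridge(lam)
        if train_mse > tau:
            hi = lam                                     # over-regularized: shrink penalty
        else:
            lo = lam                                     # under-regularized: grow penalty
    return float(np.sum(beta**2))


if __name__ == "__main__":
    alpha, tau = 2.0, 0.25        # eigendecay exponent, target training error
    for n in [100, 200, 400, 800]:
        norm_sq = near_interpolator_norm_sq(n, d=4 * n, alpha=alpha, tau=tau)
        print(f"n={n:4d}  ||beta||^2 ~ {norm_sq:.1f}")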
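```

If the stated $\Omega(n^{\alpha})$ growth holds in this toy setting, a log-log fit of the printed $\|\beta\|_2^2$ values against $n$ should yield a slope of roughly $\alpha$; this is only a heuristic check under the assumptions above, not a substitute for the paper's analysis.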