Overparameterized deep networks can interpolate noisy data while at the same time showing good generalization performance. Common intuition from polynomial regression suggests that large networks are able to sharply interpolate noisy data without considerably deviating from the ground-truth signal. At present, a precise characterization of this phenomenon for deep networks is missing. In this work, we present an empirical study of input-space smoothness of the loss landscape of deep networks over volumes around cleanly- and noisily-labeled training samples, as we systematically increase the number of model parameters and training epochs. Our findings show that loss sharpness in the input space follows both model- and epoch-wise double descent, with worse peaks observed around noisy labels. While small interpolating models sharply fit both clean and noisy data, large interpolating models express a smooth loss landscape, where noisy targets are predicted over large volumes around training data points, in contrast to existing intuition.
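
One way to make "loss sharpness in the input space over a volume around a training sample" concrete is to average the norm of the input-gradient of the loss over points sampled in a small ball around that sample. The sketch below is an illustrative proxy under that assumption, not necessarily the measure used in the paper. The names `input_sharpness`, `loss_fn`, `radius`, `n_samples`, and `eps` are hypothetical, and gradients are estimated by central finite differences so the function works with any scalar loss.

```python
import numpy as np

def input_sharpness(loss_fn, x, y, radius=0.1, n_samples=64, eps=1e-4, seed=0):
    """Mean norm of d(loss)/d(input) over points sampled uniformly in a ball around x.

    Hypothetical proxy for input-space sharpness; `loss_fn(x, y)` must return a scalar.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    norms = []
    for _ in range(n_samples):
        # Uniform sample in a d-dimensional ball: random direction,
        # radial coordinate scaled by u^(1/d).
        direction = rng.normal(size=d)
        direction /= np.linalg.norm(direction)
        point = x + radius * rng.uniform() ** (1.0 / d) * direction

        # Central finite-difference estimate of the input-gradient at `point`.
        grad = np.empty(d)
        for i in range(d):
            step = np.zeros(d)
            step[i] = eps
            grad[i] = (loss_fn(point + step, y) - loss_fn(point - step, y)) / (2 * eps)
        norms.append(np.linalg.norm(grad))
    return float(np.mean(norms))
```

Under this proxy, a model that fits a label only in a narrow region around the training point would give a large value, while a model that predicts the same target over a large surrounding volume would give a small one. Comparing the value at cleanly- and noisily-labeled samples across model sizes or epochs is the kind of comparison the abstract describes.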