Supervised Models Can Generalize Also When Trained on Random Labels

Abstract

The success of unsupervised learning raises the question of whether supervised models, too, can be trained without using the information in the output $y$. In this paper, we demonstrate that this is indeed possible. The key step is to formulate the model as a smoother, i.e., of the form $\hat{f} = Sy$, and to construct the smoother matrix $S$ independently of $y$, e.g., by training on random labels. We present a simple model selection criterion based on the distribution of the out-of-sample predictions and show that, in contrast to cross-validation, this criterion can be used even without access to $y$. We demonstrate on real and synthetic data that $y$-free trained versions of linear and kernel ridge regression, smoothing splines, and neural networks perform similarly to their standard, $y$-based versions and, most importantly, significantly better than random guessing.
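For kernel ridge regression the smoother form is explicit: $\hat{f} = K(K + \lambda I)^{-1} y$, so the smoother matrix $S_\lambda = K(K + \lambda I)^{-1}$ depends only on the inputs and on $\lambda$, never on $y$. The NumPy sketch below illustrates this $y$-free construction. The effective-degrees-of-freedom rule used to pick $\lambda$ (matching $\mathrm{tr}(S_\lambda)$ to a preset target) is a hypothetical stand-in for the paper's criterion based on the distribution of out-of-sample predictions, and the helper names and the `target_dof` value are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kernel(X, Z, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * bandwidth**2))

def smoother_matrix(K, lam):
    """Kernel ridge smoother S = K (K + lam I)^{-1}.

    K is symmetric and commutes with (K + lam I)^{-1}, so solving
    (K + lam I) S = K yields the same matrix. Note: S never sees y.
    """
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), K)

# Synthetic regression data.
n = 200
X = rng.uniform(-3.0, 3.0, size=(n, 1))
y = np.sin(2.0 * X[:, 0]) + 0.3 * rng.standard_normal(n)

K = gaussian_kernel(K=None or X, Z=X)

# y-free model selection (hypothetical stand-in criterion): choose the
# lambda whose effective degrees of freedom, trace(S_lambda), is closest
# to a preset target. No label information enters this choice.
target_dof = 20.0
lams = np.logspace(-4, 2, 50)
dofs = np.array([np.trace(smoother_matrix(K, lam)) for lam in lams])
lam_star = lams[np.abs(dofs - target_dof).argmin()]

# Only now is the real y used, once, through f_hat = S y. (For models
# trained iteratively, the paper instead obtains a y-free S by, e.g.,
# training on random labels.)
S = smoother_matrix(K, lam_star)
f_hat = S @ y
print(f"lambda = {lam_star:.3g}, train MSE = {np.mean((f_hat - y) ** 2):.3f}")
```

Because $S_{\lambda}$ here is fixed before the real labels are ever consulted, the final fit is "$y$-free trained" in the sense of the abstract; only the single multiplication $\hat{f} = Sy$ touches $y$.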

@article{allerbo2025_2505.11006,
  title={Supervised Models Can Generalize Also When Trained on Random Labels},
  author={Oskar Allerbo and Thomas B. Schön},
  journal={arXiv preprint arXiv:2505.11006},
  year={2025}
}