
Generalization performance of narrow one-hidden layer networks in the teacher-student setting

1 July 2025
Jean Barbier
Federica Gerace
Alessandro Ingrosso
Clarissa Lauditi
Enrico M. Malatesta
Gibbs Nwemadji
Rodrigo Pérez Ortiz
Main: 10 pages · 7 figures · Bibliography: 4 pages · Appendix: 20 pages
Abstract

Understanding the generalization abilities of neural networks for simple input-output distributions is crucial to account for their learning performance on real datasets. The classical teacher-student setting, where a network is trained on data generated by a label-producing teacher model, serves as a perfect theoretical test bed. In this context, a complete theoretical account of the performance of fully connected one-hidden-layer networks with generic activation functions is lacking. In this work, we develop such a general theory for narrow networks, i.e. networks whose number of hidden units is large, yet much smaller than the input dimension. Using methods from statistical physics, we provide closed-form expressions for the typical performance of both finite-temperature (Bayesian) and empirical risk minimization estimators, in terms of a small number of weight statistics. In doing so, we highlight a transition where hidden neurons specialize once the number of samples is sufficiently large and proportional to the number of parameters of the network. Our theory accurately predicts the generalization error of neural networks trained on regression or classification tasks with either noisy full-batch gradient descent (Langevin dynamics) or full-batch gradient descent.
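The teacher-student setup the abstract describes can be sketched numerically: a narrow one-hidden-layer teacher (hidden units K much smaller than input dimension d) generates labels, and a student of the same architecture is fit by full-batch gradient descent. This is a minimal illustrative sketch, not the paper's exact protocol; the tanh activation, the 1/sqrt(d) scaling, the fixed readout weights, and all hyperparameters here are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, n = 200, 4, 800          # input dim, hidden units, training samples
act = np.tanh                  # a generic activation (assumption)

def forward(W, v, X):
    """One-hidden-layer network: f(x) = v . act(W x / sqrt(d))."""
    return act(X @ W.T / np.sqrt(d)) @ v

# Fixed teacher weights generate the labels
W_t = rng.normal(size=(K, d))
v_t = rng.normal(size=K)
X = rng.normal(size=(n, d))
y = forward(W_t, v_t, X)

# Student: train first-layer weights by full-batch gradient descent on
# squared loss (readout kept fixed at the teacher's for simplicity)
W_s = rng.normal(size=(K, d))
v_s = v_t.copy()
lr = 0.5
for _ in range(2000):
    pre = X @ W_s.T / np.sqrt(d)                    # (n, K) preactivations
    err = act(pre) @ v_s - y                        # (n,) residuals
    # dL/dW: chain rule through tanh, averaged over the full batch
    grad = ((err[:, None] * (1 - np.tanh(pre) ** 2) * v_s).T @ X) / (n * np.sqrt(d))
    W_s -= lr * grad

# Generalization error estimated on fresh inputs
X_test = rng.normal(size=(2000, d))
gen_err = np.mean((forward(W_s, v_s, X_test) - forward(W_t, v_t, X_test)) ** 2)
```

Varying n relative to the parameter count K*d is how one would probe the specialization transition the abstract mentions, where hidden neurons of the student align with individual teacher neurons.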

@article{barbier2025_2507.00629,
  title={Generalization performance of narrow one-hidden layer networks in the teacher-student setting},
  author={Jean Barbier and Federica Gerace and Alessandro Ingrosso and Clarissa Lauditi and Enrico M. Malatesta and Gibbs Nwemadji and Rodrigo Pérez Ortiz},
  journal={arXiv preprint arXiv:2507.00629},
  year={2025}
}