arXiv:2305.06435

Phase transitions in the mini-batch size for sparse and dense two-layer neural networks

10 May 2023
Raffaele Marino
F. Ricci-Tersenghi
Abstract

The use of mini-batches of data in training artificial neural networks is nowadays very common. Despite its broad usage, theories explaining quantitatively how large or small the optimal mini-batch size should be are missing. This work presents a systematic attempt at understanding the role of the mini-batch size in training two-layer neural networks. Working in the teacher-student scenario, with a sparse teacher, and focusing on tasks of different complexity, we quantify the effects of changing the mini-batch size $m$. We find that the generalization performance of the student often depends strongly on $m$ and may undergo sharp phase transitions at a critical value $m_c$, such that for $m < m_c$ the training process fails, while for $m > m_c$ the student learns the teacher perfectly or generalizes very well. Phase transitions are induced by collective phenomena first discovered in statistical mechanics and later observed in many fields of science. Observing a phase transition by varying the mini-batch size across different architectures raises several questions about the role of this hyperparameter in the neural network learning process.
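
To make the setup in the abstract concrete, the following is a minimal sketch of a teacher-student experiment in which only the mini-batch size $m$ is varied. It is not the authors' code or protocol: the tanh activation, squared loss, fixed second layer, network sizes, learning rate, and every function name (`make_sparse_teacher`, `train_student`, `sweep`, and so on) are assumptions chosen for illustration. A toy run at this scale should not be expected to reproduce the transition at $m_c$ described above.

```python
import math
import random


def make_sparse_teacher(d, k, density, rng):
    """Hypothetical sparse teacher: k hidden units, each weight nonzero with probability `density`."""
    return [[rng.gauss(0, 1) if rng.random() < density else 0.0 for _ in range(d)]
            for _ in range(k)]


def forward(W, x):
    """Two-layer net with a fixed second layer: mean over units of tanh(w_j . x / sqrt(d))."""
    scale = math.sqrt(len(x))
    hidden = [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) / scale) for w in W]
    return sum(hidden) / len(W), hidden


def sample(teacher, d, n, rng):
    """Draw n Gaussian inputs and label them with the teacher network."""
    xs = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    ys = [forward(teacher, x)[0] for x in xs]
    return xs, ys


def train_student(xs, ys, k, m, lr, epochs, rng):
    """Mini-batch SGD on the squared loss 0.5 * (prediction - label)^2 with batch size m."""
    d = len(xs[0])
    W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)]  # dense student
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for start in range(0, len(idx), m):
            batch = idx[start:start + m]
            grad = [[0.0] * d for _ in range(k)]
            for i in batch:
                pred, hidden = forward(W, xs[i])
                err = pred - ys[i]
                for j in range(k):
                    # Chain rule: the derivative of tanh(z) is 1 - tanh(z)^2.
                    coef = err * (1.0 - hidden[j] ** 2) / (k * math.sqrt(d))
                    for t in range(d):
                        grad[j][t] += coef * xs[i][t]
            for j in range(k):
                for t in range(d):
                    W[j][t] -= lr * grad[j][t] / len(batch)  # average over the batch
    return W


def generalization_error(W, teacher, d, n_test, rng):
    """Half mean squared error between student and teacher on fresh inputs."""
    xs, ys = sample(teacher, d, n_test, rng)
    return sum((forward(W, x)[0] - y) ** 2 for x, y in zip(xs, ys)) / (2 * n_test)


def sweep(batch_sizes, d=10, k=2, n_train=64, seed=0):
    """Train one student per batch size on the same data and report its test error."""
    rng = random.Random(seed)
    teacher = make_sparse_teacher(d, k, 0.3, rng)
    xs, ys = sample(teacher, d, n_train, rng)
    results = {}
    for m in batch_sizes:
        student = train_student(xs, ys, k, m, lr=0.5, epochs=20,
                                rng=random.Random(seed + 1))
        results[m] = generalization_error(student, teacher, d, 200,
                                          random.Random(seed + 2))
    return results


if __name__ == "__main__":
    for m, err in sweep([1, 8, 64]).items():
        print(f"m = {m:3d}  generalization error = {err:.5f}")
```

Each student starts from the same initialization and sees the same training set, so differences in the reported error come only from the choice of $m$. Locating a critical value $m_c$ would require a much finer grid of batch sizes, larger networks, and averaging over many seeds.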
