Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning

31 January 2023

Papers citing "Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning"

Title
The Optimization Landscape of SGD Across the Feature Learning Strength | Alexander B. Atanasov, Alexandru Meterez, James B. Simon, C. Pehlevan | 43 2 0 | 06 Oct 2024
Weight fluctuations in (deep) linear neural networks and a derivation of the inverse-variance flatness relation | Markus Gross, A. Raulf, Christoph Räth | 19 0 0 | 23 Nov 2023
Enhancing Deep Neural Network Training Efficiency and Performance through Linear Prediction | Hejie Ying, Mengmeng Song, Yaohong Tang, S. Xiao, Zimin Xiao | 21 8 0 | 17 Oct 2023
On the different regimes of Stochastic Gradient Descent | Antonio Sclocchi, M. Wyart | 14 17 0 | 19 Sep 2023
Geometric compression of invariant manifolds in neural nets | J. Paccolat, Leonardo Petrini, Mario Geiger, Kevin Tyloo, M. Wyart [MLT] | 42 34 0 | 22 Jul 2020
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang [ODL] | 273 2,878 0 | 15 Sep 2016