

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning

7 June 2023
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, M. Belkin

Papers citing "Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning"

11 / 11 papers shown
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
Dayal Singh Kalra, Tianyu He, M. Barkeshli
17 Feb 2025
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé, Nicolas Anguita, A. Proca, Lukas Braun, D. Kunin, P. Mediano, Andrew M. Saxe
22 Sep 2024
Does SGD really happen in tiny subspaces?
Minhak Song, Kwangjun Ahn, Chulhee Yun
25 May 2024
Linear Recursive Feature Machines provably recover low-rank matrices
Adityanarayanan Radhakrishnan, Misha Belkin, D. Drusvyatskiy
09 Jan 2024
Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture
Libin Zhu, Chaoyue Liu, M. Belkin
GNN, AI4CE
24 May 2022
Understanding Gradient Descent on Edge of Stability in Deep Learning
Sanjeev Arora, Zhiyuan Li, A. Panigrahi
MLT
19 May 2022
Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect
Yuqing Wang, Minshuo Chen, T. Zhao, Molei Tao
AI4CE
07 Oct 2021
Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
29 Sep 2021
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari
ODL
04 Mar 2020
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
ODL
15 Sep 2016
Densely Connected Convolutional Networks
Gao Huang, Zhuang Liu, L. van der Maaten, Kilian Q. Weinberger
PINN, 3DV
25 Aug 2016