Understanding Gradient Descent on Edge of Stability in Deep Learning

Understanding Gradient Descent on Edge of Stability in Deep Learning

19 May 2022

Papers citing "Understanding Gradient Descent on Edge of Stability in Deep Learning"

16 / 16 papers shown

Title
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos Dayal Singh Kalra Tianyu He M. Barkeshli 47 4 0 17 Feb 2025
Dynamic Decoupling of Placid Terminal Attractor-based Gradient Descent Algorithm Jinwei Zhao Marco Gori Alessandro Betti S. Melacci Hongtao Zhang Jiedong Liu Xinhong Hei 18 0 0 10 Sep 2024
Does SGD really happen in tiny subspaces? Minhak Song Kwangjun Ahn Chulhee Yun 47 4 1 25 May 2024
Sharpness-Aware Minimization and the Edge of Stability Philip M. Long Peter L. Bartlett AAML 22 9 0 21 Sep 2023
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization Kaiyue Wen Zhiyuan Li Tengyu Ma FAtt 19 26 0 20 Jul 2023
How to escape sharp minima with random perturbations Kwangjun Ahn Ali Jadbabaie S. Sra ODL 19 6 0 25 May 2023
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond Itai Kreisler Mor Shpigel Nacson Daniel Soudry Y. Carmon 21 13 0 22 May 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability Jingfeng Wu Vladimir Braverman Jason D. Lee 24 16 0 19 May 2023
mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization Kayhan Behdin Qingquan Song Aman Gupta S. Keerthi Ayan Acharya Borja Ocejo Gregory Dexter Rajiv Khanna D. Durfee Rahul Mazumder AAML 13 7 0 19 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning Mihaela Rosca Yan Wu Chongli Qin Benoit Dherin 11 6 0 03 Feb 2023
Learning threshold neurons via the "edge of stability" Kwangjun Ahn Sébastien Bubeck Sinho Chewi Y. Lee Felipe Suarez Yi Zhang MLT 28 36 0 14 Dec 2022
How Does Sharpness-Aware Minimization Minimize Sharpness? Kaiyue Wen Tengyu Ma Zhiyuan Li AAML 18 47 0 10 Nov 2022
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example Xingyu Zhu Zixuan Wang Xiang Wang Mo Zhou Rong Ge 59 35 0 07 Oct 2022
The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima Peter L. Bartlett Philip M. Long Olivier Bousquet 60 34 0 04 Oct 2022
On the Implicit Bias in Deep-Learning Algorithms Gal Vardi FedML AI4CE 19 72 0 26 Aug 2022
Understanding the unstable convergence of gradient descent Kwangjun Ahn J. Zhang S. Sra 16 56 0 03 Apr 2022