Understanding Gradient Descent on Edge of Stability in Deep Learning

19 May 2022

Papers citing "Understanding Gradient Descent on Edge of Stability in Deep Learning"

18 / 18 papers shown

Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos Dayal Singh Kalra Tianyu He M. Barkeshli 47 4 0 17 Feb 2025
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training Zhanpeng Zhou Mingze Wang Yuchen Mao Bingrui Li Junchi Yan AAML 57 0 0 14 Oct 2024
Dynamic Decoupling of Placid Terminal Attractor-based Gradient Descent Algorithm Jinwei Zhao Marco Gori Alessandro Betti S. Melacci Hongtao Zhang Jiedong Liu Xinhong Hei 23 0 0 10 Sep 2024
Does SGD really happen in tiny subspaces? Minhak Song Kwangjun Ahn Chulhee Yun 56 4 1 25 May 2024
Sharpness-Aware Minimization and the Edge of Stability Philip M. Long Peter L. Bartlett AAML 25 9 0 21 Sep 2023
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization Kaiyue Wen Zhiyuan Li Tengyu Ma FAtt 22 26 0 20 Jul 2023
How to escape sharp minima with random perturbations Kwangjun Ahn Ali Jadbabaie S. Sra ODL 22 6 0 25 May 2023
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond Itai Kreisler Mor Shpigel Nacson Daniel Soudry Y. Carmon 23 13 0 22 May 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability Jingfeng Wu Vladimir Braverman Jason D. Lee 24 16 0 19 May 2023
mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization Kayhan Behdin Qingquan Song Aman Gupta S. Keerthi Ayan Acharya Borja Ocejo Gregory Dexter Rajiv Khanna D. Durfee Rahul Mazumder AAML 13 7 0 19 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning Mihaela Rosca Yan Wu Chongli Qin Benoit Dherin 16 6 0 03 Feb 2023
Learning threshold neurons via the "edge of stability" Kwangjun Ahn Sébastien Bubeck Sinho Chewi Y. Lee Felipe Suarez Yi Zhang MLT 31 36 0 14 Dec 2022
How Does Sharpness-Aware Minimization Minimize Sharpness? Kaiyue Wen Tengyu Ma Zhiyuan Li AAML 21 47 0 10 Nov 2022
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example Xingyu Zhu Zixuan Wang Xiang Wang Mo Zhou Rong Ge 62 35 0 07 Oct 2022
The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima Peter L. Bartlett Philip M. Long Olivier Bousquet 63 34 0 04 Oct 2022
On the Implicit Bias in Deep-Learning Algorithms Gal Vardi FedML AI4CE 22 72 0 26 Aug 2022
High-dimensional limit theorems for SGD: Effective dynamics and critical scaling Gerard Ben Arous Reza Gheissari Aukosh Jagannath 32 59 0 08 Jun 2022
Understanding the unstable convergence of gradient descent Kwangjun Ahn J. Zhang S. Sra 16 56 0 03 Apr 2022