Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2209.15594
Cited By
Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability
30 September 2022
Alexandru Damian
Eshaan Nichani
Jason D. Lee
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability"
25 / 25 papers shown
Title
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
Ruiqi Zhang
Jingfeng Wu
Licong Lin
Peter L. Bartlett
20
0
0
05 Apr 2025
MLPs at the EOC: Dynamics of Feature Learning
Dávid Terjék
MLT
41
0
0
18 Feb 2025
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
Dayal Singh Kalra
Tianyu He
M. Barkeshli
47
4
0
17 Feb 2025
The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov
Alexandru Meterez
James B. Simon
C. Pehlevan
43
2
0
06 Oct 2024
How Neural Networks Learn the Support is an Implicit Regularization Effect of SGD
Pierfrancesco Beneventano
Andrea Pinto
Tomaso A. Poggio
MLT
27
1
0
17 Jun 2024
Does SGD really happen in tiny subspaces?
Minhak Song
Kwangjun Ahn
Chulhee Yun
56
4
1
25 May 2024
Why is SAM Robust to Label Noise?
Christina Baek
Zico Kolter
Aditi Raghunathan
NoLa
AAML
33
9
0
06 May 2024
High dimensional analysis reveals conservative sharpening and a stochastic edge of stability
Atish Agarwala
Jeffrey Pennington
38
3
0
30 Apr 2024
Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman
Peter J. Liu
Lechao Xiao
Katie Everett
A. Alemi
...
Jascha Narain Sohl-Dickstein
Kelvin Xu
Jaehoon Lee
Justin Gilmer
Simon Kornblith
30
80
0
25 Sep 2023
Sharpness-Aware Minimization and the Edge of Stability
Philip M. Long
Peter L. Bartlett
AAML
25
9
0
21 Sep 2023
How to escape sharp minima with random perturbations
Kwangjun Ahn
Ali Jadbabaie
S. Sra
ODL
22
6
0
25 May 2023
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond
Itai Kreisler
Mor Shpigel Nacson
Daniel Soudry
Y. Carmon
23
13
0
22 May 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability
Jingfeng Wu
Vladimir Braverman
Jason D. Lee
24
16
0
19 May 2023
Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
Blake Bordelon
C. Pehlevan
MLT
30
29
0
06 Apr 2023
SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
Atish Agarwala
Yann N. Dauphin
17
20
0
17 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca
Yan Wu
Chongli Qin
Benoit Dherin
16
6
0
03 Feb 2023
Learning threshold neurons via the "edge of stability"
Kwangjun Ahn
Sébastien Bubeck
Sinho Chewi
Y. Lee
Felipe Suarez
Yi Zhang
MLT
31
36
0
14 Dec 2022
Second-order regression models exhibit progressive sharpening to the edge of stability
Atish Agarwala
Fabian Pedregosa
Jeffrey Pennington
25
26
0
10 Oct 2022
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example
Xingyu Zhu
Zixuan Wang
Xiang Wang
Mo Zhou
Rong Ge
64
35
0
07 Oct 2022
The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima
Peter L. Bartlett
Philip M. Long
Olivier Bousquet
63
34
0
04 Oct 2022
High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
Gerard Ben Arous
Reza Gheissari
Aukosh Jagannath
32
59
0
08 Jun 2022
Understanding Gradient Descent on Edge of Stability in Deep Learning
Sanjeev Arora
Zhiyuan Li
A. Panigrahi
MLT
75
88
0
19 May 2022
What Happens after SGD Reaches Zero Loss? --A Mathematical Framework
Zhiyuan Li
Tianhao Wang
Sanjeev Arora
MLT
83
98
0
13 Oct 2021
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz
Yasaman Bahri
Ethan Dyer
Jascha Narain Sohl-Dickstein
Guy Gur-Ari
ODL
153
232
0
04 Mar 2020
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,878
0
15 Sep 2016
1