Second-order regression models exhibit progressive sharpening to the edge of stability

International Conference on Machine Learning (ICML), 2022
10 October 2022
Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington

Papers citing "Second-order regression models exhibit progressive sharpening to the edge of stability"

27 papers shown

Convergence Rates for Gradient Descent on the Edge of Stability in Overparametrised Least Squares
Lachlan Ewen MacDonald, Hancheng Min, Leandro Palma, Salma Tarmoun, Ziqing Xu, Rene Vidal
MLT · 136 · 0 · 0 · 20 Oct 2025

Gradient Descent with Large Step Sizes: Chaos and Fractal Convergence Region
Shuang Liang, Guido Montúfar
231 · 0 · 0 · 29 Sep 2025

What Can Grokking Teach Us About Learning Under Nonstationarity?
Clare Lyle, Ghada Sokar, Razvan Pascanu, András György
144 · 3 · 0 · 26 Jul 2025

Variational Learning Finds Flatter Solutions at the Edge of Stability
Avrajit Ghosh, Bai Cong, Rio Yokota, S. Ravishankar, Rongrong Wang, Molei Tao, Mohammad Emtiyaz Khan, Thomas Möllenhoff
MLT · 319 · 1 · 0 · 15 Jun 2025

Understanding Sharpness Dynamics in NN Training with a Minimalist Example: The Effects of Dataset Difficulty, Depth, Stochasticity, and More
Geonhui Yoo, Minhak Song, Chulhee Yun
FAtt · 179 · 1 · 0 · 07 Jun 2025

A Minimalist Example of Edge-of-Stability and Progressive Sharpening
Liming Liu, Zixuan Zhang, S. Du, T. Zhao
308 · 1 · 0 · 04 Mar 2025

The Optimization Landscape of SGD Across the Feature Learning Strength
International Conference on Learning Representations (ICLR), 2024
Alexander B. Atanasov, Alexandru Meterez, James B. Simon, Cengiz Pehlevan
430 · 10 · 0 · 06 Oct 2024

Stepping on the Edge: Curvature Aware Learning Rate Tuners
Vincent Roulet, Atish Agarwala, Jean-Bastien Grill, Grzegorz Swirszcz, Mathieu Blondel, Fabian Pedregosa
351 · 4 · 0 · 08 Jul 2024

Normalization and effective learning rates in reinforcement learning
Clare Lyle, Zeyu Zheng, Khimya Khetarpal, James Martens, H. V. Hasselt, Razvan Pascanu, Will Dabney
332 · 29 · 0 · 01 Jul 2024

Deep linear networks for regression are implicitly regularized towards flat minima
Pierre Marion, Lénaic Chizat
ODL · 307 · 13 · 0 · 22 May 2024

High dimensional analysis reveals conservative sharpening and a stochastic edge of stability
Atish Agarwala, Jeffrey Pennington
359 · 9 · 0 · 30 Apr 2024

Learning Associative Memories with Gradient Descent
Vivien A. Cabannes, Berfin Simsek, A. Bietti
224 · 13 · 0 · 28 Feb 2024

Feature learning as alignment: a structural property of gradient descent in non-linear neural networks
Daniel Beaglehole, Ioannis Mitliagkas, Atish Agarwala
MLT · 283 · 4 · 0 · 07 Feb 2024

Neglected Hessian component explains mysteries in Sharpness regularization
Yann N. Dauphin, Atish Agarwala, Hossein Mobahi
FAtt · 401 · 12 · 0 · 19 Jan 2024

On the Interplay Between Stepsize Tuning and Progressive Sharpening
Vincent Roulet, Atish Agarwala, Fabian Pedregosa
272 · 5 · 0 · 30 Nov 2023

Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults
Prin Phunyaphibarn, Junghyun Lee, Bohan Wang, Huishuai Zhang, Chulhee Yun
291 · 1 · 0 · 25 Nov 2023

How connectivity structure shapes rich and lazy learning in neural circuits
International Conference on Learning Representations (ICLR), 2023
Yuhan Helena Liu, A. Baratin, Jonathan H. Cornford, Stefan Mihalas, E. Shea-Brown, Guillaume Lajoie
414 · 22 · 0 · 12 Oct 2023

From Stability to Chaos: Analyzing Gradient Descent Dynamics in Quadratic Regression
Xuxing Chen, Krishnakumar Balasubramanian, Promit Ghosal, Bhavya Agrawalla
258 · 10 · 0 · 02 Oct 2023

Sharpness-Aware Minimization and the Edge of Stability
Journal of Machine Learning Research (JMLR), 2023
Philip M. Long, Peter L. Bartlett
AAML · 650 · 15 · 0 · 21 Sep 2023

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
International Conference on Machine Learning (ICML), 2023
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, M. Belkin
422 · 25 · 0 · 07 Jun 2023

Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond
International Conference on Machine Learning (ICML), 2023
Itai Kreisler, Mor Shpigel Nacson, Daniel Soudry, Y. Carmon
225 · 16 · 0 · 22 May 2023

Loss Spike in Training Neural Networks
Journal of Computational Mathematics (JCM), 2023
Zhongwang Zhang, Z. Xu
214 · 13 · 0 · 20 May 2023

Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
Neural Information Processing Systems (NeurIPS), 2023
Blake Bordelon, Cengiz Pehlevan
MLT · 327 · 39 · 0 · 06 Apr 2023

Phase diagram of early training dynamics in deep neural networks: effect of the learning rate, depth, and width
Neural Information Processing Systems (NeurIPS), 2023
Dayal Singh Kalra, M. Barkeshli
303 · 15 · 0 · 23 Feb 2023

SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
International Conference on Machine Learning (ICML), 2023
Atish Agarwala, Yann N. Dauphin
196 · 24 · 0 · 17 Feb 2023

Catapult Dynamics and Phase Transitions in Quadratic Nets
Journal of Statistical Mechanics: Theory and Experiment (J. Stat. Mech.), 2023
David Meltzer, Min Chen, Sergii Strelchuk
335 · 10 · 0 · 18 Jan 2023

Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability
International Conference on Learning Representations (ICLR), 2022
Alexandru Damian, Eshaan Nichani, Jason D. Lee
294 · 107 · 0 · 30 Sep 2022