Second-order regression models exhibit progressive sharpening to the edge of stability

International Conference on Machine Learning (ICML), 2022

10 October 2022

Papers citing "Second-order regression models exhibit progressive sharpening to the edge of stability"

27 / 27 papers shown

Convergence Rates for Gradient Descent on the Edge of Stability in Overparametrised Least Squares

Lachlan Ewen MacDonald

136

20 Oct 2025

Gradient Descent with Large Step Sizes: Chaos and Fractal Convergence Region

Shuang Liang

Guido Montúfar

231

29 Sep 2025

What Can Grokking Teach Us About Learning Under Nonstationarity?

144

26 Jul 2025

Variational Learning Finds Flatter Solutions at the Edge of Stability

Mohammad Emtiyaz Khan

Thomas Möllenhoff

MLT

319

15 Jun 2025

Understanding Sharpness Dynamics in NN Training with a Minimalist Example: The Effects of Dataset Difficulty, Depth, Stochasticity, and More

179

07 Jun 2025

A Minimalist Example of Edge-of-Stability and Progressive Sharpening

308

04 Mar 2025

The Optimization Landscape of SGD Across the Feature Learning Strength

International Conference on Learning Representations (ICLR), 2024

Alexander B. Atanasov

Alexandru Meterez

James B. Simon

Cengiz Pehlevan

430

06 Oct 2024

Stepping on the Edge: Curvature Aware Learning Rate Tuners

351

08 Jul 2024

Normalization and effective learning rates in reinforcement learning

Will Dabney

332

01 Jul 2024

Deep linear networks for regression are implicitly regularized towards flat minima

Pierre Marion

Lénaic Chizat

ODL

307

22 May 2024

High dimensional analysis reveals conservative sharpening and a stochastic edge of stability

Atish Agarwala

Jeffrey Pennington

359

30 Apr 2024

Learning Associative Memories with Gradient Descent

Vivien A. Cabannes

Berfin Simsek

A. Bietti

224

28 Feb 2024

Feature learning as alignment: a structural property of gradient descent in non-linear neural networks

283

07 Feb 2024

Neglected Hessian component explains mysteries in Sharpness regularization

401

19 Jan 2024

On the Interplay Between Stepsize Tuning and Progressive Sharpening

Vincent Roulet

Atish Agarwala

Fabian Pedregosa

272

30 Nov 2023

Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults

291

25 Nov 2023

How connectivity structure shapes rich and lazy learning in neural circuits

International Conference on Learning Representations (ICLR), 2023

414

12 Oct 2023

From Stability to Chaos: Analyzing Gradient Descent Dynamics in Quadratic Regression

Xuxing Chen

Krishnakumar Balasubramanian

Promit Ghosal

Bhavya Agrawalla

258

02 Oct 2023

Sharpness-Aware Minimization and the Edge of Stability

Journal of Machine Learning Research (JMLR), 2023

Philip M. Long

Peter L. Bartlett

AAML

650

21 Sep 2023

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning

International Conference on Machine Learning (ICML), 2023

Libin Zhu

Chaoyue Liu

Adityanarayanan Radhakrishnan

M. Belkin

422

07 Jun 2023

Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond

International Conference on Machine Learning (ICML), 2023

225

22 May 2023

Loss Spike in Training Neural Networks

Journal of Computational Mathematics (JCM), 2023

Zhongwang Zhang

Z. Xu

214

20 May 2023

Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks

Neural Information Processing Systems (NeurIPS), 2023

Blake Bordelon

Cengiz Pehlevan

MLT

327

06 Apr 2023

Phase diagram of early training dynamics in deep neural networks: effect of the learning rate, depth, and width

Neural Information Processing Systems (NeurIPS), 2023

Dayal Singh Kalra

M. Barkeshli

303

23 Feb 2023

SAM operates far from home: eigenvalue regularization as a dynamical phenomenon

International Conference on Machine Learning (ICML), 2023

Atish Agarwala

Yann N. Dauphin

196

17 Feb 2023

Catapult Dynamics and Phase Transitions in Quadratic Nets

Journal of Statistical Mechanics: Theory and Experiment (J. Stat. Mech.), 2023

David Meltzer

Min Chen

Sergii Strelchuk

335

18 Jan 2023

Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability

International Conference on Learning Representations (ICLR), 2022

Alexandru Damian

Eshaan Nichani

Jason D. Lee

294

107

30 Sep 2022