Second-order regression models exhibit progressive sharpening to the edge of stability
Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington
arXiv:2210.04860, 10 October 2022
Papers citing "Second-order regression models exhibit progressive sharpening to the edge of stability" (24 of 24 papers shown):
A Minimalist Example of Edge-of-Stability and Progressive Sharpening
Liming Liu, Zixuan Zhang, S. Du, T. Zhao (04 Mar 2025)

The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov, Alexandru Meterez, James B. Simon, C. Pehlevan (06 Oct 2024)

Stepping on the Edge: Curvature Aware Learning Rate Tuners
Vincent Roulet, Atish Agarwala, Jean-Bastien Grill, Grzegorz Swirszcz, Mathieu Blondel, Fabian Pedregosa (08 Jul 2024)

Normalization and effective learning rates in reinforcement learning
Clare Lyle, Zeyu Zheng, Khimya Khetarpal, James Martens, H. V. Hasselt, Razvan Pascanu, Will Dabney (01 Jul 2024)

Deep linear networks for regression are implicitly regularized towards flat minima [ODL]
Pierre Marion, Lénaic Chizat (22 May 2024)

High dimensional analysis reveals conservative sharpening and a stochastic edge of stability
Atish Agarwala, Jeffrey Pennington (30 Apr 2024)

Learning Associative Memories with Gradient Descent
Vivien A. Cabannes, Berfin Simsek, A. Bietti (28 Feb 2024)

Feature learning as alignment: a structural property of gradient descent in non-linear neural networks [MLT]
Daniel Beaglehole, Ioannis Mitliagkas, Atish Agarwala (07 Feb 2024)

Neglected Hessian component explains mysteries in Sharpness regularization [FAtt]
Yann N. Dauphin, Atish Agarwala, Hossein Mobahi (19 Jan 2024)

On the Interplay Between Stepsize Tuning and Progressive Sharpening
Vincent Roulet, Atish Agarwala, Fabian Pedregosa (30 Nov 2023)

Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults
Prin Phunyaphibarn, Junghyun Lee, Bohan Wang, Huishuai Zhang, Chulhee Yun (25 Nov 2023)

How connectivity structure shapes rich and lazy learning in neural circuits
Yuhan Helena Liu, A. Baratin, Jonathan H. Cornford, Stefan Mihalas, E. Shea-Brown, Guillaume Lajoie (12 Oct 2023)

From Stability to Chaos: Analyzing Gradient Descent Dynamics in Quadratic Regression
Xuxing Chen, Krishnakumar Balasubramanian, Promit Ghosal, Bhavya Agrawalla (02 Oct 2023)

Sharpness-Aware Minimization and the Edge of Stability [AAML]
Philip M. Long, Peter L. Bartlett (21 Sep 2023)

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, M. Belkin (07 Jun 2023)

Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond
Itai Kreisler, Mor Shpigel Nacson, Daniel Soudry, Y. Carmon (22 May 2023)

Loss Spike in Training Neural Networks
Zhongwang Zhang, Z. Xu (20 May 2023)

Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks [MLT]
Blake Bordelon, C. Pehlevan (06 Apr 2023)

Phase diagram of early training dynamics in deep neural networks: effect of the learning rate, depth, and width
Dayal Singh Kalra, M. Barkeshli (23 Feb 2023)

SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
Atish Agarwala, Yann N. Dauphin (17 Feb 2023)

Catapult Dynamics and Phase Transitions in Quadratic Nets
David Meltzer, Junyu Liu (18 Jan 2023)

Understanding Edge-of-Stability Training Dynamics with a Minimalist Example
Xingyu Zhu, Zixuan Wang, Xiang Wang, Mo Zhou, Rong Ge (07 Oct 2022)

Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability
Alexandru Damian, Eshaan Nichani, Jason D. Lee (30 Sep 2022)

The large learning rate phase of deep learning: the catapult mechanism [ODL]
Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari (04 Mar 2020)