arXiv:2210.01513
The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima
4 October 2022
Peter L. Bartlett
Philip M. Long
Olivier Bousquet
Papers citing "The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima"
32 / 32 papers shown
Title
Layer-wise Adaptive Gradient Norm Penalizing Method for Efficient and Accurate Deep Learning
Sunwoo Lee
91
0
0
18 Mar 2025
Sharpness-Aware Minimization: General Analysis and Improved Rates
Dimitris Oikonomou
Nicolas Loizou
60
0
0
04 Mar 2025
Reweighting Local Minima with Tilted SAM
Tian Li
Tianyi Zhou
J. Bilmes
28
0
0
30 Oct 2024
Simplicity Bias via Global Convergence of Sharpness Minimization
Khashayar Gatmiry
Zhiyuan Li
Sashank J. Reddi
Stefanie Jegelka
19
1
0
21 Oct 2024
Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems
Bingcong Li
Liang Zhang
Niao He
36
3
0
18 Oct 2024
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training
Zhanpeng Zhou
Mingze Wang
Yuchen Mao
Bingrui Li
Junchi Yan
AAML
57
0
0
14 Oct 2024
Forget Sharpness: Perturbed Forgetting of Model Biases Within SAM Dynamics
Ankit Vani
Frederick Tung
Gabriel L. Oliveira
Hossein Sharifi-Noghabi
AAML
28
0
0
10 Jun 2024
A Universal Class of Sharpness-Aware Minimization Algorithms
B. Tahmasebi
Ashkan Soleymani
Dara Bahri
Stefanie Jegelka
P. Jaillet
AAML
41
2
0
06 Jun 2024
Does SGD really happen in tiny subspaces?
Minhak Song
Kwangjun Ahn
Chulhee Yun
47
4
1
25 May 2024
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms
Toki Tahmid Inan
Mingrui Liu
Amarda Shehu
19
0
0
01 Mar 2024
Learning Associative Memories with Gradient Descent
Vivien A. Cabannes
Berfin Simsek
A. Bietti
30
3
0
28 Feb 2024
A Precise Characterization of SGD Stability Using Loss Surface Geometry
Gregory Dexter
Borja Ocejo
S. Keerthi
Aman Gupta
Ayan Acharya
Rajiv Khanna
MLT
15
0
0
22 Jan 2024
Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy
Chengli Tan
Jiangshe Zhang
Junmin Liu
Yicheng Wang
Yunda Hao
AAML
20
1
0
14 Jan 2024
Why Does Sharpness-Aware Minimization Generalize Better Than SGD?
Zixiang Chen
Junkai Zhang
Yiwen Kou
Xiangning Chen
Cho-Jui Hsieh
Quanquan Gu
10
11
0
11 Oct 2023
Enhancing Sharpness-Aware Optimization Through Variance Suppression
Bingcong Li
G. Giannakis
AAML
19
19
0
27 Sep 2023
Sharpness-Aware Minimization and the Edge of Stability
Philip M. Long
Peter L. Bartlett
AAML
22
9
0
21 Sep 2023
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization
Kaiyue Wen
Zhiyuan Li
Tengyu Ma
FAtt
19
26
0
20 Jul 2023
Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima
Dongkuk Si
Chulhee Yun
26
15
0
16 Jun 2023
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
Runtian Zhai
Bing Liu
Andrej Risteski
Zico Kolter
Pradeep Ravikumar
SSL
22
9
0
01 Jun 2023
Sharpness-Aware Minimization Leads to Low-Rank Features
Maksym Andriushchenko
Dara Bahri
H. Mobahi
Nicolas Flammarion
AAML
17
25
0
25 May 2023
How to escape sharp minima with random perturbations
Kwangjun Ahn
Ali Jadbabaie
S. Sra
ODL
19
6
0
25 May 2023
The Crucial Role of Normalization in Sharpness-Aware Minimization
Yan Dai
Kwangjun Ahn
S. Sra
21
17
0
24 May 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability
Jingfeng Wu
Vladimir Braverman
Jason D. Lee
24
16
0
19 May 2023
On Statistical Properties of Sharpness-Aware Minimization: Provable Guarantees
Kayhan Behdin
Rahul Mazumder
14
5
0
23 Feb 2023
mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization
Kayhan Behdin
Qingquan Song
Aman Gupta
S. Keerthi
Ayan Acharya
Borja Ocejo
Gregory Dexter
Rajiv Khanna
D. Durfee
Rahul Mazumder
AAML
13
7
0
19 Feb 2023
SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
Atish Agarwala
Yann N. Dauphin
14
20
0
17 Feb 2023
An SDE for Modeling SAM: Theory and Insights
Enea Monzio Compagnoni
Luca Biggio
Antonio Orvieto
F. Proske
Hans Kersting
Aurélien Lucchi
9
10
0
19 Jan 2023
Understanding Gradient Descent on Edge of Stability in Deep Learning
Sanjeev Arora
Zhiyuan Li
A. Panigrahi
MLT
72
88
0
19 May 2022
Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri
H. Mobahi
Yi Tay
119
82
0
16 Oct 2021
Efficient Sharpness-aware Minimization for Improved Training of Neural Networks
Jiawei Du
Hanshu Yan
Jiashi Feng
Joey Tianyi Zhou
Liangli Zhen
Rick Siow Mong Goh
Vincent Y. F. Tan
AAML
99
132
0
07 Oct 2021
Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
Jeffrey Negrea
Mahdi Haghifam
Gintare Karolina Dziugaite
Ashish Khisti
Daniel M. Roy
FedML
105
146
0
06 Nov 2019
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,878
0
15 Sep 2016