How Does Sharpness-Aware Minimization Minimize Sharpness?
Kaiyue Wen, Tengyu Ma, Zhiyuan Li
arXiv 2211.05729 · 10 November 2022 · AAML

Papers citing "How Does Sharpness-Aware Minimization Minimize Sharpness?" (37 of 37 papers shown)

Preconditioned Sharpness-Aware Minimization: Unifying Analysis and a Novel Learning Algorithm
Yilang Zhang, Bingcong Li, G. Giannakis · AAML · 11 Jan 2025

SSE-SAM: Balancing Head and Tail Classes Gradually through Stage-Wise SAM
Xingyu Lyu, Qianqian Xu, Zhiyong Yang, Shaojie Lyu, Qingming Huang · 18 Dec 2024

Enhancing generalization in high energy physics using white-box adversarial attacks
Franck Rothen, Samuel Klein, Matthew Leigh, T. Golling · AAML · 14 Nov 2024

Reweighting Local Minima with Tilted SAM
Tian Li, Tianyi Zhou, J. Bilmes · 30 Oct 2024

Simplicity Bias via Global Convergence of Sharpness Minimization
Khashayar Gatmiry, Zhiyuan Li, Sashank J. Reddi, Stefanie Jegelka · 21 Oct 2024

Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems
Bingcong Li, Liang Zhang, Niao He · 18 Oct 2024

CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models
Rui Zeng, Xi Chen, Yuwen Pu, Xuhong Zhang, Tianyu Du, Shouling Ji · 02 Sep 2024

A Universal Class of Sharpness-Aware Minimization Algorithms
B. Tahmasebi, Ashkan Soleymani, Dara Bahri, Stefanie Jegelka, P. Jaillet · AAML · 06 Jun 2024

Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning
Jacob Mitchell Springer, Vaishnavh Nagarajan, Aditi Raghunathan · 30 May 2024

Why is SAM Robust to Label Noise?
Christina Baek, Zico Kolter, Aditi Raghunathan · NoLa, AAML · 06 May 2024

Why are Sensitive Functions Hard for Transformers?
Michael Hahn, Mark Rofin · 15 Feb 2024

A Precise Characterization of SGD Stability Using Loss Surface Geometry
Gregory Dexter, Borja Ocejo, S. Keerthi, Aman Gupta, Ayan Acharya, Rajiv Khanna · MLT · 22 Jan 2024

Neglected Hessian component explains mysteries in Sharpness regularization
Yann N. Dauphin, Atish Agarwala, Hossein Mobahi · FAtt · 19 Jan 2024

Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy
Chengli Tan, Jiangshe Zhang, Junmin Liu, Yicheng Wang, Yunda Hao · AAML · 14 Jan 2024

Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
Kaifeng Lyu, Jikai Jin, Zhiyuan Li, Simon S. Du, Jason D. Lee, Wei Hu · AI4CE · 30 Nov 2023

Why Does Sharpness-Aware Minimization Generalize Better Than SGD?
Zixiang Chen, Junkai Zhang, Yiwen Kou, Xiangning Chen, Cho-Jui Hsieh, Quanquan Gu · 11 Oct 2023

Enhancing Sharpness-Aware Optimization Through Variance Suppression
Bingcong Li, G. Giannakis · AAML · 27 Sep 2023

Generalization error bounds for iterative learning algorithms with bounded updates
Jingwen Fu, Nanning Zheng · 10 Sep 2023

The Marginal Value of Momentum for Small Learning Rate SGD
Runzhe Wang, Sadhika Malladi, Tianhao Wang, Kaifeng Lyu, Zhiyuan Li · ODL · 27 Jul 2023

Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization
Kaiyue Wen, Zhiyuan Li, Tengyu Ma · FAtt · 20 Jul 2023

Why Does Little Robustness Help? Understanding and Improving Adversarial Transferability from Surrogate Training
Yechao Zhang, Shengshan Hu, Leo Yu Zhang, Junyu Shi, Minghui Li, Xiaogeng Liu, Wei Wan, Hai Jin · AAML · 15 Jul 2023

FAM: Relative Flatness Aware Minimization
Linara Adilova, Amr Abourayya, Jianning Li, Amin Dada, Henning Petzka, Jan Egger, Jens Kleesiek, Michael Kamp · ODL · 05 Jul 2023

The Inductive Bias of Flatness Regularization for Deep Matrix Factorization
Khashayar Gatmiry, Zhiyuan Li, Ching-Yao Chuang, Sashank J. Reddi, Tengyu Ma, Stefanie Jegelka · ODL · 22 Jun 2023

Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima
Dongkuk Si, Chulhee Yun · 16 Jun 2023

How to escape sharp minima with random perturbations
Kwangjun Ahn, Ali Jadbabaie, S. Sra · ODL · 25 May 2023

The Crucial Role of Normalization in Sharpness-Aware Minimization
Yan Dai, Kwangjun Ahn, S. Sra · 24 May 2023

Sharpness-Aware Data Poisoning Attack
Pengfei He, Han Xu, J. Ren, Yingqian Cui, Hui Liu, Charu C. Aggarwal, Jiliang Tang · AAML · 24 May 2023

Per-Example Gradient Regularization Improves Learning Signals from Noisy Data
Xuran Meng, Yuan Cao, Difan Zou · 31 Mar 2023

mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization
Kayhan Behdin, Qingquan Song, Aman Gupta, S. Keerthi, Ayan Acharya, Borja Ocejo, Gregory Dexter, Rajiv Khanna, D. Durfee, Rahul Mazumder · AAML · 19 Feb 2023

SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
Atish Agarwala, Yann N. Dauphin · 17 Feb 2023

A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu, Anthony Man-Cho So, Nigel Collier · 24 Jan 2023

An SDE for Modeling SAM: Theory and Insights
Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto, F. Proske, Hans Kersting, Aurélien Lucchi · 19 Jan 2023

Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma · AI4CE · 25 Oct 2022

The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima
Peter L. Bartlett, Philip M. Long, Olivier Bousquet · 04 Oct 2022

Understanding Gradient Descent on Edge of Stability in Deep Learning
Sanjeev Arora, Zhiyuan Li, A. Panigrahi · MLT · 19 May 2022

What Happens after SGD Reaches Zero Loss? --A Mathematical Framework
Zhiyuan Li, Tianhao Wang, Sanjeev Arora · MLT · 13 Oct 2021

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang · ODL · 15 Sep 2016