How Does Sharpness-Aware Minimization Minimize Sharpness?
arXiv:2211.05729

10 November 2022
Kaiyue Wen, Tengyu Ma, Zhiyuan Li (AAML)
ArXiv · PDF · HTML
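
For context on the algorithm this paper analyzes: Sharpness-Aware Minimization (SAM, Foret et al., 2021) takes its descent gradient at an adversarially perturbed point, found by a first-order ascent step inside an L2 ball of radius rho around the current weights. Below is a minimal NumPy sketch of one SAM step on a toy quadratic loss; the toy objective, function names, and hyperparameter values are illustrative assumptions, not code from the paper.

```python
import numpy as np

def loss(w):
    # Toy quadratic objective (an illustrative stand-in, not from the paper).
    return 0.5 * float(w @ w)

def grad(w):
    # Gradient of 0.5 * ||w||^2.
    return w

def sam_step(w, lr=0.1, rho=0.05):
    """One SAM update: ascend to the approximate worst-case point in an
    L2 ball of radius rho, then descend using the gradient taken there."""
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # first-order ascent direction
    g_sam = grad(w + eps)                        # gradient at the perturbed point
    return w - lr * g_sam                        # applied at the original weights

w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w)
print(w)  # -> close to the minimizer at the origin
```

Even on this toy problem the iterates stall at a distance of order lr * rho from the optimum rather than converging exactly, which is the phenomenon studied in entry 24 of the list below ("Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima").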

Papers citing "How Does Sharpness-Aware Minimization Minimize Sharpness?"

All 37 citing papers are shown below.

 1. Preconditioned Sharpness-Aware Minimization: Unifying Analysis and a Novel Learning Algorithm
    Yilang Zhang, Bingcong Li, G. Giannakis (AAML) · 11 Jan 2025
 2. SSE-SAM: Balancing Head and Tail Classes Gradually through Stage-Wise SAM
    Xingyu Lyu, Qianqian Xu, Zhiyong Yang, Shaojie Lyu, Qingming Huang · 18 Dec 2024
 3. Enhancing generalization in high energy physics using white-box adversarial attacks
    Franck Rothen, Samuel Klein, Matthew Leigh, T. Golling (AAML) · 14 Nov 2024
 4. Reweighting Local Minima with Tilted SAM
    Tian Li, Tianyi Zhou, J. Bilmes · 30 Oct 2024
 5. Simplicity Bias via Global Convergence of Sharpness Minimization
    Khashayar Gatmiry, Zhiyuan Li, Sashank J. Reddi, Stefanie Jegelka · 21 Oct 2024
 6. Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems
    Bingcong Li, Liang Zhang, Niao He · 18 Oct 2024
 7. CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models
    Rui Zeng, Xi Chen, Yuwen Pu, Xuhong Zhang, Tianyu Du, Shouling Ji · 02 Sep 2024
 8. A Universal Class of Sharpness-Aware Minimization Algorithms
    B. Tahmasebi, Ashkan Soleymani, Dara Bahri, Stefanie Jegelka, P. Jaillet (AAML) · 06 Jun 2024
 9. Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning
    Jacob Mitchell Springer, Vaishnavh Nagarajan, Aditi Raghunathan · 30 May 2024
10. Why is SAM Robust to Label Noise?
    Christina Baek, Zico Kolter, Aditi Raghunathan (NoLa, AAML) · 06 May 2024
11. Why are Sensitive Functions Hard for Transformers?
    Michael Hahn, Mark Rofin · 15 Feb 2024
12. A Precise Characterization of SGD Stability Using Loss Surface Geometry
    Gregory Dexter, Borja Ocejo, S. Keerthi, Aman Gupta, Ayan Acharya, Rajiv Khanna (MLT) · 22 Jan 2024
13. Neglected Hessian component explains mysteries in Sharpness regularization
    Yann N. Dauphin, Atish Agarwala, Hossein Mobahi (FAtt) · 19 Jan 2024
14. Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy
    Chengli Tan, Jiangshe Zhang, Junmin Liu, Yicheng Wang, Yunda Hao (AAML) · 14 Jan 2024
15. Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
    Kaifeng Lyu, Jikai Jin, Zhiyuan Li, Simon S. Du, Jason D. Lee, Wei Hu (AI4CE) · 30 Nov 2023
16. Why Does Sharpness-Aware Minimization Generalize Better Than SGD?
    Zixiang Chen, Junkai Zhang, Yiwen Kou, Xiangning Chen, Cho-Jui Hsieh, Quanquan Gu · 11 Oct 2023
17. Enhancing Sharpness-Aware Optimization Through Variance Suppression
    Bingcong Li, G. Giannakis (AAML) · 27 Sep 2023
18. Generalization error bounds for iterative learning algorithms with bounded updates
    Jingwen Fu, Nanning Zheng · 10 Sep 2023
19. The Marginal Value of Momentum for Small Learning Rate SGD
    Runzhe Wang, Sadhika Malladi, Tianhao Wang, Kaifeng Lyu, Zhiyuan Li (ODL) · 27 Jul 2023
20. Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization
    Kaiyue Wen, Zhiyuan Li, Tengyu Ma (FAtt) · 20 Jul 2023
21. Why Does Little Robustness Help? Understanding and Improving Adversarial Transferability from Surrogate Training
    Yechao Zhang, Shengshan Hu, Leo Yu Zhang, Junyu Shi, Minghui Li, Xiaogeng Liu, Wei Wan, Hai Jin (AAML) · 15 Jul 2023
22. FAM: Relative Flatness Aware Minimization
    Linara Adilova, Amr Abourayya, Jianning Li, Amin Dada, Henning Petzka, Jan Egger, Jens Kleesiek, Michael Kamp (ODL) · 05 Jul 2023
23. The Inductive Bias of Flatness Regularization for Deep Matrix Factorization
    Khashayar Gatmiry, Zhiyuan Li, Ching-Yao Chuang, Sashank J. Reddi, Tengyu Ma, Stefanie Jegelka (ODL) · 22 Jun 2023
24. Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima
    Dongkuk Si, Chulhee Yun · 16 Jun 2023
25. How to escape sharp minima with random perturbations
    Kwangjun Ahn, Ali Jadbabaie, S. Sra (ODL) · 25 May 2023
26. The Crucial Role of Normalization in Sharpness-Aware Minimization
    Yan Dai, Kwangjun Ahn, S. Sra · 24 May 2023
27. Sharpness-Aware Data Poisoning Attack
    Pengfei He, Han Xu, J. Ren, Yingqian Cui, Hui Liu, Charu C. Aggarwal, Jiliang Tang (AAML) · 24 May 2023
28. Per-Example Gradient Regularization Improves Learning Signals from Noisy Data
    Xuran Meng, Yuan Cao, Difan Zou · 31 Mar 2023
29. mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization
    Kayhan Behdin, Qingquan Song, Aman Gupta, S. Keerthi, Ayan Acharya, Borja Ocejo, Gregory Dexter, Rajiv Khanna, D. Durfee, Rahul Mazumder (AAML) · 19 Feb 2023
30. SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
    Atish Agarwala, Yann N. Dauphin · 17 Feb 2023
31. A Stability Analysis of Fine-Tuning a Pre-Trained Model
    Z. Fu, Anthony Man-Cho So, Nigel Collier · 24 Jan 2023
32. An SDE for Modeling SAM: Theory and Insights
    Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto, F. Proske, Hans Kersting, Aurélien Lucchi · 19 Jan 2023
33. Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
    Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma (AI4CE) · 25 Oct 2022
34. The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima
    Peter L. Bartlett, Philip M. Long, Olivier Bousquet · 04 Oct 2022
35. Understanding Gradient Descent on Edge of Stability in Deep Learning
    Sanjeev Arora, Zhiyuan Li, A. Panigrahi (MLT) · 19 May 2022
36. What Happens after SGD Reaches Zero Loss? --A Mathematical Framework
    Zhiyuan Li, Tianhao Wang, Sanjeev Arora (MLT) · 13 Oct 2021
37. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
    N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang (ODL) · 15 Sep 2016