ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

When Do Flat Minima Optimizers Work?
arXiv:2202.00661 · 1 February 2022
Jean Kaddour, Linqing Liu, Ricardo M. A. Silva, Matt J. Kusner · ODL

Papers citing "When Do Flat Minima Optimizers Work?"

50 / 58 papers shown
Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning
Yeoreum Lee, Jinwook Jung, Sungyong Baik · MoMe · 20 Apr 2025

SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting
Linqi Yang, Xiongwei Zhao, Qihao Sun, Ke Wang, Ao Chen, Peng Kang · 3DGS · 07 Mar 2025
Evolutionary Optimization of Model Merging Recipes
Takuya Akiba, Makoto Shing, Yujin Tang, Qi Sun, David Ha · MoMe · 28 Jan 2025

Computational Analysis of Yaredawi YeZema Silt in Ethiopian Orthodox Tewahedo Church Chants
Mequanent Argaw Muluneh, Yan-Tsung Peng, Li Su · 25 Dec 2024

Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes
Aodi Li, Liansheng Zhuang, Xiao Long, Minghong Yao, Shafei Wang · 18 Dec 2024

Towards Understanding the Role of Sharpness-Aware Minimization Algorithms for Out-of-Distribution Generalization
Samuel Schapiro, Han Zhao · 06 Dec 2024
Model Fusion through Bayesian Optimization in Language Model Fine-Tuning
Chaeyun Jang, Hyungi Lee, Jungtaek Kim, Juho Lee · MoMe · 11 Nov 2024

Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training
Zhanpeng Zhou, Mingze Wang, Yuchen Mao, Bingrui Li, Junchi Yan · AAML · 14 Oct 2024

Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance
Haiquan Lu, Xiaotian Liu, Yefan Zhou, Qunli Li, Kurt Keutzer, Michael W. Mahoney, Yujun Yan, Huanrui Yang, Yaoqing Yang · 17 Jul 2024
A Universal Class of Sharpness-Aware Minimization Algorithms
B. Tahmasebi, Ashkan Soleymani, Dara Bahri, Stefanie Jegelka, P. Jaillet · AAML · 06 Jun 2024

Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning
Jacob Mitchell Springer, Vaishnavh Nagarajan, Aditi Raghunathan · 30 May 2024

Exploring and Exploiting the Asymmetric Valley of Deep Neural Networks
Xin-Chun Li, Jinli Tang, Bo Zhang, Lan Li, De-Chuan Zhan · 21 May 2024

Generalization Measures for Zero-Shot Cross-Lingual Transfer
Saksham Bassi, Duygu Ataman, Kyunghyun Cho · 24 Apr 2024
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms
Toki Tahmid Inan, Mingrui Liu, Amarda Shehu · 01 Mar 2024

Mirror Gradient: Towards Robust Multimodal Recommender Systems via Exploring Flat Local Minima
Shan Zhong, Zhongzhan Huang, Daifeng Li, Wushao Wen, Jinghui Qin, Liang Lin · 17 Feb 2024

Switch EMA: A Free Lunch for Better Flatness and Sharpness
Siyuan Li, Zicheng Liu, Juanxi Tian, Ge Wang, Zedong Wang, ..., Cheng Tan, Tao Lin, Yang Liu, Baigui Sun, Stan Z. Li · 14 Feb 2024
Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy
Chengli Tan, Jiangshe Zhang, Junmin Liu, Yicheng Wang, Yunda Hao · AAML · 14 Jan 2024

Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin, Dongyeop Lee, Maksym Andriushchenko, Namhoon Lee · AAML · 29 Nov 2023

Seeking Flat Minima with Mean Teacher on Semi- and Weakly-Supervised Domain Generalization for Object Detection
Ryosuke Furuta, Yoichi Sato · 30 Oct 2023

TRAM: Bridging Trust Regions and Sharpness Aware Minimization
Tom Sherborne, Naomi Saphra, Pradeep Dasigi, Hao Peng · 05 Oct 2023
RSAM: Learning on manifolds with Riemannian Sharpness-aware Minimization
Kenneth Allen, Hoang-Phi Nguyen, Tung Pham, Ming-Jun Lai, Mehrtash Harandi, Dinh Q. Phung, Trung Le · AAML · 29 Sep 2023

Probabilistic Weight Fixing: Large-scale training of neural network weight uncertainties for quantization
Christopher Subia-Waud, S. Dasmahapatra · UQCV, MQ · 24 Sep 2023

Exploring Flat Minima for Domain Generalization with Large Learning Rates
Jian Zhang, Lei Qi, Yinghuan Shi, Yang Gao · 12 Sep 2023

No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour, Oscar Key, Piotr Nawrot, Pasquale Minervini, Matt J. Kusner · 12 Jul 2023
FAM: Relative Flatness Aware Minimization
Linara Adilova, Amr Abourayya, Jianning Li, Amin Dada, Henning Petzka, Jan Egger, Jens Kleesiek, Michael Kamp · ODL · 05 Jul 2023

Adaptive Sharpness-Aware Pruning for Robust Sparse Networks
Anna Bair, Hongxu Yin, Maying Shen, Pavlo Molchanov, J. Álvarez · 25 Jun 2023

PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning
Hojoon Lee, Hanseul Cho, Hyunseung Kim, Daehoon Gwak, Joonkee Kim, Jaegul Choo, Se-Young Yun, Chulhee Yun · OffRL · 19 Jun 2023

Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima
Dongkuk Si, Chulhee Yun · 16 Jun 2023
The Split Matters: Flat Minima Methods for Improving the Performance of GNNs
N. Lell, A. Scherp · 15 Jun 2023

Gradient Ascent Post-training Enhances Language Model Generalization
Dongkeun Yoon, Joel Jang, Sungdong Kim, Minjoon Seo · VLM, AI4CE · 12 Jun 2023

Differentially Private Sharpness-Aware Training
Jinseong Park, Hoki Kim, Yujin Choi, Jaewook Lee · 09 Jun 2023

Normalization Layers Are All That Sharpness-Aware Minimization Needs
Maximilian Mueller, Tiffany J. Vlaar, David Rolnick, Matthias Hein · 07 Jun 2023

Optimal Transport Model Distributional Robustness
Van-Anh Nguyen, Trung Le, Anh Tuan Bui, Thanh-Toan Do, Dinh Q. Phung · OOD · 07 Jun 2023
Sharpness-Aware Minimization Revisited: Weighted Sharpness as a Regularization Term
Yun Yue, Jiadi Jiang, Zhiling Ye, Ni Gao, Yongchao Liu, Kecheng Zhang · MLAU, ODL · 25 May 2023

How to escape sharp minima with random perturbations
Kwangjun Ahn, Ali Jadbabaie, S. Sra · ODL · 25 May 2023

The MiniPile Challenge for Data-Efficient Language Models
Jean Kaddour · MoE, ALM · 17 Apr 2023

Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging
Aarne Talman, H. Çelikkanat, Sami Virpioja, Markus Heinonen, Jörg Tiedemann · BDL, UQCV · 10 Apr 2023
Going Further: Flatness at the Rescue of Early Stopping for Adversarial Example Transferability
Martin Gubri, Maxime Cordy, Yves Le Traon · AAML · 05 Apr 2023

Spawrious: A Benchmark for Fine Control of Spurious Correlation Biases
Aengus Lynch, G. Dovonon, Jean Kaddour, Ricardo M. A. Silva · 09 Mar 2023

On Statistical Properties of Sharpness-Aware Minimization: Provable Guarantees
Kayhan Behdin, Rahul Mazumder · 23 Feb 2023

Why is parameter averaging beneficial in SGD? An objective smoothing perspective
Atsushi Nitanda, Ryuhei Kikuchi, Shugo Maeda, Denny Wu · FedML · 18 Feb 2023

Flat Seeking Bayesian Neural Networks
Van-Anh Nguyen, L. Vuong, Hoang Phan, Thanh-Toan Do, Dinh Q. Phung, Trung Le · BDL · 06 Feb 2023
An SDE for Modeling SAM: Theory and Insights
Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto, F. Proske, Hans Kersting, Aurelien Lucchi · 19 Jan 2023

Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging
Peng Lu, I. Kobyzev, Mehdi Rezagholizadeh, Ahmad Rashid, A. Ghodsi, Philippe Langlais · MoMe · 12 Dec 2022

Mechanistic Mode Connectivity
Ekdeep Singh Lubana, Eric J. Bigelow, Robert P. Dick, David M. Krueger, Hidenori Tanaka · 15 Nov 2022

Diverse Weight Averaging for Out-of-Distribution Generalization
Alexandre Ramé, Matthieu Kirchmeyer, Thibaud Rahier, A. Rakotomamonjy, Patrick Gallinari, Matthieu Cord · OOD · 19 May 2022
Stochastic Weight Averaging Revisited
Hao Guo, Jiyong Jin, B. Liu · 03 Jan 2022

Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri, H. Mobahi, Yi Tay · 16 Oct 2021

Relation Prediction as an Auxiliary Training Objective for Improving Multi-Relational Graph Representations
Yihong Chen, Pasquale Minervini, Sebastian Riedel, Pontus Stenetorp · 06 Oct 2021

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein · 29 Sep 2021