Sharpness-Aware Minimization for Efficiently Improving Generalization

3 October 2020
Pierre Foret
Ariel Kleiner
H. Mobahi
Behnam Neyshabur
    AAML

Papers citing "Sharpness-Aware Minimization for Efficiently Improving Generalization"

50 / 867 papers shown
Bootstrap Generalization Ability from Loss Landscape Perspective
Huanran Chen
Shitong Shao
Ziyi Wang
Zirui Shang
Jin Chen
Xiaofeng Ji
Xinxiao Wu
OOD
61
17
0
18 Sep 2022
Towards Bridging the Performance Gaps of Joint Energy-based Models
Xiulong Yang
Qing Su
Shihao Ji
VLM
8
12
0
16 Sep 2022
Combining Metric Learning and Attention Heads For Accurate and Efficient Multilabel Image Classification
K. Prokofiev
V. Sovrasov
VLM
23
9
0
14 Sep 2022
Communication-Efficient and Privacy-Preserving Feature-based Federated Transfer Learning
Feng Wang
M. C. Gursoy
Senem Velipasalar
14
2
0
12 Sep 2022
Unraveling the Connections between Privacy and Certified Robustness in Federated Learning Against Poisoning Attacks
Chulin Xie
Yunhui Long
Pin-Yu Chen
Qinbin Li
Arash Nourian
Sanmi Koyejo
Bo Li
FedML
35
13
0
08 Sep 2022
Investigating the Impact of Model Misspecification in Neural Simulation-based Inference
Patrick W Cannon
Daniel Ward
Sebastian M. Schmon
22
34
0
05 Sep 2022
ASTra: A Novel Algorithm-Level Approach to Imbalanced Classification
David Twomey
D. Gorse
8
0
0
04 Sep 2022
A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective
Chanwoo Park
Sangdoo Yun
Sanghyuk Chun
AAML
21
32
0
21 Aug 2022
Self-Supervised Multimodal Fusion Transformer for Passive Activity Recognition
Armand K. Koupai
M. J. Bocus
Raúl Santos-Rodríguez
Robert Piechocki
Ryan McConville
ViT
30
9
0
15 Aug 2022
Model Generalization: A Sharpness Aware Optimization Perspective
Jozef Marus Coldenhoff
Chengkun Li
Yurui Zhu
13
2
0
14 Aug 2022
Teacher Guided Training: An Efficient Framework for Knowledge Transfer
Manzil Zaheer
A. S. Rawat
Seungyeon Kim
Chong You
Himanshu Jain
Andreas Veit
Rob Fergus
Surinder Kumar
VLM
16
2
0
14 Aug 2022
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Xingyu Xie
Pan Zhou
Huan Li
Zhouchen Lin
Shuicheng Yan
ODL
35
148
0
13 Aug 2022
Deep is a Luxury We Don't Have
Ahmed Taha
Yen Nhi Truong Vu
Brent Mombourquette
Thomas P. Matthews
Jason Su
Sadanand Singh
ViT
MedIm
20
2
0
11 Aug 2022
Maintaining Performance with Less Data
Dominic Sanderson
Tatiana Kalganova
33
1
0
03 Aug 2022
Understanding Adversarial Robustness of Vision Transformers via Cauchy Problem
Zheng Wang
Wenjie Ruan
ViT
36
8
0
01 Aug 2022
Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park
Yeongsang Jang
Eunhyeok Park
MQ
14
1
0
31 Jul 2022
CrAM: A Compression-Aware Minimizer
Alexandra Peste
Adrian Vladu
Eldar Kurtic
Christoph H. Lampert
Dan Alistarh
24
8
0
28 Jul 2022
LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity
Martin Gubri
Maxime Cordy
Mike Papadakis
Yves Le Traon
Koushik Sen
AAML
27
51
0
26 Jul 2022
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li
Zixuan Wang
Jian Li
19
42
0
26 Jul 2022
On the benefits of non-linear weight updates
Paul Norridge
18
0
0
25 Jul 2022
Affective Behavior Analysis using Action Unit Relation Graph and Multi-task Cross Attention
Dang-Khanh Nguyen
Sudarshan Pant
Ngoc-Huynh Ho
Gueesang Lee
Soo-Hyung Kim
Hyung-Jeong Yang
CVBM
11
5
0
21 Jul 2022
HSE-NN Team at the 4th ABAW Competition: Multi-task Emotion Recognition and Learning from Synthetic Images
Andrey V. Savchenko
CVBM
14
7
0
19 Jul 2022
Progress and limitations of deep networks to recognize objects in unusual poses
Amro Abbas
Stéphane Deny
OOD
AAML
15
17
0
16 Jul 2022
Multi-modal Depression Estimation based on Sub-attentional Fusion
P. Wei
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
17
37
0
13 Jul 2022
The alignment property of SGD noise and how it helps select flat minima: A stability analysis
Lei Wu
Mingze Wang
Weijie Su
MLT
22
31
0
06 Jul 2022
An Empirical Study of Implicit Regularization in Deep Offline RL
Çağlar Gülçehre
Srivatsan Srinivasan
Jakub Sygnowski
Georg Ostrovski
Mehrdad Farajtabar
Matt Hoffman
Razvan Pascanu
Arnaud Doucet
OffRL
14
16
0
05 Jul 2022
PoF: Post-Training of Feature Extractor for Improving Generalization
Ikuro Sato
Ryota Yamada
Masayuki Tanaka
Nakamasa Inoue
Rei Kawakami
11
2
0
05 Jul 2022
Explanation-based Counterfactual Retraining(XCR): A Calibration Method for Black-box Models
Liu Zhendong
Wenyu Jiang
Yan Zhang
Chongjun Wang
CML
6
0
0
22 Jun 2022
On the Maximum Hessian Eigenvalue and Generalization
Simran Kaur
Jérémy E. Cohen
Zachary Chase Lipton
21
41
0
21 Jun 2022
When Does Re-initialization Work?
Sheheryar Zaidi
Tudor Berariu
Hyunjik Kim
J. Bornschein
Claudia Clopath
Yee Whye Teh
Razvan Pascanu
40
10
0
20 Jun 2022
Sparse Double Descent: Where Network Pruning Aggravates Overfitting
Zhengqi He
Zeke Xie
Quanzhi Zhu
Zengchang Qin
69
27
0
17 Jun 2022
Revisiting Self-Distillation
M. Pham
Minsu Cho
Ameya Joshi
C. Hegde
18
22
0
17 Jun 2022
Methods for Estimating and Improving Robustness of Language Models
Michal Stefánik
8
1
0
16 Jun 2022
A Closer Look at Smoothness in Domain Adversarial Training
Harsh Rangwani
Sumukh K Aithal
Mayank Mishra
Arihant Jain
R. Venkatesh Babu
27
119
0
16 Jun 2022
Action Spotting using Dense Detection Anchors Revisited: Submission to the SoccerNet Challenge 2022
J. C. V. Soares
Avijit Shah
29
14
0
15 Jun 2022
Efficient Adaptive Ensembling for Image Classification
A. Bruno
Davide Moroni
M. Martinelli
28
18
0
15 Jun 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu
Zhiyuan Li
Sanjeev Arora
FAtt
37
69
0
14 Jun 2022
Towards Understanding Sharpness-Aware Minimization
Maksym Andriushchenko
Nicolas Flammarion
AAML
26
133
0
13 Jun 2022
Fisher SAM: Information Geometry and Sharpness Aware Minimisation
Minyoung Kim
Da Li
S. Hu
Timothy M. Hospedales
AAML
14
68
0
10 Jun 2022
Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning
Momin Abbas
Quan-Wu Xiao
Lisha Chen
Pin-Yu Chen
Tianyi Chen
21
78
0
08 Jun 2022
Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees
Haotian Ju
Dongyue Li
Hongyang R. Zhang
35
28
0
06 Jun 2022
Generalized Federated Learning via Sharpness Aware Minimization
Zhe Qu
Xingyu Li
Rui Duan
Yaojiang Liu
Bo Tang
Zhuo Lu
FedML
20
131
0
06 Jun 2022
Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Jun Li
Junyu Chen
Yucheng Tang
Ce Wang
Bennett A. Landman
S. K. Zhou
ViT
OOD
MedIm
21
20
0
02 Jun 2022
Beyond accuracy: generalization properties of bio-plausible temporal credit assignment rules
Yuhan Helena Liu
Arna Ghosh
Blake A. Richards
E. Shea-Brown
Guillaume Lajoie
21
9
0
02 Jun 2022
Rotate the ReLU to implicitly sparsify deep networks
Nancy Nayak
Sheetal Kalyani
12
0
0
01 Jun 2022
A comparative study between vision transformers and CNNs in digital pathology
Luca Deininger
Bernhard Stimpel
Anil Yüce
Samaneh Abbasi-Sureshjani
Simon Schönenberger
P. Ocampo
Konstanty Korski
F. Gaire
ViT
MedIm
17
30
0
01 Jun 2022
Anti-virus Autobots: Predicting More Infectious Virus Variants for Pandemic Prevention through Deep Learning
Glenda Tan Hui En
K. Erhn
Bingquan Shen
18
0
0
30 May 2022
WaveMix: A Resource-efficient Neural Network for Image Analysis
Pranav Jeevan
Kavitha Viswanathan
S. AnanduA
A. Sethi
15
20
0
28 May 2022
Sharpness-Aware Training for Free
Jiawei Du
Daquan Zhou
Jiashi Feng
Vincent Y. F. Tan
Joey Tianyi Zhou
AAML
25
92
0
27 May 2022
An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems
Andrea Gesmundo
J. Dean
31
23
0
25 May 2022