
arXiv: 2102.11600 (Cited By)
ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks

International Conference on Machine Learning (ICML), 2021
23 February 2021
Jungmin Kwon, Jeongseop Kim, Hyunseong Park, I. Choi

Papers citing "ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks"

Showing 24 of 224 citing papers.

Tackling covariate shift with node-based Bayesian neural networks
International Conference on Machine Learning (ICML), 2022
Trung Trinh, Markus Heinonen, Luigi Acerbi, Samuel Kaski
06 Jun 2022

Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Clara Na, Sanket Vaibhav Mehta, Emma Strubell
25 May 2022

Vision Transformers in 2022: An Update on Tiny ImageNet
Ethan Huynh
21 May 2022

EXACT: How to Train Your Accuracy
Pattern Recognition Letters (PR), 2022
I. Karpukhin, Stanislav Dereka, Sergey Kolesnikov
19 May 2022

Multimodal Transformer for Nursing Activity Recognition
Momal Ijaz, Renato Diaz, Chong Chen
09 Apr 2022

TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization
Computer Vision and Pattern Recognition (CVPR), 2022
Sijie Zhu, M. Shah, Chong Chen
31 Mar 2022

Improving Generalization in Federated Learning by Seeking Flat Minima
European Conference on Computer Vision (ECCV), 2022
Debora Caldarola, Barbara Caputo, Marco Ciccone
22 Mar 2022

Randomized Sharpness-Aware Training for Boosting Computational Efficiency in Deep Learning
Yang Zhao, Hao Zhang, Xiuyuan Hu
18 Mar 2022

Surrogate Gap Minimization Improves Sharpness-Aware Training
International Conference on Learning Representations (ICLR), 2022
Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Huayu Chen, Hartwig Adam, Nicha Dvornek, S. Tatikonda, James Duncan, Ting Liu
15 Mar 2022

Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
International Conference on Machine Learning (ICML), 2022
Yang Zhao, Hao Zhang, Xiuyuan Hu
08 Feb 2022

When Do Flat Minima Optimizers Work?
Neural Information Processing Systems (NeurIPS), 2022
Jean Kaddour, Linqing Liu, Ricardo M. A. Silva, Matt J. Kusner
01 Feb 2022

Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Devansh Bisla, Jing Wang, A. Choromańska
20 Jan 2022

Generalized Wasserstein Dice Loss, Test-time Augmentation, and Transformers for the BraTS 2021 challenge
Lucas Fidon, Antonio Terpin, Ivan Ezhov, Johannes C. Paetzold, Sébastien Ourselin, Tom Vercauteren
24 Dec 2021

Unsupervised Dense Information Retrieval with Contrastive Learning
Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, Edouard Grave
16 Dec 2021

Sharpness-Aware Minimization with Dynamic Reweighting
Wenxuan Zhou, Fangyu Liu, Huan Zhang, Muhao Chen
16 Dec 2021

Sharpness-aware Quantization for Deep Neural Networks
Jing Liu, Jianfei Cai, Bohan Zhuang
24 Nov 2021

DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion
Computer Vision and Pattern Recognition (CVPR), 2021
Arthur Douillard, Alexandre Ramé, Guillaume Couairon, Matthieu Cord
22 Nov 2021

Exponential escape efficiency of SGD from sharp minima in non-stationary regime
Hikaru Ibayashi, Masaaki Imaizumi
07 Nov 2021

Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri, H. Mobahi, Yi Tay
16 Oct 2021

Efficient Sharpness-aware Minimization for Improved Training of Neural Networks
Jiawei Du, Hanshu Yan, Jiashi Feng, Qiufeng Wang, Liangli Zhen, Rick Siow Mong Goh, Vincent Y. F. Tan
07 Oct 2021

Perturbated Gradients Updating within Unit Space for Deep Learning
Ching-Hsun Tseng, Liu Cheng, Shin-Jye Lee, Xiaojun Zeng
01 Oct 2021

Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability
Roman Levin, Manli Shu, Eitan Borgnia, Furong Huang, Micah Goldblum, Tom Goldstein
03 Aug 2021

A novel multi-scale loss function for classification problems in machine learning
Journal of Computational Physics (JCP), 2021
L. Berlyand, Robert Creese, P. Jabin
04 Jun 2021

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
Robin M. Schmidt, Frank Schneider, Philipp Hennig
03 Jul 2020
