Sharpness-Aware Minimization for Efficiently Improving Generalization


3 October 2020
Pierre Foret
Ariel Kleiner
H. Mobahi
Behnam Neyshabur
    AAML

Papers citing "Sharpness-Aware Minimization for Efficiently Improving Generalization"

50 / 867 papers shown
Problem-dependent attention and effort in neural networks with applications to image resolution and model selection
Chris Rohlfs
16
4
0
05 Jan 2022
Stochastic Weight Averaging Revisited
Hao Guo
Jiyong Jin
B. Liu
22
29
0
03 Jan 2022
Generalized Wasserstein Dice Loss, Test-time Augmentation, and Transformers for the BraTS 2021 challenge
Lucas Fidon
Suprosanna Shit
Ivan Ezhov
Johannes C. Paetzold
Sébastien Ourselin
Tom Kamiel Magda Vercauteren
ViT
MedIm
28
8
0
24 Dec 2021
FedLGA: Towards System-Heterogeneity of Federated Learning via Local Gradient Approximation
Xingyu Li
Zhe Qu
Bo Tang
Zhuo Lu
FedML
19
25
0
22 Dec 2021
Learned Queries for Efficient Local Attention
Moab Arar
Ariel Shamir
Amit H. Bermano
ViT
36
29
0
21 Dec 2021
An Empirical Investigation of the Role of Pre-training in Lifelong Learning
Sanket Vaibhav Mehta
Darshan Patil
Sarath Chandar
Emma Strubell
CLL
37
135
0
16 Dec 2021
Sharpness-Aware Minimization with Dynamic Reweighting
Wenxuan Zhou
Fangyu Liu
Huan Zhang
Muhao Chen
AAML
19
8
0
16 Dec 2021
Bootstrapping ViTs: Towards Liberating Vision Transformers from Pre-training
Haofei Zhang
Jiarui Duan
Mengqi Xue
Jie Song
Li Sun
Mingli Song
ViT
AI4CE
16
16
0
07 Dec 2021
Scaling Up Influence Functions
Andrea Schioppa
Polina Zablotskaia
David Vilar
Artem Sokolov
TDI
25
90
0
06 Dec 2021
Loss Landscape Dependent Self-Adjusting Learning Rates in Decentralized Stochastic Gradient Descent
Wei Zhang
Mingrui Liu
Yu Feng
Xiaodong Cui
Brian Kingsbury
Yuhai Tu
6
3
0
02 Dec 2021
Neuron with Steady Response Leads to Better Generalization
Qiang Fu
Lun Du
Haitao Mao
Xu Chen
Wei Fang
Shi Han
Dongmei Zhang
17
5
0
30 Nov 2021
Recurrent Vision Transformer for Solving Visual Reasoning Problems
Nicola Messina
Giuseppe Amato
F. Carrara
Claudio Gennaro
Fabrizio Falchi
ViT
LRM
17
11
0
29 Nov 2021
Sharpness-aware Quantization for Deep Neural Networks
Jing Liu
Jianfei Cai
Bohan Zhuang
MQ
27
24
0
24 Nov 2021
HERO: Hessian-Enhanced Robust Optimization for Unifying and Improving Generalization and Quantization Performance
Huanrui Yang
Xiaoxuan Yang
Neil Zhenqiang Gong
Yiran Chen
MQ
8
10
0
23 Nov 2021
Reasonable Effectiveness of Random Weighting: A Litmus Test for Multi-Task Learning
Baijiong Lin
Feiyang Ye
Yu Zhang
Ivor W. Tsang
34
93
0
20 Nov 2021
TransMorph: Transformer for unsupervised medical image registration
Junyu Chen
Eric C. Frey
Yufan He
W. Paul Segars
Ye Li
Yong Du
ViT
MedIm
29
186
0
19 Nov 2021
Diabetic Foot Ulcer Grand Challenge 2021: Evaluation and Summary
B. Cassidy
Connah Kendrick
N. Reeves
Joseph M Pappachan
C. O'Shea
D. Armstrong
Moi Hoon Yap
14
21
0
19 Nov 2021
Attention Mechanisms in Computer Vision: A Survey
Meng-Hao Guo
Tianhan Xu
Jiangjiang Liu
Zheng-Ning Liu
Peng-Tao Jiang
Tai-Jiang Mu
Song-Hai Zhang
Ralph Robert Martin
Ming-Ming Cheng
Shimin Hu
19
1,633
0
15 Nov 2021
Convolutional Nets Versus Vision Transformers for Diabetic Foot Ulcer Classification
Adrian Galdran
G. Carneiro
M. A. G. Ballester
UQCV
MedIm
11
20
0
12 Nov 2021
Constrained Instance and Class Reweighting for Robust Learning under Label Noise
Abhishek Kumar
Ehsan Amid
NoLa
25
19
0
09 Nov 2021
Improved Regularization and Robustness for Fine-tuning in Neural Networks
Dongyue Li
Hongyang R. Zhang
NoLa
49
54
0
08 Nov 2021
Exponential escape efficiency of SGD from sharp minima in non-stationary regime
Hikaru Ibayashi
Masaaki Imaizumi
26
4
0
07 Nov 2021
On the Effectiveness of Interpretable Feedforward Neural Network
Miles Q. Li
Benjamin C. M. Fung
Adel Abusitta
FaML
AI4CE
15
3
0
03 Nov 2021
Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Xiaoxin He
Fuzhao Xue
Xiaozhe Ren
Yang You
22
14
0
01 Nov 2021
CvS: Classification via Segmentation For Small Datasets
Nooshin Mojab
Philip S. Yu
J. Hallak
Darvin Yi
20
4
0
29 Oct 2021
CAP: Co-Adversarial Perturbation on Weights and Features for Improving Generalization of Graph Neural Networks
Hao Xue
Kaixiong Zhou
Tianlong Chen
Kai Guo
Xia Hu
Yi Chang
Xin Wang
AAML
18
15
0
28 Oct 2021
RoMA: Robust Model Adaptation for Offline Model-based Optimization
Sihyun Yu
Sungsoo Ahn
Le Song
Jinwoo Shin
OffRL
16
31
0
27 Oct 2021
The Efficiency Misnomer
Mostafa Dehghani
Anurag Arnab
Lucas Beyer
Ashish Vaswani
Yi Tay
32
98
0
25 Oct 2021
In Search of Probeable Generalization Measures
Jonathan Jaegerman
Khalil Damouni
M. M. Ankaralı
Konstantinos N. Plataniotis
16
2
0
23 Oct 2021
Signature-Graph Networks
Ali Hamdi
Flora D. Salim
D. Kim
Xiaojun Chang
13
1
0
22 Oct 2021
User-friendly introduction to PAC-Bayes bounds
Pierre Alquier
FedML
48
196
0
21 Oct 2021
When in Doubt, Summon the Titans: Efficient Inference with Large Models
A. S. Rawat
Manzil Zaheer
A. Menon
Amr Ahmed
Sanjiv Kumar
17
7
0
19 Oct 2021
Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri
H. Mobahi
Yi Tay
119
98
0
16 Oct 2021
Generalization Techniques Empirically Outperform Differential Privacy against Membership Inference
Jiaxiang Liu
Simon Oya
Florian Kerschbaum
MIACV
11
9
0
11 Oct 2021
Self-supervised Learning is More Robust to Dataset Imbalance
Hong Liu
Jeff Z. HaoChen
Adrien Gaidon
Tengyu Ma
OOD
SSL
31
156
0
11 Oct 2021
Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning
Danruo Deng
Guangyong Chen
Jianye Hao
Qiong Wang
Pheng-Ann Heng
CLL
AAML
14
76
0
09 Oct 2021
Nonconvex-Nonconcave Min-Max Optimization with a Small Maximization Domain
Dmitrii Ostrovskii
Babak Barazandeh
Meisam Razaviyayn
16
11
0
08 Oct 2021
Efficient Sharpness-aware Minimization for Improved Training of Neural Networks
Jiawei Du
Hanshu Yan
Jiashi Feng
Joey Tianyi Zhou
Liangli Zhen
Rick Siow Mong Goh
Vincent Y. F. Tan
AAML
105
132
0
07 Oct 2021
On the Generalization of Models Trained with SGD: Information-Theoretic Bounds and Implications
Ziqiao Wang
Yongyi Mao
FedML
MLT
32
22
0
07 Oct 2021
SIRe-Networks: Convolutional Neural Networks Architectural Extension for Information Preservation via Skip/Residual Connections and Interlaced Auto-Encoders
D. Avola
Luigi Cinque
Alessio Fagioli
G. Foresti
17
3
0
06 Oct 2021
ResNet strikes back: An improved training procedure in timm
Ross Wightman
Hugo Touvron
Hervé Jégou
AI4TS
209
487
0
01 Oct 2021
Perturbated Gradients Updating within Unit Space for Deep Learning
Ching-Hsun Tseng
Liu Cheng
Shin-Jye Lee
Xiaojun Zeng
38
5
0
01 Oct 2021
Stochastic Training is Not Necessary for Generalization
Jonas Geiping
Micah Goldblum
Phillip E. Pope
Michael Moeller
Tom Goldstein
81
72
0
29 Sep 2021
Explaining Deep Learning Representations by Tracing the Training Process
Lukas Pfahler
K. Morik
FAtt
8
2
0
13 Sep 2021
Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
Runxin Xu
Fuli Luo
Zhiyuan Zhang
Chuanqi Tan
Baobao Chang
Songfang Huang
Fei Huang
LRM
145
178
0
13 Sep 2021
Scaled ReLU Matters for Training Vision Transformers
Pichao Wang
Xue Wang
Haowen Luo
Jingkai Zhou
Zhipeng Zhou
Fan Wang
Hao Li
R. L. Jin
13
41
0
08 Sep 2021
Adaptive Few-Shot Learning PoC Ultrasound COVID-19 Diagnostic System
Michael Karnes
Shehan Perera
S. Adhikari
Alper Yilmaz
4
7
0
08 Sep 2021
Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization
Alexandre Ramé
Corentin Dancette
Matthieu Cord
OOD
38
204
0
07 Sep 2021
Adversarial Parameter Defense by Multi-Step Risk Minimization
Zhiyuan Zhang
Ruixuan Luo
Xuancheng Ren
Qi Su
Liangyou Li
Xu Sun
AAML
23
6
0
07 Sep 2021
The Impact of Reinitialization on Generalization in Convolutional Neural Networks
Ibrahim M. Alabdulmohsin
Hartmut Maennel
Daniel Keysers
AI4CE
21
20
0
01 Sep 2021