2010.01412
Cited By
Sharpness-Aware Minimization for Efficiently Improving Generalization
3 October 2020
Pierre Foret
Ariel Kleiner
H. Mobahi
Behnam Neyshabur
AAML
Papers citing
"Sharpness-Aware Minimization for Efficiently Improving Generalization"
50 / 867 papers shown
Title
Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Clara Na
Sanket Vaibhav Mehta
Emma Strubell
62
19
0
25 May 2022
TorchNTK: A Library for Calculation of Neural Tangent Kernels of PyTorch Models
A. Engel
Zhichao Wang
Anand D. Sarwate
Sutanay Choudhury
Tony Chiang
22
3
0
24 May 2022
Alleviating Robust Overfitting of Adversarial Training With Consistency Regularization
Shudong Zhang
Haichang Gao
Tianwei Zhang
Yunyi Zhou
Zihui Wu
AAML
18
3
0
24 May 2022
Training Efficient CNNS: Tweaking the Nuts and Bolts of Neural Networks for Lighter, Faster and Robust Models
Sabeesh Ethiraj
B. Bolla
17
2
0
23 May 2022
Vision Transformers in 2022: An Update on Tiny ImageNet
Ethan Huynh
ViT
31
11
0
21 May 2022
Temporally Precise Action Spotting in Soccer Videos Using Dense Detection Anchors
J. C. V. Soares
Avijit Shah
Topojoy Biswas
35
32
0
20 May 2022
Diverse Weight Averaging for Out-of-Distribution Generalization
Alexandre Ramé
Matthieu Kirchmeyer
Thibaud Rahier
A. Rakotomamonjy
Patrick Gallinari
Matthieu Cord
OOD
196
128
0
19 May 2022
Analyzing Lottery Ticket Hypothesis from PAC-Bayesian Theory Perspective
Keitaro Sakamoto
Issei Sato
28
9
0
15 May 2022
Discovering and Explaining the Representation Bottleneck of Graph Neural Networks from Multi-order Interactions
Fang Wu
Siyuan Li
Lirong Wu
Dragomir R. Radev
Stan Z. Li
27
2
0
15 May 2022
Goldilocks-curriculum Domain Randomization and Fractal Perlin Noise with Application to Sim2Real Pneumonia Lesion Detection
Takahiro Suzuki
S. Hanaoka
Issei Sato
OOD
MedIm
26
1
0
29 Apr 2022
Detecting Deepfakes with Self-Blended Images
Kaede Shiohara
T. Yamasaki
26
291
0
18 Apr 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
72
6,637
0
13 Apr 2022
Few-Shot Forecasting of Time-Series with Heterogeneous Channels
L. Brinkmeyer
Rafael Rêgo Drumond
Johannes Burchert
Lars Schmidt-Thieme
AI4TS
22
7
0
07 Apr 2022
Exploiting Explainable Metrics for Augmented SGD
Mahdi S. Hosseini
Mathieu Tuli
Konstantinos N. Plataniotis
AAML
14
3
0
31 Mar 2022
Frame-level Prediction of Facial Expressions, Valence, Arousal and Action Units for Mobile Devices
Andrey V. Savchenko
CVBM
15
30
0
25 Mar 2022
ViT-FOD: A Vision Transformer based Fine-grained Object Discriminator
Zi-Chao Zhang
Zhen-Duo Chen
Yongxin Wang
Xin Luo
Xin-Shun Xu
ViT
22
6
0
24 Mar 2022
Improving Generalization in Federated Learning by Seeking Flat Minima
Debora Caldarola
Barbara Caputo
Marco Ciccone
FedML
27
110
0
22 Mar 2022
The activity-weight duality in feed forward neural networks: The geometric determinants of generalization
Yu Feng
Yuhai Tu
MLT
75
14
0
21 Mar 2022
Randomized Sharpness-Aware Training for Boosting Computational Efficiency in Deep Learning
Yang Zhao
Hao Zhang
Xiuyuan Hu
16
9
0
18 Mar 2022
DeepAD: A Robust Deep Learning Model of Alzheimer's Disease Progression for Real-World Clinical Applications
Somaye Hashemifar
C. Iriondo
Evan Casey
Mohsen Hejrati
for Alzheimer's Disease Neuroimaging Initiative
OOD
MedIm
20
3
0
17 Mar 2022
A New Quantum CNN Model for Image Classification
Xing-Qiang Zhao
Tianlong Chen
9
0
0
16 Mar 2022
Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective
Gowthami Somepalli
Liam H. Fowl
Arpit Bansal
Ping-yeh Chiang
Yehuda Dar
Richard Baraniuk
Micah Goldblum
Tom Goldstein
13
64
0
15 Mar 2022
Surrogate Gap Minimization Improves Sharpness-Aware Training
Juntang Zhuang
Boqing Gong
Liangzhe Yuan
Yin Cui
Hartwig Adam
Nicha Dvornek
S. Tatikonda
James Duncan
Ting Liu
22
146
0
15 Mar 2022
QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
Xiuying Wei
Ruihao Gong
Yuhang Li
Xianglong Liu
F. Yu
MQ
VLM
19
166
0
11 Mar 2022
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
...
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
MoMe
54
914
1
10 Mar 2022
Adaptor: Objective-Centric Adaptation Framework for Language Models
Michal Štefánik
Vít Novotný
Nikola Groverová
Petr Sojka
27
10
0
08 Mar 2022
Flat minima generalize for low-rank matrix recovery
Lijun Ding
D. Drusvyatskiy
Maryam Fazel
Zaid Harchaoui
26
16
0
07 Mar 2022
β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search
Peng Ye
Baopu Li
Yikang Li
Tao Chen
Jiayuan Fan
Wanli Ouyang
13
101
0
03 Mar 2022
Color Space-based HoVer-Net for Nuclei Instance Segmentation and Classification
Hussam Azzuni
Muhammad Ridzuan
Min Xu
Mohammad Yaqub
38
6
0
03 Mar 2022
Towards Class-agnostic Tracking Using Feature Decorrelation in Point Clouds
Shengjing Tian
Jun Liu
Xiuping Liu
3DPC
27
4
0
28 Feb 2022
Adversarial robustness of sparse local Lipschitz predictors
Ramchandran Muthukumar
Jeremias Sulam
AAML
32
13
0
26 Feb 2022
Tackling benign nonconvexity with smoothing and stochastic gradients
Harsh Vardhan
Sebastian U. Stich
20
8
0
18 Feb 2022
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
32
465
0
14 Feb 2022
Parametric t-Stochastic Neighbor Embedding With Quantum Neural Network
Yoshiaki Kawase
K. Mitarai
Keisuke Fujii
26
5
0
09 Feb 2022
Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
Yang Zhao
Hao Zhang
Xiuyuan Hu
30
116
0
08 Feb 2022
Towards an Analytical Definition of Sufficient Data
Adam Byerly
T. Kalganova
27
4
0
07 Feb 2022
Deep Networks on Toroids: Removing Symmetries Reveals the Structure of Flat Regions in the Landscape Geometry
Fabrizio Pittorino
Antonio Ferraro
Gabriele Perugini
Christoph Feinauer
Carlo Baldassi
R. Zecchina
201
24
0
07 Feb 2022
Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data
Yaoqing Yang
Ryan Theisen
Liam Hodgkinson
Joseph E. Gonzalez
Kannan Ramchandran
Charles H. Martin
Michael W. Mahoney
86
17
0
06 Feb 2022
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models
Chen Liang
Haoming Jiang
Simiao Zuo
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
T. Zhao
17
14
0
06 Feb 2022
Learning strides in convolutional neural networks
Rachid Riad
O. Teboul
David Grangier
Neil Zeghidour
30
41
0
03 Feb 2022
Deep Hierarchy in Bandits
Joey Hong
B. Kveton
S. Katariya
Manzil Zaheer
Mohammad Ghavamzadeh
25
20
0
03 Feb 2022
When Do Flat Minima Optimizers Work?
Jean Kaddour
Linqing Liu
Ricardo M. A. Silva
Matt J. Kusner
ODL
11
58
0
01 Feb 2022
Fortuitous Forgetting in Connectionist Networks
Hattie Zhou
Ankit Vani
Hugo Larochelle
Aaron Courville
CLL
11
42
0
01 Feb 2022
ScaLA: Accelerating Adaptation of Pre-Trained Transformer-Based Language Models via Efficient Large-Batch Adversarial Noise
Minjia Zhang
U. Niranjan
Yuxiong He
23
1
0
29 Jan 2022
Weight Expansion: A New Perspective on Dropout and Generalization
Gao Jin
Xinping Yi
Pengfei Yang
Lijun Zhang
S. Schewe
Xiaowei Huang
29
5
0
23 Jan 2022
Learning to Minimize the Remainder in Supervised Learning
Yan Luo
Yongkang Wong
Mohan S. Kankanhalli
Qi Zhao
44
1
0
23 Jan 2022
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla
Jing Wang
A. Choromańska
25
34
0
20 Jan 2022
Neighborhood Region Smoothing Regularization for Finding Flat Minima In Deep Neural Networks
Yang Zhao
Hao Zhang
22
1
0
16 Jan 2022
There is a Singularity in the Loss Landscape
M. Lowell
14
0
0
12 Jan 2022
Communication-Efficient Federated Learning with Accelerated Client Gradient
Geeho Kim
Jinkyu Kim
Bohyung Han
FedML
32
11
0
10 Jan 2022