1803.05407
Cited By
Averaging Weights Leads to Wider Optima and Better Generalization
14 March 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML
MoMe
Papers citing
"Averaging Weights Leads to Wider Optima and Better Generalization"
50 / 305 papers shown
Randomized Adversarial Training via Taylor Expansion
Gao Jin
Xinping Yi
Dengyu Wu
Ronghui Mu
Xiaowei Huang
AAML
31
34
0
19 Mar 2023
Rethinking Model Ensemble in Transfer-based Adversarial Attacks
Huanran Chen
Yichi Zhang
Yinpeng Dong
Xiao Yang
Hang Su
Junyi Zhu
AAML
26
55
0
16 Mar 2023
CAT: Causal Audio Transformer for Audio Classification
Xiaoyu Liu
Hanlin Lu
Jianbo Yuan
Xinyu Li
ViT
24
22
0
14 Mar 2023
Rethinking Confidence Calibration for Failure Prediction
Fei Zhu
Zhen Cheng
Xu-Yao Zhang
Cheng-Lin Liu
UQCV
14
39
0
06 Mar 2023
DSD²: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?
Victor Quétu
Enzo Tartaglione
24
7
0
02 Mar 2023
Average of Pruning: Improving Performance and Stability of Out-of-Distribution Detection
Zhen Cheng
Fei Zhu
Xu-Yao Zhang
Cheng-Lin Liu
MoMe
OODD
40
11
0
02 Mar 2023
DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks
Samyak Jain
Sravanti Addepalli
P. Sahu
Priyam Dey
R. Venkatesh Babu
MoMe
OOD
35
20
0
28 Feb 2023
A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking
Chang-Shu Liu
Yinpeng Dong
Wenzhao Xiang
X. Yang
Hang Su
Junyi Zhu
YueFeng Chen
Yuan He
H. Xue
Shibao Zheng
OOD
VLM
AAML
17
72
0
28 Feb 2023
Personalized Privacy-Preserving Framework for Cross-Silo Federated Learning
Van Tuan Tran
Huy Hieu Pham
Kok-Seng Wong
FedML
25
7
0
22 Feb 2023
Scalable Bayesian optimization with high-dimensional outputs using randomized prior networks
Mohamed Aziz Bhouri
M. Joly
Robert Yu
S. Sarkar
P. Perdikaris
BDL
UQCV
AI4CE
11
1
0
14 Feb 2023
Contour-based Interactive Segmentation
Danil Galeev
Polina Popenova
Anna Vorontsova
Anton Konushin
22
5
0
13 Feb 2023
Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples
Qizhang Li
Yiwen Guo
W. Zuo
Hao Chen
AAML
27
35
0
10 Feb 2023
Better Diffusion Models Further Improve Adversarial Training
Zekai Wang
Tianyu Pang
Chao Du
Min-Bin Lin
Weiwei Liu
Shuicheng Yan
DiffM
16
207
0
09 Feb 2023
Generalization in Graph Neural Networks: Improved PAC-Bayesian Bounds on Graph Diffusion
Haotian Ju
Dongyue Li
Aneesh Sharma
Hongyang R. Zhang
23
40
0
09 Feb 2023
A Survey of Deep Learning: From Activations to Transformers
Johannes Schneider
Michalis Vlachos
ViT
MedIm
AI4TS
AI4CE
46
9
0
01 Feb 2023
Cross-Architectural Positive Pairs improve the effectiveness of Self-Supervised Learning
P. Singh
Jacopo Cirrone
SSL
40
0
0
27 Jan 2023
Exploring the Effect of Multi-step Ascent in Sharpness-Aware Minimization
Hoki Kim
Jinseong Park
Yujin Choi
Woojin Lee
Jaewook Lee
15
9
0
27 Jan 2023
Model soups to increase inference without increasing compute time
Charles Dansereau
Milo Sobral
Maninder Bhogal
Mehdi Zalai
16
2
0
24 Jan 2023
Stability Analysis of Sharpness-Aware Minimization
Hoki Kim
Jinseong Park
Yujin Choi
Jaewook Lee
28
12
0
16 Jan 2023
Training trajectories, mini-batch losses and the curious role of the learning rate
Mark Sandler
A. Zhmoginov
Max Vladymyrov
Nolan Miller
ODL
13
10
0
05 Jan 2023
Do Bayesian Variational Autoencoders Know What They Don't Know?
Misha Glazunov
Apostolis Zarras
UQCV
BDL
20
5
0
29 Dec 2022
Training Integer-Only Deep Recurrent Neural Networks
V. Nia
Eyyub Sari
Vanessa Courville
M. Asgharian
MQ
42
2
0
22 Dec 2022
KL Regularized Normalization Framework for Low Resource Tasks
Neeraj Kumar
Ankur Narang
Brejesh Lall
21
1
0
21 Dec 2022
Dataless Knowledge Fusion by Merging Weights of Language Models
Xisen Jin
Xiang Ren
Daniel Preotiuc-Pietro
Pengxiang Cheng
FedML
MoMe
13
211
0
19 Dec 2022
The Underlying Correlated Dynamics in Neural Training
Rotem Turjeman
Tom Berkov
I. Cohen
Guy Gilboa
19
3
0
18 Dec 2022
Bayesian posterior approximation with stochastic ensembles
Oleksandr Balabanov
Bernhard Mehlig
H. Linander
BDL
UQCV
27
5
0
15 Dec 2022
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
Ashwinee Panda
Xinyu Tang
Saeed Mahloujifar
Vikash Sehwag
Prateek Mittal
31
11
0
08 Dec 2022
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
43
424
0
08 Dec 2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
26
52
0
02 Dec 2022
BARTSmiles: Generative Masked Language Models for Molecular Representations
Gayane Chilingaryan
Hovhannes Tamoyan
Ani Tevosyan
N. Babayan
L. Khondkaryan
Karen Hambardzumyan
Zaven Navoyan
Hrant Khachatrian
Armen Aghajanyan
SSL
27
25
0
29 Nov 2022
Indian Commercial Truck License Plate Detection and Recognition for Weighbridge Automation
Siddharth Agrawal
Keyur D. Joshi
30
4
0
23 Nov 2022
Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization
Zifa Wang
Nan Ding
Tomer Levinboim
Xi Chen
Radu Soricut
AAML
35
5
0
22 Nov 2022
REPAIR: REnormalizing Permuted Activations for Interpolation Repair
Keller Jordan
Hanie Sedghi
O. Saukh
R. Entezari
Behnam Neyshabur
MoMe
46
94
0
15 Nov 2022
Learning to Annotate Part Segmentation with Gradient Matching
Yu Yang
Xiaotian Cheng
Hakan Bilen
Xiangyang Ji
GAN
19
7
0
06 Nov 2022
Quantifying Model Uncertainty for Semantic Segmentation using Operators in the RKHS
Rishabh Singh
José C. Príncipe
UQCV
19
3
0
03 Nov 2022
Circling Back to Recurrent Models of Language
Gábor Melis
27
0
0
03 Nov 2022
AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning
Yaqing Wang
Sahaj Agarwal
Subhabrata Mukherjee
Xiaodong Liu
Jing Gao
Ahmed Hassan Awadallah
Jianfeng Gao
MoE
13
118
0
31 Oct 2022
Symmetries, flat minima, and the conserved quantities of gradient flow
Bo-Lu Zhao
I. Ganev
Robin G. Walters
Rose Yu
Nima Dehmamy
42
16
0
31 Oct 2022
Towards Generalized Few-Shot Open-Set Object Detection
Binyi Su
Hua Zhang
Jingzhi Li
Zhongjun Zhou
43
9
0
28 Oct 2022
Weight Averaging: A Simple Yet Effective Method to Overcome Catastrophic Forgetting in Automatic Speech Recognition
Steven Vander Eeckt
Hugo Van hamme
CLL
MoMe
58
14
0
27 Oct 2022
Sufficient Invariant Learning for Distribution Shift
Taero Kim
Sungjun Lim
Kyungwoo Song
OOD
19
2
0
24 Oct 2022
On the optimization and pruning for Bayesian deep learning
X. Ke
Yanan Fan
BDL
UQCV
22
1
0
24 Oct 2022
Revisiting Checkpoint Averaging for Neural Machine Translation
Yingbo Gao
Christian Herold
Zijian Yang
Hermann Ney
MoMe
23
11
0
21 Oct 2022
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
19
24
0
19 Oct 2022
Scaling Adversarial Training to Large Perturbation Bounds
Sravanti Addepalli
Samyak Jain
Gaurang Sriramanan
R. Venkatesh Babu
AAML
25
22
0
18 Oct 2022
Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models
Nikolaos Dimitriadis
P. Frossard
François Fleuret
16
25
0
18 Oct 2022
RoS-KD: A Robust Stochastic Knowledge Distillation Approach for Noisy Medical Imaging
A. Jaiswal
Kumar Ashutosh
Justin F. Rousseau
Yifan Peng
Zhangyang Wang
Ying Ding
13
9
0
15 Oct 2022
Wasserstein Barycenter-based Model Fusion and Linear Mode Connectivity of Neural Networks
A. K. Akash
Sixu Li
Nicolas García Trillos
24
12
0
13 Oct 2022
Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling
Haw-Shiuan Chang
Ruei-Yao Sun
Kathryn Ricci
Andrew McCallum
41
14
0
10 Oct 2022
Learning Across Domains and Devices: Style-Driven Source-Free Domain Adaptation in Clustered Federated Learning
Donald Shenaj
Eros Fani
Marco Toldo
Debora Caldarola
A. Tavera
Umberto Michieli
Marco Ciccone
Pietro Zanuttigh
Barbara Caputo
FedML
21
39
0
05 Oct 2022