Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1803.05407
Cited By
Averaging Weights Leads to Wider Optima and Better Generalization
14 March 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML
MoMe
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Averaging Weights Leads to Wider Optima and Better Generalization"
50 / 295 papers shown
Title
QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration
HamidReza Imani
Jiaxin Peng
Peiman Mohseni
Abdolah Amirany
Tarek A. El-Ghazawi
MoE
26
0
0
10 May 2025
SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer
Young-Hu Park
R.-H. Park
Hyung-Min Park
49
0
0
07 May 2025
Transferable Adversarial Attacks on Black-Box Vision-Language Models
Kai Hu
Weichen Yu
L. Zhang
Alexander Robey
Andy Zou
Chengming Xu
Haoqi Hu
Matt Fredrikson
AAML
VLM
54
0
0
02 May 2025
CAMeL: Cross-modality Adaptive Meta-Learning for Text-based Person Retrieval
Hang Yu
Jiahao Wen
Zhedong Zheng
46
0
0
26 Apr 2025
A Model Zoo on Phase Transitions in Neural Networks
Konstantin Schurholt
Léo Meynent
Yefan Zhou
Haiquan Lu
Yaoqing Yang
Damian Borth
63
0
0
25 Apr 2025
Param
Δ
Δ
Δ
for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost
Sheng Cao
Mingrui Wu
Karthik Prasad
Yuandong Tian
Zechun Liu
MoMe
80
0
0
23 Apr 2025
Federated EndoViT: Pretraining Vision Transformers via Federated Learning on Endoscopic Image Collections
Max Kirchner
Alexander C. Jenke
S. Bodenstedt
F. Kolbinger
Oliver Saldanha
Jakob N. Kather
M. Wagner
Stefanie Speidel
FedML
MedIm
69
0
0
23 Apr 2025
Weakly Semi-supervised Whole Slide Image Classification by Two-level Cross Consistency Supervision
Linhao Qu
Shiman Li
Xiaoyuan Luo
Shaolei Liu
Qinhao Guo
Manning Wang
Zhijian Song
39
0
0
16 Apr 2025
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
Hongkang Li
Yihua Zhang
Shuai Zhang
M. Wang
Sijia Liu
Pin-Yu Chen
MoMe
63
2
0
15 Apr 2025
Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos
Jiaheng Zhou
Yanfeng Zhou
Wei Fang
Yuxing Tang
Le Lu
Ge Yang
Mamba
187
0
0
26 Mar 2025
FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization
Hao Chen
S. Hu
Wayne Luk
Timothy M. Hospedales
Hongxiang Fan
MoMe
72
0
0
16 Mar 2025
SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting
Linqi Yang
Xiongwei Zhao
Qihao Sun
Ke Wang
Ao Chen
Peng Kang
3DGS
76
0
0
07 Mar 2025
Gradient-Guided Annealing for Domain Generalization
Aristotelis Ballas
Christos Diou
OOD
139
0
0
27 Feb 2025
HALO: Robust Out-of-Distribution Detection via Joint Optimisation
Hugo Lyons Keenan
S. Erfani
Christopher Leckie
OODD
199
0
0
27 Feb 2025
Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation
Qiuming Zhao
Guangzhi Sun
Chao Zhang
Mingxing Xu
Thomas Fang Zheng
MoMe
VLM
134
0
0
24 Feb 2025
Robust Concept Erasure Using Task Vectors
Minh Pham
Kelly O. Marshall
Chinmay Hegde
Niv Cohen
115
17
0
21 Feb 2025
LEAD: Large Foundation Model for EEG-Based Alzheimer's Disease Detection
Yihe Wang
Nan Huang
Nadia Mammone
Marco Cecchi
Xiang Zhang
72
0
0
02 Feb 2025
Task Arithmetic in Trust Region: A Training-Free Model Merging Approach to Navigate Knowledge Conflicts
Wenju Sun
Qingyong Li
Wen Wang
Yangli-ao Geng
Boyang Li
36
2
0
28 Jan 2025
Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers
Xinyu Tang
Xiaolei Wang
Wayne Xin Zhao
Siyuan Lu
Yaliang Li
Ji-Rong Wen
LRM
51
13
0
28 Jan 2025
Training-free Heterogeneous Model Merging
Zhengqi Xu
Han Zheng
Jie Song
Li Sun
Mingli Song
MoMe
68
1
0
03 Jan 2025
Test-Time Adaptation in Point Clouds: Leveraging Sampling Variation with Weight Averaging
Ali Bahri
Moslem Yazdanpanah
Mehrdad Noori
Sahar Dastani Oghani
Milad Cheraghalikhani
David Osowiech
Farzad Beizaee
G. A. V. Hakim
Ismail ben Ayed
Christian Desrosiers
3DPC
TTA
43
0
0
31 Dec 2024
Two Heads Are Better Than One: Averaging along Fine-Tuning to Improve Targeted Transferability
Hui Zeng
Sanshuai Cui
Biwei Chen
Anjie Peng
AAML
32
0
0
31 Dec 2024
Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes
Aodi Li
Liansheng Zhuang
Xiao Long
Minghong Yao
Shafei Wang
172
0
0
18 Dec 2024
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Phillips Foster
Sham Kakade
72
8
0
29 Oct 2024
Subgraph Aggregation for Out-of-Distribution Generalization on Graphs
Bowen Liu
Haoyang Li
Shuning Wang
Shuo Nie
Shanghang Zhang
OODD
CML
64
0
0
29 Oct 2024
Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace
Jinluan Yang
A. Tang
Didi Zhu
Zhengyu Chen
Li Shen
Fei Wu
MoMe
AAML
52
3
0
17 Oct 2024
Sampling from Bayesian Neural Network Posteriors with Symmetric Minibatch Splitting Langevin Dynamics
Daniel Paulin
P. Whalley
Neil K. Chada
B. Leimkuhler
BDL
41
4
0
14 Oct 2024
Uncovering, Explaining, and Mitigating the Superficial Safety of Backdoor Defense
Rui Min
Zeyu Qin
Nevin L. Zhang
Li Shen
Minhao Cheng
AAML
31
4
0
13 Oct 2024
QT-DoG: Quantization-aware Training for Domain Generalization
Saqib Javed
Hieu Le
Mathieu Salzmann
OOD
MQ
26
1
0
08 Oct 2024
Wolf2Pack: The AutoFusion Framework for Dynamic Parameter Fusion
Bowen Tian
Songning Lai
Yutao Yue
MoMe
30
0
0
08 Oct 2024
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
Lucas Bandarkar
Benjamin Muller
Pritish Yuvraj
Rui Hou
Nayan Singhal
Hongjiang Lv
Bing-Quan Liu
KELM
LRM
MoMe
42
3
0
02 Oct 2024
Mastering Chess with a Transformer Model
Daniel Monroe
The Leela Chess Zero Team
19
3
0
18 Sep 2024
Deep Companion Learning: Enhancing Generalization Through Historical Consistency
Ruizhao Zhu
Venkatesh Saligrama
FedML
23
0
0
26 Jul 2024
ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment
Xinyi Wang
Angeliki V. Katsenou
David Bull
36
0
0
16 Jul 2024
PartImageNet++ Dataset: Scaling up Part-based Models for Robust Recognition
Xiao-Li Li
Yining Liu
Na Dong
Sitian Qin
Xiaolin Hu
36
3
0
15 Jul 2024
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences
Nikolaos Dimitriadis
Pascal Frossard
F. Fleuret
MoE
59
6
0
10 Jul 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
38
18
0
08 Jul 2024
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis
Stefan Horoi
Albert Manuel Orozco Camacho
Eugene Belilovsky
Guy Wolf
FedML
MoMe
27
9
0
07 Jul 2024
Enhancing Accuracy and Parameter-Efficiency of Neural Representations for Network Parameterization
Hongjun Choi
Jayaraman J. Thiagarajan
Ruben Glatt
Shusen Liu
40
0
0
29 Jun 2024
WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
50
13
0
24 Jun 2024
PathoWAve: A Deep Learning-based Weight Averaging Method for Improving Domain Generalization in Histopathology Images
Parastoo Sotoudeh Sharifi
M. Omair Ahmad
M. N. S. Swamy
MoMe
OOD
34
0
0
21 Jun 2024
MEAT: Median-Ensemble Adversarial Training for Improving Robustness and Generalization
Zhaozhe Hu
Jia-Li Yin
Bin Chen
Luojun Lin
Bo-Hao Chen
Ximeng Liu
AAML
28
0
0
20 Jun 2024
Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization
Jiaxin Deng
Junbiao Pang
Baochang Zhang
64
1
0
12 Jun 2024
Merging Improves Self-Critique Against Jailbreak Attacks
Victor Gallego
AAML
MoMe
36
3
0
11 Jun 2024
Scalable Bayesian Learning with posteriors
Samuel Duffield
Kaelan Donatella
Johnathan Chiu
Phoebe Klett
Daniel Simpson
BDL
UQCV
59
1
0
31 May 2024
Towards Unified Robustness Against Both Backdoor and Adversarial Attacks
Zhenxing Niu
Yuyao Sun
Qiguang Miao
Rong Jin
Gang Hua
AAML
34
6
0
28 May 2024
Uniformly Stable Algorithms for Adversarial Training and Beyond
Jiancong Xiao
Jiawei Zhang
Zhimin Luo
Asuman Ozdaglar
AAML
40
0
0
03 May 2024
Fairness Without Demographics in Human-Centered Federated Learning
Shaily Roy
Harshit Sharma
Asif Salekin
38
2
0
30 Apr 2024
Generalization Measures for Zero-Shot Cross-Lingual Transfer
Saksham Bassi
Duygu Ataman
Kyunghyun Cho
29
0
0
24 Apr 2024
DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models
Nastaran Saadati
Minh Pham
Nasla Saleem
Joshua R. Waite
Aditya Balu
Zhanhong Jiang
Chinmay Hegde
Soumik Sarkar
MoMe
35
1
0
11 Apr 2024
1
2
3
4
5
6
Next