Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2203.05482
Cited By
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
10 March 2022
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
Ari S. Morcos
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
MoMe
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time"
50 / 667 papers shown
Title
QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration
HamidReza Imani
Jiaxin Peng
Peiman Mohseni
Abdolah Amirany
Tarek A. El-Ghazawi
MoE
21
0
0
10 May 2025
Bielik 11B v2 Technical Report
Krzysztof Ociepa
Łukasz Flis
Krzysztof Wróbel
Adrian Gwoździej
Remigiusz Kinas
22
0
0
05 May 2025
EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning
Lingxiao Kong
Cong Yang
Susanne Neufang
Oya Beyan
Zeyd Boukhers
OffRL
22
0
0
05 May 2025
Embedding based retrieval for long tail search queries in ecommerce
Akshay Kekuda
Yuyang Zhang
Arun Udayashankar
RALM
27
0
0
03 May 2025
Adaptive Helpfulness-Harmlessness Alignment with Preference Vectors
Ren-Wei Liang
Chin-Ting Hsu
Chan-Hung Yu
Saransh Agrawal
Shih-Cheng Huang
Shang-Tse Chen
Kuan-Hao Huang
Shao-Hua Sun
76
0
0
27 Apr 2025
Dynamic Fisher-weighted Model Merging via Bayesian Optimization
Sanwoo Lee
Jiahao Liu
Qifan Wang
J. Wang
Xunliang Cai
Yunfang Wu
MoMe
62
0
0
26 Apr 2025
Dream-Box: Object-wise Outlier Generation for Out-of-Distribution Detection
Brian K. S. Isaac-Medina
T. Breckon
OODD
65
0
0
25 Apr 2025
Active Few-Shot Learning for Vertex Classification Starting from an Unlabeled Dataset
Felix Burr
Marcel Hoffmann
A. Scherp
SSL
82
0
0
25 Apr 2025
A Model Zoo on Phase Transitions in Neural Networks
Konstantin Schurholt
Léo Meynent
Yefan Zhou
Haiquan Lu
Yaoqing Yang
Damian Borth
58
0
0
25 Apr 2025
PatientDx: Merging Large Language Models for Protecting Data-Privacy in Healthcare
José G. Moreno
Jesus Lovon
M'Rick Robin-Charlet
Christine Damase-Michel
L. Tamine
MoMe
LM&MA
48
0
0
24 Apr 2025
Parameter-Efficient Checkpoint Merging via Metrics-Weighted Averaging
Shi Jie Yu
Sehyun Choi
MoMe
45
0
0
23 Apr 2025
Param
Δ
Δ
Δ
for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost
Sheng Cao
Mingrui Wu
Karthik Prasad
Yuandong Tian
Zechun Liu
MoMe
74
0
0
23 Apr 2025
Trillion 7B Technical Report
Sungjun Han
Juyoung Suk
Suyeong An
Hyungguk Kim
Kyuseok Kim
Wonsuk Yang
Seungtaek Choi
Jamin Shin
36
0
0
21 Apr 2025
EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models
Ziwen Xu
Shuxun Wang
Kewei Xu
Haoming Xu
Mengru Wang
Xinle Deng
Yunzhi Yao
Guozhou Zheng
H. Chen
Ningyu Zhang
KELM
LLMSV
67
0
0
21 Apr 2025
Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning
Yeoreum Lee
Jinwook Jung
Sungyong Baik
MoMe
40
0
0
20 Apr 2025
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs
Yan Yang
Yixia Li
Hongru Wang
Xuetao Wei
Jianqiao Yu
Yun-Nung Chen
Guanhua Chen
MoMe
24
0
0
17 Apr 2025
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Rui Dai
Sile Hu
Xu Shen
Yonggang Zhang
Xinmei Tian
Jieping Ye
MoMe
42
2
0
15 Apr 2025
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
Hongkang Li
Yihua Zhang
Shuai Zhang
M. Wang
Sijia Liu
Pin-Yu Chen
MoMe
55
2
0
15 Apr 2025
Efficient Multi-Task Modeling through Automated Fusion of Trained Models
Jingxuan Zhou
Weidong Bao
Ji Wang
Zhengyi Zhong
Dayu Zhang
MoMe
31
0
0
14 Apr 2025
The Impact of Model Zoo Size and Composition on Weight Space Learning
Damian Falk
Konstantin Schurholt
Damian Borth
32
0
0
14 Apr 2025
A Model Zoo of Vision Transformers
Damian Falk
Léo Meynent
Florence Pfammatter
Konstantin Schurholt
Damian Borth
32
0
0
14 Apr 2025
Metropolis-Hastings Captioning Game: Knowledge Fusion of Vision Language Models via Decentralized Bayesian Inference
Yuta Matsui
Ryosuke Yamaki
Ryo Ueda
Seitaro Shinagawa
Tadahiro Taniguchi
MLLM
31
1
0
13 Apr 2025
Exploring Synergistic Ensemble Learning: Uniting CNNs, MLP-Mixers, and Vision Transformers to Enhance Image Classification
Mk Bashar
Ocean Monjur
Samia Islam
Mohammad Galib Shams
Niamul Quader
UQCV
29
0
0
12 Apr 2025
FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion
Longguang Zhong
Fanqi Wan
Ziyi Yang
Guosheng Liang
Tianyuan Shi
Xiaojun Quan
MoMe
57
0
0
09 Apr 2025
FedMerge: Federated Personalization via Model Merging
Shutong Chen
Tianyi Zhou
Guodong Long
Jing Jiang
Chengqi Zhang
FedML
MoMe
47
0
0
09 Apr 2025
Exact Unlearning of Finetuning Data via Model Merging at Scale
Kevin Kuo
Amrith Rajagopal Setlur
Kartik Srinivas
Aditi Raghunathan
Virginia Smith
MoMe
CLL
MU
45
0
0
06 Apr 2025
Efficient Model Editing with Task-Localized Sparse Fine-tuning
Leonardo Iurada
Marco Ciccone
Tatiana Tommasi
KELM
MoMe
40
0
0
03 Apr 2025
BECAME: BayEsian Continual Learning with Adaptive Model MErging
Mei Li
Yuxiang Lu
Qinyan Dai
Suizhi Huang
Yue Ding
Hongtao Lu
CLL
MoMe
44
0
0
03 Apr 2025
Enhancing Image Resolution of Solar Magnetograms: A Latent Diffusion Model Approach
Francesco P. Ramunno
Paolo Massa
Vitaliy Kinakh
Brandon Panos
A. Csillaghy
S. Voloshynovskiy
DiffM
53
0
0
31 Mar 2025
Reinforced Model Merging
J. N. Han
Jingwen Ye
Shunyu Liu
Haofei Zhang
Jie Song
Zunlei Feng
Mingli Song
MoMe
55
0
0
27 Mar 2025
Fusion of Graph Neural Networks via Optimal Transport
Weronika Ormaniec
Michael Vollenweider
Elisa Hoskovec
MoMe
FedML
OT
65
0
0
27 Mar 2025
Model Assembly Learning with Heterogeneous Layer Weight Merging
Yi-Kai Zhang
Jin Wang
Xu-Xiang Zhong
De-Chuan Zhan
Han-Jia Ye
MoMe
42
0
0
27 Mar 2025
Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation
Reza Qorbani
Gianluca Villani
Theodoros Panagiotakopoulos
Marc Botet Colomer
Linus Harenstam-Nielsen
...
Pier Luigi Dovesi
Jussi Karlgren
Daniel Cremers
F. Tombari
Matteo Poggi
VLM
39
0
0
27 Mar 2025
Unlocking the Value of Decentralized Data: A Federated Dual Learning Approach for Model Aggregation
Junyi Zhu
Ruicong Yao
Taha Ceritli
Savas Ozkan
Matthew B. Blaschko
Eunchung Noh
Jeongwon Min
Cho Jung Min
Mete Ozay
FedML
96
0
0
26 Mar 2025
Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging
Han Wu
Yuxuan Yao
Shuqi Liu
Zehua Liu
Xiaojin Fu
Xiongwei Han
X. Li
Hui-Ling Zhen
Tao Zhong
Mingxuan Yuan
MoMe
LRM
75
4
0
26 Mar 2025
Efficient Model Development through Fine-tuning Transfer
Pin-Jie Lin
Rishab Balasubramanian
Fengyuan Liu
Nikhil Kandpal
Tu Vu
57
0
0
25 Mar 2025
PCM : Picard Consistency Model for Fast Parallel Sampling of Diffusion Models
Junhyuk So
Jiwoong Shin
Chaeyeon Jang
Eunhyeok Park
DiffM
46
0
0
25 Mar 2025
Balanced Direction from Multifarious Choices: Arithmetic Meta-Learning for Domain Generalization
Xiran Wang
Jian Zhang
Lei Qi
Yinghuan Shi
50
0
0
23 Mar 2025
Beyond Accuracy: What Matters in Designing Well-Behaved Models?
Robin Hesse
Doğukan Bağcı
Bernt Schiele
Simone Schaub-Meyer
Stefan Roth
VLM
54
0
0
21 Mar 2025
FedAWA: Adaptive Optimization of Aggregation Weights in Federated Learning Using Client Vectors
Changlong Shi
He Zhao
Bingjie Zhang
Mingyuan Zhou
Dandan Guo
Yi Chang
37
0
0
20 Mar 2025
When Domain Generalization meets Generalized Category Discovery: An Adaptive Task-Arithmetic Driven Approach
Vaibhav Rathore
S. Bagchi
Saikat Dutta
Sarthak Mehrotra
Zsolt Kira
Biplab Banerjee
OOD
74
1
0
19 Mar 2025
Rewards Are Enough for Fast Photo-Realistic Text-to-image Generation
Yihong Luo
Tianyang Hu
Weijian Luo
Kenji Kawaguchi
Jing Tang
EGVM
70
0
0
17 Mar 2025
FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization
Hao Chen
S. Hu
Wayne Luk
Timothy M. Hospedales
Hongxiang Fan
MoMe
67
0
0
16 Mar 2025
Enhanced Soups for Graph Neural Networks
Joseph Zuber
Aishwarya Sarkar
Joseph Jennings
Ali Jannesari
40
0
0
14 Mar 2025
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Mari Ashiga
Wei Jie
Fan Wu
Vardan K. Voskanyan
Fateme Dinmohammadi
P. Brookes
Jingzhi Gong
Zheng Wang
38
0
0
13 Mar 2025
From Task-Specific Models to Unified Systems: A Review of Model Merging Approaches
Wei Ruan
Tianze Yang
Y. Zhou
Tianming Liu
Jin Lu
MoMe
88
0
0
13 Mar 2025
Charting and Navigating Hugging Face's Model Atlas
Eliahu Horwitz
Nitzan Kurer
Jonathan Kahana
Liel Amar
Yedid Hoshen
31
0
0
13 Mar 2025
ForAug: Recombining Foregrounds and Backgrounds to Improve Vision Transformer Training with Bias Mitigation
Tobias Christian Nauen
Brian B. Moser
Federico Raue
Stanislav Frolov
Andreas Dengel
ViT
50
0
0
12 Mar 2025
Robust Multi-Objective Controlled Decoding of Large Language Models
Seongho Son
William Bankes
Sangwoong Yoon
Shyam Sundhar Ramesh
Xiaohang Tang
Ilija Bogunovic
39
0
0
11 Mar 2025
Modular Customization of Diffusion Models via Blockwise-Parameterized Low-Rank Adaptation
Mingkang Zhu
Xi Chen
Z. Wang
Bei Yu
Hengshuang Zhao
Jiaya Jia
MoMe
50
0
0
11 Mar 2025
1
2
3
4
...
12
13
14
Next