2203.05482
Cited By
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
10 March 2022
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
Ari S. Morcos
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
MoMe
Papers citing
"Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time"
50 / 667 papers shown
Title
Tracking Universal Features Through Fine-Tuning and Model Merging
Niels Horn
Desmond Elliott
MoMe
24
0
0
16 Oct 2024
Agent Skill Acquisition for Large Language Models via CycleQD
So Kuroki
Taishi Nakamura
Takuya Akiba
Yujin Tang
MoMe
29
0
0
16 Oct 2024
Overcoming Domain Limitations in Open-vocabulary Segmentation
Dongjun Hwang
Seong Joon Oh
Junsuk Choe
SSeg
OOD
42
0
0
15 Oct 2024
Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
Shangbin Feng
Zifeng Wang
Yike Wang
Sayna Ebrahimi
Hamid Palangi
...
Nathalie Rauschmayr
Yejin Choi
Yulia Tsvetkov
Chen-Yu Lee
Tomas Pfister
MoMe
30
3
0
15 Oct 2024
MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer
Minghao Zhu
Zhengpu Wang
Mengxian Hu
Ronghao Dang
Xiao Lin
Xun Zhou
Chengju Liu
Qijun Chen
24
1
0
14 Oct 2024
Retrieval Instead of Fine-tuning: A Retrieval-based Parameter Ensemble for Zero-shot Learning
Pengfei Jin
Peng Shu
Sekeun Kim
Qing Xiao
S. Song
Cheng Chen
Tianming Liu
Xiang Li
Quanzheng Li
35
1
0
13 Oct 2024
DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
Wenlong Deng
Yize Zhao
V. Vakilian
Minghui Chen
Xiaoxiao Li
Christos Thrampoulidis
35
3
0
12 Oct 2024
CollabEdit: Towards Non-destructive Collaborative Knowledge Editing
Jiamu Zheng
Jinghuai Zhang
Tianyu Du
Xuhong Zhang
Jianwei Yin
Tao Lin
KELM
22
0
0
12 Oct 2024
Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation
Thomas Gauthier-Caron
Shamane Siriwardhana
Elliot Stein
Malikeh Ehghaghi
Charles Goddard
Mark McQuade
Jacob Solawetz
Maxime Labonne
MoMe
23
2
0
10 Oct 2024
How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?
Seongyun Lee
Geewook Kim
Jiyeon Kim
Hyunji Lee
Hoyeon Chang
Sue Hyun Park
Minjoon Seo
31
0
0
10 Oct 2024
PLaMo-100B: A Ground-Up Language Model Designed for Japanese Proficiency
Preferred Elements:
Kenshin Abe
Kaizaburo Chubachi
Yasuhiro Fujita
...
Yoshihiko Ozaki
Shotaro Sano
Shuji Suzuki
Tianqi Xu
Toshihiko Yanase
26
0
0
10 Oct 2024
Glider: Global and Local Instruction-Driven Expert Router
Pingzhi Li
Prateek Yadav
Jaehong Yoon
Jie Peng
Yi-Lin Sung
Mohit Bansal
Tianlong Chen
MoMe
MoE
25
1
0
09 Oct 2024
Decouple-Then-Merge: Finetune Diffusion Models as Multi-Task Learning
Qianli Ma
Xuefei Ning
Dongrui Liu
Li Niu
Linfeng Zhang
MoMe
44
0
0
09 Oct 2024
Diversity-Rewarded CFG Distillation
Geoffrey Cideron
A. Agostinelli
Johan Ferret
Sertan Girgin
Romuald Elie
Olivier Bachem
Sarah Perrin
Alexandre Ramé
34
2
0
08 Oct 2024
QT-DoG: Quantization-aware Training for Domain Generalization
Saqib Javed
Hieu Le
Mathieu Salzmann
OOD
MQ
26
1
0
08 Oct 2024
Generalizing to any diverse distribution: uniformity, gentle finetuning and rebalancing
Andreas Loukas
Karolis Martinkus
Ed Wagstaff
Kyunghyun Cho
OOD
15
1
0
08 Oct 2024
Hyper Adversarial Tuning for Boosting Adversarial Robustness of Pretrained Large Vision Models
Kangtao Lv
Huangsen Cao
Kainan Tu
Yihuai Xu
Zhimeng Zhang
Xin Ding
Yongwei Wang
MoMe
AAML
VLM
14
1
0
08 Oct 2024
Efficient Few-shot Learning for Multi-label Classification of Scientific Documents with Many Classes
Tim Schopf
Alexander Blatzheim
Nektarios Machner
Florian Matthes
VLM
18
1
0
08 Oct 2024
Wolf2Pack: The AutoFusion Framework for Dynamic Parameter Fusion
Bowen Tian
Songning Lai
Yutao Yue
MoMe
20
0
0
08 Oct 2024
NegMerge: Consensual Weight Negation for Strong Machine Unlearning
Hyoseo Kim
Dongyoon Han
Junsuk Choe
MoMe
MU
18
1
0
08 Oct 2024
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
Xinyu Zhao
Guoheng Sun
Ruisi Cai
Yukun Zhou
Pingzhi Li
...
Binhang Yuan
Hongyi Wang
Ang Li
Zhangyang Wang
Tianlong Chen
MoMe
ALM
26
2
0
07 Oct 2024
Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models
Theo Putterman
Derek Lim
Yoav Gelberg
Stefanie Jegelka
Haggai Maron
AI4CE
43
5
0
05 Oct 2024
Learning Code Preference via Synthetic Evolution
Jiawei Liu
Thanh Nguyen
Mingyue Shang
Hantian Ding
Xiaopeng Li
Yu Yu
Varun Kumar
Zijian Wang
SyDa
ALM
AAML
26
3
0
04 Oct 2024
What Matters for Model Merging at Scale?
Prateek Yadav
Tu Vu
Jonathan Lai
Alexandra Chronopoulou
Manaal Faruqui
Mohit Bansal
Tsendsuren Munkhdalai
MoMe
44
12
0
04 Oct 2024
Parameter Competition Balancing for Model Merging
Guodong Du
Junlin Lee
Jing Li
Runhua Jiang
Yifei Guo
...
Hanting Liu
S. Goh
Ho-Kin Tang
Daojing He
Min Zhang
MoMe
19
10
0
03 Oct 2024
DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation
Changdae Oh
Yixuan Li
Kyungwoo Song
Sangdoo Yun
Dongyoon Han
OOD
MoMe
36
4
0
03 Oct 2024
Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging
Tingfeng Hui
Zhenyu Zhang
Shuohuan Wang
Yu Sun
Hua-Hong Wu
Sen Su
MoE
16
0
0
02 Oct 2024
Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
Edan Kinderman
Itay Hubara
Haggai Maron
Daniel Soudry
MoMe
45
0
0
02 Oct 2024
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
Lucas Bandarkar
Benjamin Muller
Pritish Yuvraj
Rui Hou
Nayan Singhal
Hongjiang Lv
Bing-Quan Liu
KELM
LRM
MoMe
30
2
0
02 Oct 2024
Mitigating Training Imbalance in LLM Fine-Tuning via Selective Parameter Merging
Yiming Ju
Ziyi Ni
Xingrun Xing
Zhixiong Zeng
Hanyu Zhao
Siqi Fan
Zheng Zhang
MoMe
24
2
0
01 Oct 2024
Dual Consolidation for Pre-Trained Model-Based Domain-Incremental Learning
Da-Wei Zhou
Zi-Wen Cai
Han-Jia Ye
Lijun Zhang
De-Chuan Zhan
CLL
AI4CE
41
2
0
01 Oct 2024
The Construction of Instruction-tuned LLMs for Finance without Instruction Data Using Continual Pretraining and Model Merging
Masanori Hirano
Kentaro Imajo
MoMe
24
1
0
30 Sep 2024
HM3: Heterogeneous Multi-Class Model Merging
Stefan Hackmann
MoMe
25
0
0
27 Sep 2024
HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
Yu Zhou
Xingyu Wu
Jibin Wu
Liang Feng
Kay Chen Tan
MoMe
59
0
0
27 Sep 2024
Towards Diverse Device Heterogeneous Federated Learning via Task Arithmetic Knowledge Integration
Mahdi Morafah
Vyacheslav Kungurtsev
Hojin Chang
C. L. P. Chen
Bill Lin
FedML
24
0
0
27 Sep 2024
The Hard Positive Truth about Vision-Language Compositionality
Amita Kamath
Cheng-Yu Hsieh
Kai-Wei Chang
Ranjay Krishna
CLIP
CoGe
VLM
25
5
0
26 Sep 2024
Scalable Ensemble Diversification for OOD Generalization and Detection
Alexander Rubinstein
Luca Scimeca
Damien Teney
Seong Joon Oh
BDL
OOD
344
1
0
25 Sep 2024
Layer-wise Model Merging for Unsupervised Domain Adaptation in Segmentation Tasks
Roberto Alcover-Couso
Juan C. Sanmiguel
Marcos Escudero-Viñolo
Jose M. Martínez
FedML
MoMe
23
1
0
24 Sep 2024
Flat-LoRA: Low-Rank Adaption over a Flat Loss Landscape
Tao Li
Zhengbao He
Yujun Li
Yasheng Wang
Lifeng Shang
X. Huang
49
0
0
22 Sep 2024
LPT++: Efficient Training on Mixture of Long-tailed Experts
Bowen Dong
Pan Zhou
W. Zuo
VLM
34
0
0
17 Sep 2024
Towards understanding evolution of science through language model series
Junjie Dong
Zhuoqi Lyu
Qing Ke
AI4TS
28
0
0
15 Sep 2024
Minimizing Embedding Distortion for Robust Out-of-Distribution Performance
Tom Shaked
Yuval Goldman
Oran Shayer
OODD
18
0
0
11 Sep 2024
Self-Masking Networks for Unsupervised Adaptation
Alfonso Taboada Warmerdam
Mathilde Caron
Yuki M. Asano
29
1
0
11 Sep 2024
POINTS: Improving Your Vision-language Model with Affordable Strategies
Yuan Liu
Zhongyin Zhao
Ziyuan Zhuang
Le Tian
Xiao Zhou
Jie Zhou
VLM
35
5
0
07 Sep 2024
C2F-CHART: A Curriculum Learning Approach to Chart Classification
Nour Shaheen
Tamer Elsharnouby
Marwan Torki
18
2
0
07 Sep 2024
Recent Advances in Attack and Defense Approaches of Large Language Models
Jing Cui
Yishi Xu
Zhewei Huang
Shuchang Zhou
Jianbin Jiao
Junge Zhang
PILM
AAML
47
1
0
05 Sep 2024
Erasure Coded Neural Network Inference via Fisher Averaging
Divyansh Jhunjhunwala
Neharika Jali
Gauri Joshi
Shiqiang Wang
MoMe
FedML
21
1
0
02 Sep 2024
Expanding on EnCLAP with Auxiliary Retrieval Model for Automated Audio Captioning
Jaeyeon Kim
Jaeyoon Jung
Minjeong Jeon
Sang Hoon Woo
Jinjoo Lee
24
1
0
02 Sep 2024
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
T. Dao
Thuan Hoang Nguyen
T. Le
D. Vu
Khoi Nguyen
Cuong Pham
Anh Tran
DiffM
29
11
0
26 Aug 2024
Improving the Classification Effect of Clinical Images of Diseases for Multi-Source Privacy Protection
Tian Bowen
Xu Zhengyang
Yin Zhihao
Wang Jingying
Yue Yutao
FedML
24
0
0
23 Aug 2024