arXiv: 2203.05482

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

10 March 2022
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
Ari S. Morcos
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
    MoMe

Papers citing "Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time"

50 / 667 papers shown
Title
Robust Multi-Objective Controlled Decoding of Large Language Models
Seongho Son
William Bankes
Sangwoong Yoon
Shyam Sundhar Ramesh
Xiaohang Tang
Ilija Bogunovic
39
0
0
11 Mar 2025
Self-supervised Normality Learning and Divergence Vector-guided Model Merging for Zero-shot Congenital Heart Disease Detection in Fetal Ultrasound Videos
Pramit Saha
Divyanshu Mishra
Netzahualcoyotl Hernandez-Cruz
Olga Patey
A. Papageorghiou
Yuki M. Asano
J. A. Noble
38
0
0
10 Mar 2025
Task Vector Quantization for Memory-Efficient Model Merging
Youngeun Kim
Seunghwan Lee
Aecheon Jung
Bogon Ryu
Sungeun Hong
MQ
MoMe
52
0
0
10 Mar 2025
Seeing Delta Parameters as JPEG Images: Data-Free Delta Compression with Discrete Cosine Transform
C. Huang
Peng Ye
X. Wang
Shenghe Zheng
Biqing Qi
Lei Bai
Wanli Ouyang
Tao Chen
31
0
0
09 Mar 2025
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
Liang Lin
LRM
52
2
0
08 Mar 2025
Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy
Wei Junhao
Yu Zhe
Sakuma Jun
AAML
MoMe
54
0
0
08 Mar 2025
SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting
Linqi Yang
Xiongwei Zhao
Qihao Sun
Ke Wang
Ao Chen
Peng Kang
3DGS
73
0
0
07 Mar 2025
Extrapolation Merging: Keep Improving With Extrapolation and Merging
Yiguan Lin
Bin Xu
Yinghao Li
Yang Gao
MoMe
52
1
0
05 Mar 2025
ReaderLM-v2: Small Language Model for HTML to Markdown and JSON
Feng Wang
Zesheng Shi
Bo Wang
Nan Wang
Han Xiao
RALM
70
1
0
03 Mar 2025
Multi-Level Collaboration in Model Merging
Qi Li
Runpeng Yu
Xinchao Wang
MoMe
FedML
86
0
0
03 Mar 2025
Deep Learning is Not So Mysterious or Different
Andrew Gordon Wilson
36
1
0
03 Mar 2025
Rethinking Data: Towards Better Performing Domain-Specific Small Language Models
Boris Nazarov
Darya Frolova
Yackov Lubarsky
Alexei Gaissinski
Pavel Kisilev
ALM
56
1
0
03 Mar 2025
Med-LEGO: Editing and Adapting toward Generalist Medical Image Diagnosis
Yitao Zhu
Yuan Yin
Jiaming Li
Mengjie Xu
Zihao Zhao
Honglin Xiong
Sheng Wang
Qian Wang
MedIm
65
0
0
03 Mar 2025
Efficiently Editing Mixture-of-Experts Models with Compressed Experts
Y. He
Yang Liu
Chen Liang
Hany Awadalla
MoE
55
1
0
01 Mar 2025
Robust Multi-Objective Preference Alignment with Online DPO
Raghav Gupta
Ryan Sullivan
Yunxuan Li
Samrat Phatale
Abhinav Rastogi
32
0
0
01 Mar 2025
BadJudge: Backdoor Vulnerabilities of LLM-as-a-Judge
Terry Tong
Fei-Yue Wang
Zhe Zhao
M. Chen
AAML
ELM
37
1
0
01 Mar 2025
In-Model Merging for Enhancing the Robustness of Medical Imaging Classification Models
Hu Wang
Ibrahim Almakky
Congbo Ma
Numan Saeed
Mohammad Yaqub
MoMe
64
0
0
27 Feb 2025
Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge
Yan-Lun Chen
Yi-Ru Wei
Chia-Yi Hsu
Chia-Mu Yu
Chun-ying Huang
Ying-Dar Lin
Yu-Sung Wu
Wei-Bin Lee
MoMe
KELM
48
0
0
27 Feb 2025
CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging
Zongzhen Yang
Binhang Qi
Hailong Sun
Wenrui Long
Ruobing Zhao
Xiang Gao
MoMe
48
0
0
26 Feb 2025
Enhancing Image Classification with Augmentation: Data Augmentation Techniques for Improved Image Classification
Saorj Kumar
Prince Asiamah
Oluwatoyin Jolaoso
Ugochukwu Esiowu
55
0
0
25 Feb 2025
Faster, Cheaper, Better: Multi-Objective Hyperparameter Optimization for LLM and RAG Systems
Matthew Barker
Andrew Bell
Evan Thomas
James Carr
Thomas Andrews
Umang Bhatt
80
1
0
25 Feb 2025
LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint
Qianli Ma
Dongrui Liu
Qian Chen
Linfeng Zhang
Jing Shao
MoMe
67
0
0
24 Feb 2025
Low-rank bias, weight decay, and model merging in neural networks
Ilja Kuzborskij
Yasin Abbasi-Yadkori
47
0
0
24 Feb 2025
Mixup Model Merge: Enhancing Model Merging Performance through Randomized Linear Interpolation
Yue Zhou
Yi-Ju Chang
Yuan Wu
MoMe
57
2
0
24 Feb 2025
PICASO: Permutation-Invariant Context Composition with State Space Models
Tian Yu Liu
Alessandro Achille
Matthew Trager
Aditya Golatkar
L. Zancato
Stefano Soatto
LRM
58
0
0
24 Feb 2025
Dynamic LLM Routing and Selection based on User Preferences: Balancing Performance, Cost, and Ethics
Deepak Babu Piskala
Vijay Raajaa
Sachin Mishra
Bruno Bozza
37
1
0
23 Feb 2025
MedForge: Building Medical Foundation Models Like Open Source Software Development
Zheling Tan
Kexin Ding
Jin Gao
Mu Zhou
Dimitris N. Metaxas
Shaoting Zhang
Dequan Wang
AI4CE
45
1
0
22 Feb 2025
Recurrent Knowledge Identification and Fusion for Language Model Continual Learning
Yujie Feng
Xujia Wang
Zexin Lu
Shenghong Fu
Guangyuan Shi
Yongxin Xu
Yasha Wang
Philip S. Yu
Xu Chu
Xiao-Ming Wu
CLL
KELM
41
1
0
22 Feb 2025
Merger-as-a-Stealer: Stealing Targeted PII from Aligned LLMs with Model Merging
Lin Lu
Zhigang Zuo
Ziji Sheng
Pan Zhou
MoMe
48
0
0
22 Feb 2025
MoMa: A Modular Deep Learning Framework for Material Property Prediction
Botian Wang
Y. Ouyang
Yaohui Li
Y. Wang
Haorui Cui
Jianbing Zhang
Xiaonan Wang
Wei-Ying Ma
Hao Zhou
44
0
0
21 Feb 2025
Optimizing Pre-Training Data Mixtures with Mixtures of Data Expert Models
Lior Belenki
Alekh Agarwal
Tianze Shi
Kristina Toutanova
MoE
46
0
0
21 Feb 2025
Sparsity May Be All You Need: Sparse Random Parameter Adaptation
Jesus Rios
Pierre L. Dognin
Ronny Luss
K. Ramamurthy
27
1
0
21 Feb 2025
Robust Concept Erasure Using Task Vectors
Minh Pham
Kelly O. Marshall
Chinmay Hegde
Niv Cohen
112
17
0
21 Feb 2025
Secure and Efficient Watermarking for Latent Diffusion Models in Model Distribution Scenarios
Liangqi Lei
Keke Gai
Jing Yu
Liehuang Zhu
Qi Wu
WIGM
58
0
0
18 Feb 2025
SuperMerge: An Approach For Gradient-Based Model Merging
Haoyu Yang
Zheng Zhang
Saket Sathe
MoMe
125
0
0
17 Feb 2025
Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging
Zhixiang Wang
Zhenyu Mao
Yixuan Qiao
Yunfang Wu
Biye Li
MoMe
73
0
0
17 Feb 2025
Linear Mode Connectivity in Differentiable Tree Ensembles
Ryuichi Kanoh
M. Sugiyama
60
1
0
17 Feb 2025
Forget the Data and Fine-Tuning! Just Fold the Network to Compress
Dong Wang
Haris Šikić
Lothar Thiele
O. Saukh
44
0
0
17 Feb 2025
Be Cautious When Merging Unfamiliar LLMs: A Phishing Model Capable of Stealing Privacy
Zhenyuan Guo
Yi Shi
Wenlong Meng
Chen Gong
Chengkun Wei
Wenzhi Chen
MoMe
60
0
0
17 Feb 2025
1bit-Merging: Dynamic Quantized Merging for Large Language Models
Shuqi Liu
Han Wu
Bowei He
Zehua Liu
Xiongwei Han
M. Yuan
Linqi Song
MoMe
MQ
61
1
0
15 Feb 2025
Superpose Singular Features for Model Merging
Haiquan Qiu
You Wu
Quanming Yao
MoMe
43
0
0
15 Feb 2025
LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model Merging
Zehua Liu
Han Wu
Yuxuan Yao
Ruifeng She
Xiongwei Han
Tao Zhong
M. Yuan
MoMe
38
1
0
15 Feb 2025
Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding
Z. Wang
Muneeza Azmart
Ang Li
R. Horesh
Mikhail Yurochkin
107
1
0
11 Feb 2025
When, Where and Why to Average Weights?
Niccolò Ajroldi
Antonio Orvieto
Jonas Geiping
MoMe
81
0
0
10 Feb 2025
Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble
Atsushi Nitanda
Anzelle Lee
Damian Tan Xing Kai
Mizuki Sakaguchi
Taiji Suzuki
AI4CE
53
1
0
09 Feb 2025
MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs
Yuhang Zhou
Giannis Karamanolakis
Victor Soto
Anna Rumshisky
Mayank Kulkarni
Furong Huang
Wei Ai
Jianhua Lu
MoMe
101
0
0
03 Feb 2025
Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
Pierre Ablin
Angelos Katharopoulos
Skyler Seto
David Grangier
MoMe
45
0
0
03 Feb 2025
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Jundong Li
52
0
0
01 Feb 2025
Learning Priors of Human Motion With Vision Transformers
Placido Falqueto
Alberto Sanfeliu
Luigi Palopoli
Daniele Fontanelli
ViT
145
0
0
30 Jan 2025
Evolutionary Optimization of Model Merging Recipes
Takuya Akiba
Makoto Shing
Yujin Tang
Qi Sun
David Ha
MoMe
98
97
0
28 Jan 2025