ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.05482
  4. Cited By
Model soups: averaging weights of multiple fine-tuned models improves
  accuracy without increasing inference time

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

10 March 2022
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
Ari S. Morcos
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
    MoMe
ArXivPDFHTML

Papers citing "Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time"

50 / 667 papers shown
Title
Weight Scope Alignment: A Frustratingly Easy Method for Model Merging
Weight Scope Alignment: A Frustratingly Easy Method for Model Merging
Yichu Xu
Xin-Chun Li
Le Gan
De-Chuan Zhan
MoMe
25
0
0
22 Aug 2024
Approaching Deep Learning through the Spectral Dynamics of Weights
Approaching Deep Learning through the Spectral Dynamics of Weights
David Yunis
Kumar Kshitij Patel
Samuel Wheeler
Pedro H. P. Savarese
Gal Vardi
Karen Livescu
Michael Maire
Matthew R. Walter
37
3
0
21 Aug 2024
MergeRepair: An Exploratory Study on Merging Task-Specific Adapters in
  Code LLMs for Automated Program Repair
MergeRepair: An Exploratory Study on Merging Task-Specific Adapters in Code LLMs for Automated Program Repair
Meghdad Dehghan
Jie JW Wu
Fatemeh H. Fard
Ali Ouni
MoMe
35
2
0
18 Aug 2024
Activated Parameter Locating via Causal Intervention for Model Merging
Activated Parameter Locating via Causal Intervention for Model Merging
Fanshuang Kong
Richong Zhang
Ziqiao Wang
MoMe
11
1
0
18 Aug 2024
Learning to Route for Dynamic Adapter Composition in Continual Learning
  with Language Models
Learning to Route for Dynamic Adapter Composition in Continual Learning with Language Models
Vladimir Araujo
Marie-Francine Moens
Tinne Tuytelaars
CLL
MoMe
21
2
0
16 Aug 2024
Efficient and Versatile Robust Fine-Tuning of Zero-shot Models
Efficient and Versatile Robust Fine-Tuning of Zero-shot Models
Sungyeon Kim
Boseung Jeong
Donghyun Kim
Suha Kwak
VLM
26
2
0
11 Aug 2024
UNIC: Universal Classification Models via Multi-teacher Distillation
UNIC: Universal Classification Models via Multi-teacher Distillation
Mert Bulent Sariyildiz
Philippe Weinzaepfel
Thomas Lucas
Diane Larlus
Yannis Kalantidis
23
6
0
09 Aug 2024
ProFuser: Progressive Fusion of Large Language Models
ProFuser: Progressive Fusion of Large Language Models
Tianyuan Shi
Fanqi Wan
Canbin Huang
Xiaojun Quan
Chenliang Li
Ming Yan
Ji Zhang
MoMe
21
2
0
09 Aug 2024
Extend Model Merging from Fine-Tuned to Pre-Trained Large Language
  Models via Weight Disentanglement
Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement
Le Yu
Bowen Yu
Haiyang Yu
Fei Huang
Yongbin Li
MoMe
27
5
0
06 Aug 2024
Conditional LoRA Parameter Generation
Conditional LoRA Parameter Generation
Aaron Mueller
Millicent Li
Koyena Pal
Wangbo Zhao
Yukun Zhou
Jiuding Sun
Yonatan Belinkov
DiffM
36
3
0
02 Aug 2024
POA: Pre-training Once for Models of All Sizes
POA: Pre-training Once for Models of All Sizes
Yingying Zhang
Xin Guo
Jiangwei Lao
Lei Yu
Lixiang Ru
Jian Wang
Guo Ye
Huimei He
Jingdong Chen
Ming Yang
53
1
0
02 Aug 2024
Machine Unlearning in Generative AI: A Survey
Machine Unlearning in Generative AI: A Survey
Zheyuan Liu
Guangyao Dou
Zhaoxuan Tan
Yijun Tian
Meng-Long Jiang
MU
31
13
0
30 Jul 2024
MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning
MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning
Yupeng Chen
Senmiao Wang
Zhihang Lin
Zhihang Lin
Yushun Zhang
Tian Ding
Ruoyu Sun
Ruoyu Sun
CLL
72
1
0
30 Jul 2024
A Method for Fast Autonomy Transfer in Reinforcement Learning
A Method for Fast Autonomy Transfer in Reinforcement Learning
D. Sahabandu
Bhaskar Ramasubramanian
M. Alexiou
J. S. Mertoguno
L. Bushnell
Radha Poovendran
18
0
0
29 Jul 2024
Strong Copyright Protection for Language Models via Adaptive Model
  Fusion
Strong Copyright Protection for Language Models via Adaptive Model Fusion
Javier Abad
Konstantin Donhauser
Francesco Pinto
Fanny Yang
35
4
0
29 Jul 2024
Cool-Fusion: Fuse Large Language Models without Training
Cool-Fusion: Fuse Large Language Models without Training
Cong Liu
Xiaojun Quan
Yan Pan
Liangzhi Li
Weigang Wu
Xu Chen
VLM
MoMe
46
3
0
29 Jul 2024
Logifold: A Geometrical Foundation of Ensemble Machine Learning
Logifold: A Geometrical Foundation of Ensemble Machine Learning
Inkee Jung
Siu-Cheong Lau
FedML
AI4CE
25
1
0
23 Jul 2024
Computer Audition: From Task-Specific Machine Learning to Foundation
  Models
Computer Audition: From Task-Specific Machine Learning to Foundation Models
Andreas Triantafyllopoulos
Iosif Tsangko
Alexander Gebhard
A. Mesaros
Tuomas Virtanen
Björn Schuller
39
4
0
22 Jul 2024
Can VLMs be used on videos for action recognition? LLMs are Visual
  Reasoning Coordinators
Can VLMs be used on videos for action recognition? LLMs are Visual Reasoning Coordinators
Harsh Lunia
32
0
0
20 Jul 2024
Antibody DomainBed: Out-of-Distribution Generalization in Therapeutic
  Protein Design
Antibody DomainBed: Out-of-Distribution Generalization in Therapeutic Protein Design
Natavsa Tagasovska
Ji Won Park
Matthieu Kirchmeyer
Nathan C. Frey
Andrew Watkins
...
Arian R. Jamasb
Edith Lee
Tyler Bryson
Stephen Ra
Kyunghyun Cho
OOD
26
6
0
15 Jul 2024
Team up GBDTs and DNNs: Advancing Efficient and Effective Tabular
  Prediction with Tree-hybrid MLPs
Team up GBDTs and DNNs: Advancing Efficient and Effective Tabular Prediction with Tree-hybrid MLPs
Jiahuan Yan
Jintai Chen
Qianxing Wang
D. Z. Chen
Jian Wu
24
3
0
13 Jul 2024
Seq-to-Final: A Benchmark for Tuning from Sequential Distributions to a
  Final Time Point
Seq-to-Final: A Benchmark for Tuning from Sequential Distributions to a Final Time Point
Christina X. Ji
Ahmed M. Alaa
David Sontag
OOD
42
0
0
12 Jul 2024
Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse
  Mixture-of-Experts
Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts
Zeliang Zhang
Xiaodong Liu
Hao Cheng
Chenliang Xu
Jianfeng Gao
MoE
30
9
0
12 Jul 2024
SoupLM: Model Integration in Large Language and Multi-Modal Models
SoupLM: Model Integration in Large Language and Multi-Modal Models
Yue Bai
Zichen Zhang
Jiasen Lu
Yun Fu
MoMe
22
1
0
11 Jul 2024
Scaling Up Personalized Aesthetic Assessment via Task Vector
  Customization
Scaling Up Personalized Aesthetic Assessment via Task Vector Customization
Jooyeol Yun
Jaegul Choo
MoMe
20
2
0
09 Jul 2024
MagMax: Leveraging Model Merging for Seamless Continual Learning
MagMax: Leveraging Model Merging for Seamless Continual Learning
Daniel Marczak
Bartłomiej Twardowski
Tomasz Trzciñski
Sebastian Cygert
MoMe
CLL
34
17
0
08 Jul 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in
  the Era of Large Language Models
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
29
17
0
08 Jul 2024
Harmony in Diversity: Merging Neural Networks with Canonical Correlation
  Analysis
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis
Stefan Horoi
Albert Manuel Orozco Camacho
Eugene Belilovsky
Guy Wolf
FedML
MoMe
19
9
0
07 Jul 2024
Unlocking the Potential of Model Merging for Low-Resource Languages
Unlocking the Potential of Model Merging for Low-Resource Languages
Mingxu Tao
Chen Zhang
Quzhe Huang
Tianyao Ma
Songfang Huang
Dongyan Zhao
Yansong Feng
CLL
MoMe
20
3
0
04 Jul 2024
Learning Scalable Model Soup on a Single GPU: An Efficient Subspace
  Training Strategy
Learning Scalable Model Soup on a Single GPU: An Efficient Subspace Training Strategy
Tao Li
Weisen Jiang
Fanghui Liu
X. Huang
James T. Kwok
MoMe
51
1
0
04 Jul 2024
SAFT: Towards Out-of-Distribution Generalization in Fine-Tuning
SAFT: Towards Out-of-Distribution Generalization in Fine-Tuning
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
Lukas Mauch
Marzieh Edraki
Aaron Courville
OODD
CLL
VLM
52
3
0
03 Jul 2024
Knowledge Composition using Task Vectors with Learned Anisotropic
  Scaling
Knowledge Composition using Task Vectors with Learned Anisotropic Scaling
Frederic Z. Zhang
Paul Albert
Cristian Rodriguez-Opazo
Anton van den Hengel
Ehsan Abbasnejad
MoMe
37
7
0
03 Jul 2024
DogeRM: Equipping Reward Models with Domain Knowledge through Model
  Merging
DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging
Tzu-Han Lin
Chen An Li
Hung-yi Lee
Yun-Nung Chen
VLM
ALM
26
4
0
01 Jul 2024
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models:
  Enhancing Performance and Reducing Inference Costs
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
Enshu Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Matthew B. Blaschko
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MoE
52
5
0
01 Jul 2024
An Attribute Interpolation Method in Speech Synthesis by Model Merging
An Attribute Interpolation Method in Speech Synthesis by Model Merging
Masato Murata
Koichi Miyazaki
Tomoki Koriyama
MoMe
35
3
0
30 Jun 2024
It's Morphing Time: Unleashing the Potential of Multiple LLMs via
  Multi-objective Optimization
It's Morphing Time: Unleashing the Potential of Multiple LLMs via Multi-objective Optimization
Bingdong Li
Zixiang Di
Yanting Yang
Hong Qian
Peng Yang
Hao Hao
Ke Tang
Aimin Zhou
MoMe
19
5
0
29 Jun 2024
Decoding-Time Language Model Alignment with Multiple Objectives
Decoding-Time Language Model Alignment with Multiple Objectives
Ruizhe Shi
Yifang Chen
Yushi Hu
Alisa Liu
Hannaneh Hajishirzi
Noah A. Smith
Simon Du
44
30
0
27 Jun 2024
VIPriors 4: Visual Inductive Priors for Data-Efficient Deep Learning
  Challenges
VIPriors 4: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
Robert-Jan Bruintjes
A. Lengyel
Marcos Baptista-Rios
O. Kayhan
Davide Zambrano
Nergis Tomen
J. C. V. Gemert
VLM
29
0
0
26 Jun 2024
Sequential Editing for Lifelong Training of Speech Recognition Models
Sequential Editing for Lifelong Training of Speech Recognition Models
Devang Kulshreshtha
Saket Dingliwal
Brady C. Houston
Nikolaos Pappas
S. Ronanki
KELM
CLL
21
1
0
25 Jun 2024
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Ashwinee Panda
Berivan Isik
Xiangyu Qi
Sanmi Koyejo
Tsachy Weissman
Prateek Mittal
MoMe
45
12
0
24 Jun 2024
WARP: On the Benefits of Weight Averaged Rewarded Policies
WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
47
13
0
24 Jun 2024
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer
  Merging
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
Deyuan Liu
Zhanyue Qin
Hairu Wang
Zhao Yang
Zecheng Wang
...
Zhao Lv
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
15
2
0
24 Jun 2024
DEM: Distribution Edited Model for Training with Mixed Data
  Distributions
DEM: Distribution Edited Model for Training with Mixed Data Distributions
Dhananjay Ram
Aditya Rawal
Momchil Hardalov
Nikolaos Pappas
Sheng Zha
MoMe
25
1
0
21 Jun 2024
MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
Zhuoxiao Chen
Junjie Meng
Mahsa Baktashmotlagh
Yonggang Zhang
Zi Huang
Yadan Luo
72
0
0
21 Jun 2024
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch
Hasan Hammoud
Umberto Michieli
Fabio Pizzati
Philip H. S. Torr
Adel Bibi
Bernard Ghanem
Mete Ozay
MoMe
31
14
0
20 Jun 2024
On the Utility of Domain-Adjacent Fine-Tuned Model Ensembles for
  Few-shot Problems
On the Utility of Domain-Adjacent Fine-Tuned Model Ensembles for Few-shot Problems
Md. Ibrahim Ibne Alam
Parikshit Ram
Soham Dan
Horst Samulowitz
Koushik Kar
36
0
0
19 Jun 2024
Knowledge Fusion By Evolving Weights of Language Models
Knowledge Fusion By Evolving Weights of Language Models
Guodong Du
Jing Li
Hanting Liu
Runhua Jiang
Shuyang Yu
Yifei Guo
S. Goh
Ho-Kin Tang
MoMe
33
8
0
18 Jun 2024
Self-MoE: Towards Compositional Large Language Models with
  Self-Specialized Experts
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
Junmo Kang
Leonid Karlinsky
Hongyin Luo
Zhen Wang
Jacob A. Hansen
James Glass
David D. Cox
Rameswar Panda
Rogerio Feris
Alan Ritter
MoMe
MoE
34
8
0
17 Jun 2024
On Efficient Language and Vision Assistants for Visually-Situated
  Natural Language Understanding: What Matters in Reading and Reasoning
On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning
Geewook Kim
Minjoon Seo
VLM
29
2
0
17 Jun 2024
MetaGPT: Merging Large Language Models Using Model Exclusive Task
  Arithmetic
MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic
Yuyan Zhou
Liang Song
Bingning Wang
Weipeng Chen
MoMe
28
15
0
17 Jun 2024
Previous
123456...121314
Next