Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

10 March 2022
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
Ari S. Morcos
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
    MoMe
arXiv: 2203.05482

Papers citing "Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time"

50 / 667 papers shown
Title
IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks
Zitong Huang
Ze Chen
Bowen Dong
Chaoqi Liang
Erjin Zhou
Wangmeng Zuo
MoMe
CLL
44
3
0
25 Apr 2024
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
Yifeng Ding
Jiawei Liu
Yuxiang Wei
Terry Yue Zhuo
Lingming Zhang
ALM
MoE
40
3
0
23 Apr 2024
Advances and Open Challenges in Federated Learning with Foundation Models
Chao Ren
Han Yu
Hongyi Peng
Xiaoli Tang
Anran Li
...
A. Tan
Bo Zhao
Xiaoxiao Li
Zengxiang Li
Qiang Yang
FedML
AIFin
AI4CE
68
6
0
23 Apr 2024
A Survey on Self-Evolution of Large Language Models
Zhengwei Tao
Ting-En Lin
Xiancai Chen
Hangyu Li
Yuchuan Wu
Yongbin Li
Zhi Jin
Fei Huang
Dacheng Tao
Jingren Zhou
LRM
LM&Ro
49
21
0
22 Apr 2024
DynaMMo: Dynamic Model Merging for Efficient Class Incremental Learning for Medical Images
Mohammad Areeb Qazi
Ibrahim Almakky
Anees Ur Rehman Hashmi
Santosh Sanjeev
Mohammad Yaqub
MoMe
29
3
0
22 Apr 2024
One-Shot Sequential Federated Learning for Non-IID Data by Enhancing Local Model Diversity
Naibo Wang
Yuchen Deng
Wenjie Feng
Shichen Fan
Jianwei Yin
See-Kiong Ng
FedML
24
5
0
18 Apr 2024
In-Context Learning State Vector with Inner and Momentum Optimization
Dongfang Li
Zhenyu Liu
Xinshuo Hu
Zetian Sun
Baotian Hu
Min Zhang
24
5
0
17 Apr 2024
Stepwise Alignment for Constrained Language Model Policy Optimization
Akifumi Wachi
Thien Q. Tran
Rei Sato
Takumi Tanabe
Yohei Akimoto
34
5
0
17 Apr 2024
AdapterSwap: Continuous Training of LLMs with Data Removal and Access-Control Guarantees
William Fleshman
Aleem Khan
Marc Marone
Benjamin Van Durme
CLL
KELM
42
3
0
12 Apr 2024
DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models
Nastaran Saadati
Minh Pham
Nasla Saleem
Joshua R. Waite
Aditya Balu
Zhanhong Jiang
Chinmay Hegde
Soumik Sarkar
MoMe
35
1
0
11 Apr 2024
Post-Hoc Reversal: Are We Selecting Models Prematurely?
Rishabh Ranjan
Saurabh Garg
Mrigank Raman
Carlos Guestrin
Zachary Chase Lipton
32
0
0
11 Apr 2024
Scalable Language Model with Generalized Continual Learning
Bohao Peng
Zhuotao Tian
Shu Liu
Mingchang Yang
Jiaya Jia
ALM
CLL
KELM
16
12
0
11 Apr 2024
Continuous Language Model Interpolation for Dynamic and Controllable Text Generation
Sara Kangaslahti
David Alvarez-Melis
KELM
29
0
0
10 Apr 2024
Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging
Tianshuo Cong
Delong Ran
Zesen Liu
Xinlei He
Jinyuan Liu
Yichen Gong
Qi Li
Anyu Wang
Xiaoyun Wang
MoMe
38
7
0
08 Apr 2024
Lossless and Near-Lossless Compression for Foundation Models
Moshik Hershcovitch
Leshem Choshen
Andrew Wood
Ilias Enmouri
Peter Chin
S. Sundararaman
Danny Harnik
42
5
0
05 Apr 2024
Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model
Qinfeng Zhu
Yuanzhi Cai
Yuan-Sheng Fang
Yihan Yang
Cheng Chen
Lei Fan
Anh Nguyen
Mamba
43
55
0
02 Apr 2024
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
En-hao Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Shuaiqi Wang
...
Sergey Yekhanin
Guohao Dai
Huazhong Yang
Yu-Xiang Wang
Yu Wang
MoMe
55
4
0
02 Apr 2024
Lipsum-FT: Robust Fine-Tuning of Zero-Shot Models Using Random Text Guidance
G. Nam
Byeongho Heo
Juho Lee
VLM
29
5
0
01 Apr 2024
Embracing Unknown Step by Step: Towards Reliable Sparse Training in Real World
Bowen Lei
Dongkuan Xu
Ruqi Zhang
Bani Mallick
UQCV
29
0
0
29 Mar 2024
Diverse Feature Learning by Self-distillation and Reset
Sejik Park
CLL
32
1
0
29 Mar 2024
Model Stock: All we need is just a few fine-tuned models
Dong-Hwan Jang
Sangdoo Yun
Dongyoon Han
OODD
MoMe
27
38
0
28 Mar 2024
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
Deyuan Liu
Zecheng Wang
Bingning Wang
Weipeng Chen
Chunshan Li
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
MoMe
31
14
0
28 Mar 2024
On the Benefits of Over-parameterization for Out-of-Distribution Generalization
Yifan Hao
Yong Lin
Difan Zou
Tong Zhang
OODD
OOD
21
4
0
26 Mar 2024
Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion
S. Casarin
C. Ugwu
Sergio Escalera
O. Lanz
28
0
0
22 Mar 2024
FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis
Santosh Sanjeev
Nuren Zhaksylyk
Ibrahim Almakky
Anees Ur Rehman Hashmi
Mohammad Areeb Qazi
Mohammad Yaqub
28
3
0
20 Mar 2024
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard
Shamane Siriwardhana
Malikeh Ehghaghi
Luke Meyers
Vladimir Karpukhin
Brian Benedict
Mark McQuade
Jacob Solawetz
MoMe
KELM
80
76
0
20 Mar 2024
Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts
Sai Ashish Somayajula
Youwei Liang
Abhishek Singh
Li Zhang
Pengtao Xie
22
2
0
19 Mar 2024
MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation
Chih-Chung Hsu
Chia-Ming Lee
VLM
22
1
0
18 Mar 2024
Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes
Chih-Chung Hsu
Chia-Ming Lee
Ming-Shyen Wu
VLM
25
1
0
18 Mar 2024
MedMerge: Merging Models for Effective Transfer Learning to Medical Imaging Tasks
Ibrahim Almakky
Santosh Sanjeev
Anees Ur Rehman Hashmi
Mohammad Areeb Qazi
Mohammad Yaqub
FedML
MoMe
69
3
0
18 Mar 2024
Fisher Mask Nodes for Language Model Merging
Thennal D K
Ganesh Nathan
Suchithra M S
MoMe
AI4CE
40
3
0
14 Mar 2024
DAM: Dynamic Adapter Merging for Continual Video QA Learning
Feng Cheng
Ziyang Wang
Yi-Lin Sung
Yan-Bo Lin
Mohit Bansal
Gedas Bertasius
CLL
MoMe
31
10
0
13 Mar 2024
Human Alignment of Large Language Models through Online Preference Optimisation
Daniele Calandriello
Daniel Guo
Rémi Munos
Mark Rowland
Yunhao Tang
...
Michal Valko
Tianqi Liu
Rishabh Joshi
Zeyu Zheng
Bilal Piot
44
60
0
13 Mar 2024
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Sainbayar Sukhbaatar
O. Yu. Golovneva
Vasu Sharma
Hu Xu
Xi Victoria Lin
...
Jacob Kahn
Shang-Wen Li
Wen-tau Yih
Jason Weston
Xian Li
MoMe
OffRL
MoE
30
60
0
12 Mar 2024
Enhancing Transfer Learning with Flexible Nonparametric Posterior Sampling
Hyungi Lee
G. Nam
Edwin Fong
Juho Lee
BDL
20
5
0
12 Mar 2024
Bridging Domains with Approximately Shared Features
Ziliang Samuel Zhong
Xiang Pan
Qi Lei
OOD
21
1
0
11 Mar 2024
Select High-Level Features: Efficient Experts from a Hierarchical Classification Network
A. Kelm
Niels Hannemann
Bruno Heberle
Lucas Schmidt
Tim Rolff
Christian Wilms
Ehsan Yaghoubi
Simone Frintrop
16
0
0
08 Mar 2024
Online Adaptation of Language Models with a Memory of Amortized Contexts
Jihoon Tack
Jaehyung Kim
Eric Mitchell
Jinwoo Shin
Yee Whye Teh
Jonathan Richard Schwarz
KELM
40
18
0
07 Mar 2024
Revisiting Confidence Estimation: Towards Reliable Failure Prediction
Fei Zhu
Xu-Yao Zhang
Zhen Cheng
Cheng-Lin Liu
UQCV
42
10
0
05 Mar 2024
Controllable Prompt Tuning For Balancing Group Distributional Robustness
Hoang Phan
Andrew Gordon Wilson
Qi Lei
36
5
0
05 Mar 2024
A Survey on Evaluation of Out-of-Distribution Generalization
Han Yu
Jiashuo Liu
Xingxuan Zhang
Jiayun Wu
Peng Cui
OOD
42
9
0
04 Mar 2024
Merging Text Transformer Models from Different Initializations
Neha Verma
Maha Elbayad
MoMe
48
7
0
01 Mar 2024
Fine-tuning with Very Large Dropout
Jianyu Zhang
Léon Bottou
37
1
0
01 Mar 2024
Here's a Free Lunch: Sanitizing Backdoored Models with Model Merge
Ansh Arora
Xuanli He
Maximilian Mozes
Srinibas Swain
Mark Dras
Qiongkai Xu
SILM
MoMe
AAML
54
12
0
29 Feb 2024
AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging
Yiran Zhao
Wenxuan Zhang
Huiming Wang
Kenji Kawaguchi
Lidong Bing
MoMe
32
15
0
29 Feb 2024
Adversarial Example Soups: Improving Transferability and Stealthiness for Free
Bo Yang
Hengwei Zhang
Jin-dong Wang
Yulong Yang
Chenhao Lin
Chao Shen
Zhengyu Zhao
SILM
AAML
57
1
0
27 Feb 2024
Training Neural Networks from Scratch with Parallel Low-Rank Adapters
Minyoung Huh
Brian Cheung
Jeremy Bernstein
Phillip Isola
Pulkit Agrawal
25
10
0
26 Feb 2024
Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy?
Nader Asadi
Mahdi Beitollahi
Yasser H. Khalil
Yinchuan Li
Guojun Zhang
Xi Chen
MoMe
25
8
0
23 Feb 2024
OPDAI at SemEval-2024 Task 6: Small LLMs can Accelerate Hallucination Detection with Weakly Supervised Data
Chengcheng Wei
Ze Chen
Songtan Fang
Jiarong He
Max Gao
21
3
0
20 Feb 2024
Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning
Jihai Zhang
Xiang Lan
Xiaoye Qu
Yu Cheng
Mengling Feng
Bryan Hooi
SSL
19
4
0
19 Feb 2024