ResearchTrend.AI
arXiv: 2203.05482
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

10 March 2022
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
Ari S. Morcos
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
    MoMe

Papers citing "Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time"

50 / 667 papers shown
On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion
Chenghao Fan
Zhenyi Lu
Wei Wei
Jie Tian
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
44
5
0
17 Jun 2024
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
Zhenyi Lu
Chenghao Fan
Wei Wei
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
42
48
0
17 Jun 2024
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead
Rickard Brüel-Gabrielsson
Jiacheng Zhu
Onkar Bhardwaj
Leshem Choshen
Kristjan Greenewald
Mikhail Yurochkin
Justin Solomon
28
5
0
17 Jun 2024
Scale Equivariant Graph Metanetworks
Ioannis Kalogeropoulos
Giorgos Bouritsas
Yannis Panagakis
42
6
0
15 Jun 2024
Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts
Zhaoxuan Tan
Zheyuan Liu
Meng-Long Jiang
32
20
0
15 Jun 2024
Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion
A. Tang
Li Shen
Yong Luo
Shiwei Liu
Han Hu
Bo Du
MoMe
21
6
0
14 Jun 2024
Interpreting the Weight Space of Customized Diffusion Models
Amil Dravid
Yossi Gandelsman
Kuan-Chieh Jackson Wang
Rameen Abdal
Gordon Wetzstein
Alexei A. Efros
Kfir Aberman
34
9
0
13 Jun 2024
Enhancing Domain Adaptation through Prompt Gradient Alignment
Hoang Phan
Lam C. Tran
Quyen Tran
Trung Le
52
0
0
13 Jun 2024
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
Benjamin Biggs
Arjun Seshadri
Yang Zou
Achin Jain
Aditya Golatkar
Yusheng Xie
Alessandro Achille
Ashwin Swaminathan
Stefano Soatto
MoMe
DiffM
33
10
0
12 Jun 2024
State Soup: In-Context Skill Learning, Retrieval and Mixing
Maciej Pióro
Maciej Wołczyk
Razvan Pascanu
J. Oswald
João Sacramento
18
1
0
12 Jun 2024
Merging Improves Self-Critique Against Jailbreak Attacks
Victor Gallego
AAML
MoMe
36
3
0
11 Jun 2024
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
Lu Li
T. Zhang
Zhiqi Bu
Suyuchen Wang
Huan He
Jie Fu
Yonghui Wu
Jiang Bian
Yong Chen
Yoshua Bengio
FedML
MoMe
92
3
0
11 Jun 2024
FusionBench: A Comprehensive Benchmark of Deep Model Fusion
A. Tang
Li Shen
Yong Luo
Han Hu
Bo Du
Dacheng Tao
ELM
MoMe
VLM
34
19
0
05 Jun 2024
HPE-CogVLM: New Head Pose Grounding Task Exploration on Vision Language Model
Yu Tian
Tianqi Shao
Tsukasa Demizu
Xuyang Wu
Hsin-Tai Wu
24
3
0
04 Jun 2024
Pretrained Hybrids with MAD Skills
Nicholas Roberts
Samuel Guo
Zhiqi Gao
Satya Sai Srinath Namburi
Sonia Cromp
Chengjun Wu
Chengyu Duan
Frederic Sala
Mamba
35
0
0
02 Jun 2024
On the Use of Anchoring for Training Vision Models
V. Narayanaswamy
Kowshik Thopalli
Rushil Anirudh
Yamen Mubarka
W. Sakla
Jayaraman J. Thiagarajan
32
0
0
01 Jun 2024
Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles
Jiesong Lian
Yucong Huang
Chengdong Ma
Mingzhi Wang
Ying Wen
Long Hu
Yixue Hao
57
0
0
31 May 2024
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models
Chen Zhang
Chengguang Tang
Dading Chong
Ke Shi
Guohua Tang
Feng Jiang
Haizhou Li
27
4
0
30 May 2024
Weights Augmentation: it has never ever ever ever let her model down
Junbin Zhuang
Guiguang Din
Yunyi Yan
16
1
0
30 May 2024
Offline Regularised Reinforcement Learning for Large Language Models Alignment
Pierre Harvey Richemond
Yunhao Tang
Daniel Guo
Daniele Calandriello
M. G. Azar
...
Gil Shamir
Rishabh Joshi
Tianqi Liu
Rémi Munos
Bilal Piot
OffRL
40
21
0
29 May 2024
Why are Visually-Grounded Language Models Bad at Image Classification?
Yuhui Zhang
Alyssa Unell
Xiaohan Wang
Dhruba Ghosh
Yuchang Su
Ludwig Schmidt
Serena Yeung-Levy
VLM
35
27
0
28 May 2024
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Alexander Hägele
Elie Bakouch
Atli Kosson
Loubna Ben Allal
Leandro von Werra
Martin Jaggi
36
33
0
28 May 2024
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
Keming Lu
Bowen Yu
Fei Huang
Yang Fan
Runji Lin
Chang Zhou
MoMe
24
18
0
28 May 2024
PTM-VQA: Efficient Video Quality Assessment Leveraging Diverse PreTrained Models from the Wild
Kun Yuan
Hongbo Liu
Mading Li
Muyi Sun
Ming-hui Sun
Jiachao Gong
Jinhua Hao
Chao Zhou
Yansong Tang
ViT
44
5
0
28 May 2024
WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average
Louis Fournier
Adel Nabli
Masih Aminbeidokhti
M. Pedersoli
Eugene Belilovsky
Edouard Oyallon
MoMe
FedML
36
3
0
27 May 2024
Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models
Chia-Yi Hsu
Yu-Lin Tsai
Chih-Hsun Lin
Pin-Yu Chen
Chia-Mu Yu
Chun-ying Huang
44
31
0
27 May 2024
Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling
Cristian Rodriguez-Opazo
Ehsan Abbasnejad
Damien Teney
Edison Marrese-Taylor
Hamed Damirchi
A. Hengel
VLM
25
1
0
27 May 2024
Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving
Shaoyuan Xie
Lingdong Kong
Wenwei Zhang
Jiawei Ren
Liang Pan
Kai-xiang Chen
Ziwei Liu
AAML
50
9
0
27 May 2024
Ensembling Diffusion Models via Adaptive Feature Aggregation
Cong Wang
Kuan Tian
Yonghang Guan
Jun Zhang
Zhiwei Jiang
Fei Shen
Xiao Han
29
5
0
27 May 2024
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
Mohammed Nowaz Rabbani Chowdhury
Meng Wang
K. E. Maghraoui
Naigang Wang
Pin-Yu Chen
Christopher Carothers
MoE
24
4
0
26 May 2024
CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection
Lin Zhu
Yifeng Yang
Qinying Gu
Xinbing Wang
Cheng Zhou
Nanyang Ye
VLM
22
2
0
26 May 2024
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
Akide Liu
Jing Liu
Zizheng Pan
Yefei He
Gholamreza Haffari
Bohan Zhuang
MQ
30
29
0
23 May 2024
EMR-Merging: Tuning-Free High-Performance Model Merging
Chenyu Huang
Peng Ye
Tao Chen
Tong He
Xiangyu Yue
Wanli Ouyang
MoMe
43
29
0
23 May 2024
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction
Tingchen Fu
Deng Cai
Lemao Liu
Shuming Shi
Rui Yan
MoMe
45
13
0
22 May 2024
How to train your ViT for OOD Detection
Maximilian Mueller
Matthias Hein
13
0
0
21 May 2024
Visualizing, Rethinking, and Mining the Loss Landscape of Deep Neural Networks
Xin-Chun Li
Lan Li
De-Chuan Zhan
25
2
0
21 May 2024
Exploring and Exploiting the Asymmetric Valley of Deep Neural Networks
Xin-Chun Li
Jinli Tang
Bo Zhang
Lan Li
De-Chuan Zhan
28
2
0
21 May 2024
URDFormer: A Pipeline for Constructing Articulated Simulation Environments from Real-World Images
Zoey Chen
Aaron Walsman
Marius Memmel
Kaichun Mo
Alex Fang
Karthikeya Vemuri
Alan Wu
Dieter Fox
Abhishek Gupta
AI4CE
VGen
56
24
0
19 May 2024
Learning More Generalized Experts by Merging Experts in Mixture-of-Experts
Sejik Park
FedML
CLL
MoMe
25
5
0
19 May 2024
Towards Modular LLMs by Building and Reusing a Library of LoRAs
O. Ostapenko
Zhan Su
E. Ponti
Laurent Charlin
Nicolas Le Roux
Matheus Pereira
Lucas Page-Caccia
Alessandro Sordoni
MoMe
32
30
0
18 May 2024
AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA
Weitao Feng
Wenbo Zhou
Jiyan He
Jie Zhang
Tianyi Wei
Guanlin Li
Tianwei Zhang
Weiming Zhang
Neng H. Yu
21
17
0
18 May 2024
A safety realignment framework via subspace-oriented model fusion for large language models
Xin Yi
Shunfan Zheng
Linlin Wang
Xiaoling Wang
Liang He
43
20
0
15 May 2024
Cross-Dataset Generalization For Retinal Lesions Segmentation
Clément Playout
Farida Cheriet
16
1
0
14 May 2024
The Platonic Representation Hypothesis
Minyoung Huh
Brian Cheung
Tongzhou Wang
Phillip Isola
72
109
0
13 May 2024
Zero-Shot Tokenizer Transfer
Benjamin Minixhofer
E. Ponti
Ivan Vulić
VLM
44
9
0
13 May 2024
Localizing Task Information for Improved Model Merging and Compression
Ke Wang
Nikolaos Dimitriadis
Guillermo Ortiz-Jimenez
François Fleuret
Pascal Frossard
MoMe
30
43
0
13 May 2024
Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training
Zexuan Zhong
Mengzhou Xia
Danqi Chen
Mike Lewis
MoE
49
15
0
06 May 2024
Adapting to Distribution Shift by Visual Domain Prompt Generation
Zhixiang Chi
Li Gu
Tao Zhong
Huan Liu
Yuanhao Yu
Konstantinos N Plataniotis
Yang Wang
VLM
OOD
29
7
0
05 May 2024
CNN-LSTM and Transfer Learning Models for Malware Classification based on Opcodes and API Calls
A. Bensaoud
Jugal Kalita
19
13
0
04 May 2024
Aloe: A Family of Fine-tuned Open Healthcare LLMs
Ashwin Kumar Gururajan
Enrique Lopez-Cuena
Jordi Bayarri-Planas
Adrián Tormos
Daniel Hinjos
...
Lucia Urcelay-Ganzabal
Marta Gonzalez-Mallo
Sergio Álvarez Napagao
Eduard Ayguadé-Parra
Ulises Cortés
Dario Garcia-Gasulla
ELM
LM&MA
29
12
0
03 May 2024