Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2204.03044
Cited By
Fusing finetuned models for better pretraining
6 April 2022
Leshem Choshen
Elad Venezian
Noam Slonim
Yoav Katz
FedML
AI4CE
MoMe
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Fusing finetuned models for better pretraining"
50 / 79 papers shown
Title
Investigating Task Arithmetic for Zero-Shot Information Retrieval
Marco Braga
Pranav Kasela
Alessandro Raganato
G. Pasi
RALM
61
0
0
01 May 2025
From Task-Specific Models to Unified Systems: A Review of Model Merging Approaches
Wei Ruan
Tianze Yang
Y. Zhou
Tianming Liu
Jin Lu
MoMe
88
0
0
13 Mar 2025
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Mari Ashiga
Wei Jie
Fan Wu
Vardan K. Voskanyan
Fateme Dinmohammadi
P. Brookes
Jingzhi Gong
Zheng Wang
38
0
0
13 Mar 2025
SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting
Linqi Yang
Xiongwei Zhao
Qihao Sun
Ke Wang
Ao Chen
Peng Kang
3DGS
65
2
0
07 Mar 2025
LEWIS (LayEr WIse Sparsity) -- A Training Free Guided Model Merging Approach
Hetarth Chopra
Vidhi Rambhia
Vikram Adve
MoMe
60
0
0
05 Mar 2025
LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint
Qianli Ma
Dongrui Liu
Qian Chen
Linfeng Zhang
Jing Shao
MoMe
54
0
0
24 Feb 2025
Superpose Singular Features for Model Merging
Haiquan Qiu
You Wu
Quanming Yao
MoMe
38
0
0
15 Feb 2025
LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model Merging
Zehua Liu
Han Wu
Yuxuan Yao
Ruifeng She
Xiongwei Han
Tao Zhong
M. Yuan
MoMe
38
1
0
15 Feb 2025
Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
Pierre Ablin
Angelos Katharopoulos
Skyler Seto
David Grangier
MoMe
45
0
0
03 Feb 2025
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Jundong Li
52
0
0
01 Feb 2025
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedML
MoMe
54
17
0
08 Jan 2025
SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation
Runtao Liu
Chen I Chieh
Jindong Gu
Jipeng Zhang
Renjie Pi
Qifeng Chen
Philip H. S. Torr
Ashkan Khakzar
Fabio Pizzati
EGVM
99
0
0
13 Dec 2024
Enhancing Perception Capabilities of Multimodal LLMs with Training-Free Fusion
Zhuokun Chen
Jinwu Hu
Zeshuai Deng
Yufeng Wang
Bohan Zhuang
Mingkui Tan
69
0
0
02 Dec 2024
ATM: Improving Model Merging by Alternating Tuning and Merging
Luca Zhou
Daniele Solombrino
Donato Crisostomi
Maria Sofia Bucarelli
Fabrizio Silvestri
Emanuele Rodolà
MoMe
34
4
0
05 Nov 2024
Model merging with SVD to tie the Knots
George Stoica
Pratik Ramesh
B. Ecsedi
Leshem Choshen
Judy Hoffman
MoMe
18
8
0
25 Oct 2024
Tracking Universal Features Through Fine-Tuning and Model Merging
Niels Horn
Desmond Elliott
MoMe
24
0
0
16 Oct 2024
Glider: Global and Local Instruction-Driven Expert Router
Pingzhi Li
Prateek Yadav
Jaehong Yoon
Jie Peng
Yi-Lin Sung
Mohit Bansal
Tianlong Chen
MoMe
MoE
25
1
0
09 Oct 2024
QT-DoG: Quantization-aware Training for Domain Generalization
Saqib Javed
Hieu Le
Mathieu Salzmann
OOD
MQ
20
1
0
08 Oct 2024
What Matters for Model Merging at Scale?
Prateek Yadav
Tu Vu
Jonathan Lai
Alexandra Chronopoulou
Manaal Faruqui
Mohit Bansal
Tsendsuren Munkhdalai
MoMe
44
12
0
04 Oct 2024
Parameter Competition Balancing for Model Merging
Guodong Du
Junlin Lee
Jing Li
Runhua Jiang
Yifei Guo
...
Hanting Liu
S. Goh
Ho-Kin Tang
Daojing He
Min Zhang
MoMe
17
10
0
03 Oct 2024
Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
Edan Kinderman
Itay Hubara
Haggai Maron
Daniel Soudry
MoMe
45
0
0
02 Oct 2024
Realistic Evaluation of Model Merging for Compositional Generalization
Derek Tam
Yash Kant
Brian Lester
Igor Gilitschenski
Colin Raffel
MoMe
16
5
0
26 Sep 2024
Acceptable Use Policies for Foundation Models
Kevin Klyman
20
14
0
29 Aug 2024
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
T. Dao
Thuan Hoang Nguyen
T. Le
D. Vu
Khoi Nguyen
Cuong Pham
Anh Tran
DiffM
26
11
0
26 Aug 2024
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging
Anton Alexandrov
Veselin Raychev
Mark Niklas Muller
Ce Zhang
Martin Vechev
Kristina Toutanova
MoMe
CLL
KELM
25
13
0
11 Jul 2024
Foundation Model Engineering: Engineering Foundation Models Just as Engineering Software
Dezhi Ran
Mengzhou Wu
Wei Yang
Tao Xie
AI4CE
19
1
0
11 Jul 2024
Unlocking the Potential of Model Merging for Low-Resource Languages
Mingxu Tao
Chen Zhang
Quzhe Huang
Tianyao Ma
Songfang Huang
Dongyan Zhao
Yansong Feng
CLL
MoMe
20
3
0
04 Jul 2024
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead
Rickard Brüel-Gabrielsson
Jiacheng Zhu
Onkar Bhardwaj
Leshem Choshen
Kristjan Greenewald
Mikhail Yurochkin
Justin Solomon
28
5
0
17 Jun 2024
Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts
Zhaoxuan Tan
Zheyuan Liu
Meng-Long Jiang
27
19
0
15 Jun 2024
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
Benjamin Biggs
Arjun Seshadri
Yang Zou
Achin Jain
Aditya Golatkar
Yusheng Xie
Alessandro Achille
Ashwin Swaminathan
Stefano Soatto
MoMe
DiffM
20
10
0
12 Jun 2024
Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models
Chia-Yi Hsu
Yu-Lin Tsai
Chih-Hsun Lin
Pin-Yu Chen
Chia-Mu Yu
Chun-ying Huang
38
30
0
27 May 2024
Learning More Generalized Experts by Merging Experts in Mixture-of-Experts
Sejik Park
FedML
CLL
MoMe
17
5
0
19 May 2024
A Federated Learning Approach to Privacy Preserving Offensive Language Identification
Marcos Zampieri
Damith Premasiri
Tharindu Ranasinghe
FedML
14
2
0
17 Apr 2024
Lossless and Near-Lossless Compression for Foundation Models
Moshik Hershcovitch
Leshem Choshen
Andrew Wood
Ilias Enmouri
Peter Chin
S. Sundararaman
Danny Harnik
36
5
0
05 Apr 2024
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard
Shamane Siriwardhana
Malikeh Ehghaghi
Luke Meyers
Vladimir Karpukhin
Brian Benedict
Mark McQuade
Jacob Solawetz
MoMe
KELM
66
75
0
20 Mar 2024
FedFisher: Leveraging Fisher Information for One-Shot Federated Learning
Divyansh Jhunjhunwala
Shiqiang Wang
Gauri Joshi
FedML
16
6
0
19 Mar 2024
Fisher Mask Nodes for Language Model Merging
Thennal D K
Ganesh Nathan
Suchithra M S
MoMe
AI4CE
32
3
0
14 Mar 2024
Here's a Free Lunch: Sanitizing Backdoored Models with Model Merge
Ansh Arora
Xuanli He
Maximilian Mozes
Srinibas Swain
Mark Dras
Qiongkai Xu
SILM
MoMe
AAML
39
12
0
29 Feb 2024
Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy?
Nader Asadi
Mahdi Beitollahi
Yasser H. Khalil
Yinchuan Li
Guojun Zhang
Xi Chen
MoMe
25
8
0
23 Feb 2024
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic
Rishabh Bhardwaj
Do Duc Anh
Soujanya Poria
MoMe
45
35
0
19 Feb 2024
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains
Yanis Labrak
Adrien Bazoge
Emmanuel Morin
P. Gourraud
Mickael Rouvier
Richard Dufour
91
188
0
15 Feb 2024
Learning to Route Among Specialized Experts for Zero-Shot Generalization
Mohammed Muqeeth
Haokun Liu
Yufan Liu
Colin Raffel
MoMe
25
14
0
08 Feb 2024
Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification
Anirudh S. Sundar
Chao-Han Huck Yang
David M. Chan
Shalini Ghosh
Venkatesh Ravichandran
P. S. Nidadavolu
MoMe
28
8
0
22 Dec 2023
Weighted Ensemble Models Are Strong Continual Learners
Imad Eddine Marouf
Subhankar Roy
Enzo Tartaglione
Stéphane Lathuilière
CLL
24
16
0
14 Dec 2023
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
Mohammad-Javad Davari
Eugene Belilovsky
MoMe
19
54
0
11 Dec 2023
Merging by Matching Models in Task Parameter Subspaces
Derek Tam
Mohit Bansal
Colin Raffel
MoMe
13
10
0
07 Dec 2023
ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization
Prateek Yadav
Leshem Choshen
Colin Raffel
Mohit Bansal
19
12
0
22 Nov 2023
Fuse to Forget: Bias Reduction and Selective Memorization through Model Fusion
Kerem Zaman
Leshem Choshen
Shashank Srivastava
KELM
MoMe
13
10
0
13 Nov 2023
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding
Haoxiang Wang
Pavan Kumar Anasosalu Vasu
Fartash Faghri
Raviteja Vemulapalli
Mehrdad Farajtabar
Sachin Mehta
Mohammad Rastegari
Oncel Tuzel
Hadi Pouransari
VLM
9
65
0
23 Oct 2023
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
Pingzhi Li
Zhenyu (Allen) Zhang
Prateek Yadav
Yi-Lin Sung
Yu Cheng
Mohit Bansal
Tianlong Chen
MoMe
13
33
0
02 Oct 2023
1
2
Next