ResearchTrend.AI

arXiv: 2204.03044
Fusing finetuned models for better pretraining

6 April 2022
Leshem Choshen
Elad Venezian
Noam Slonim
Yoav Katz
    FedML
    AI4CE
    MoMe

Papers citing "Fusing finetuned models for better pretraining"

50 / 79 papers shown
Investigating Task Arithmetic for Zero-Shot Information Retrieval
Marco Braga
Pranav Kasela
Alessandro Raganato
G. Pasi
RALM
61
0
0
01 May 2025
From Task-Specific Models to Unified Systems: A Review of Model Merging Approaches
Wei Ruan
Tianze Yang
Y. Zhou
Tianming Liu
Jin Lu
MoMe
88
0
0
13 Mar 2025
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Mari Ashiga
Wei Jie
Fan Wu
Vardan K. Voskanyan
Fateme Dinmohammadi
P. Brookes
Jingzhi Gong
Zheng Wang
38
0
0
13 Mar 2025
SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting
Linqi Yang
Xiongwei Zhao
Qihao Sun
Ke Wang
Ao Chen
Peng Kang
3DGS
65
2
0
07 Mar 2025
LEWIS (LayEr WIse Sparsity) -- A Training Free Guided Model Merging Approach
Hetarth Chopra
Vidhi Rambhia
Vikram Adve
MoMe
60
0
0
05 Mar 2025
LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint
Qianli Ma
Dongrui Liu
Qian Chen
Linfeng Zhang
Jing Shao
MoMe
54
0
0
24 Feb 2025
Superpose Singular Features for Model Merging
Haiquan Qiu
You Wu
Quanming Yao
MoMe
38
0
0
15 Feb 2025
LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model Merging
Zehua Liu
Han Wu
Yuxuan Yao
Ruifeng She
Xiongwei Han
Tao Zhong
M. Yuan
MoMe
38
1
0
15 Feb 2025
Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
Pierre Ablin
Angelos Katharopoulos
Skyler Seto
David Grangier
MoMe
45
0
0
03 Feb 2025
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Jundong Li
52
0
0
01 Feb 2025
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedML
MoMe
54
17
0
08 Jan 2025
SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation
Runtao Liu
Chen I Chieh
Jindong Gu
Jipeng Zhang
Renjie Pi
Qifeng Chen
Philip H. S. Torr
Ashkan Khakzar
Fabio Pizzati
EGVM
99
0
0
13 Dec 2024
Enhancing Perception Capabilities of Multimodal LLMs with Training-Free Fusion
Zhuokun Chen
Jinwu Hu
Zeshuai Deng
Yufeng Wang
Bohan Zhuang
Mingkui Tan
69
0
0
02 Dec 2024
ATM: Improving Model Merging by Alternating Tuning and Merging
Luca Zhou
Daniele Solombrino
Donato Crisostomi
Maria Sofia Bucarelli
Fabrizio Silvestri
Emanuele Rodolà
MoMe
34
4
0
05 Nov 2024
Model merging with SVD to tie the Knots
George Stoica
Pratik Ramesh
B. Ecsedi
Leshem Choshen
Judy Hoffman
MoMe
18
8
0
25 Oct 2024
Tracking Universal Features Through Fine-Tuning and Model Merging
Niels Horn
Desmond Elliott
MoMe
24
0
0
16 Oct 2024
Glider: Global and Local Instruction-Driven Expert Router
Pingzhi Li
Prateek Yadav
Jaehong Yoon
Jie Peng
Yi-Lin Sung
Mohit Bansal
Tianlong Chen
MoMe
MoE
25
1
0
09 Oct 2024
QT-DoG: Quantization-aware Training for Domain Generalization
Saqib Javed
Hieu Le
Mathieu Salzmann
OOD
MQ
20
1
0
08 Oct 2024
What Matters for Model Merging at Scale?
Prateek Yadav
Tu Vu
Jonathan Lai
Alexandra Chronopoulou
Manaal Faruqui
Mohit Bansal
Tsendsuren Munkhdalai
MoMe
44
12
0
04 Oct 2024
Parameter Competition Balancing for Model Merging
Guodong Du
Junlin Lee
Jing Li
Runhua Jiang
Yifei Guo
...
Hanting Liu
S. Goh
Ho-Kin Tang
Daojing He
Min Zhang
MoMe
17
10
0
03 Oct 2024
Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
Edan Kinderman
Itay Hubara
Haggai Maron
Daniel Soudry
MoMe
45
0
0
02 Oct 2024
Realistic Evaluation of Model Merging for Compositional Generalization
Derek Tam
Yash Kant
Brian Lester
Igor Gilitschenski
Colin Raffel
MoMe
16
5
0
26 Sep 2024
Acceptable Use Policies for Foundation Models
Kevin Klyman
20
14
0
29 Aug 2024
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
T. Dao
Thuan Hoang Nguyen
T. Le
D. Vu
Khoi Nguyen
Cuong Pham
Anh Tran
DiffM
26
11
0
26 Aug 2024
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging
Anton Alexandrov
Veselin Raychev
Mark Niklas Muller
Ce Zhang
Martin Vechev
Kristina Toutanova
MoMe
CLL
KELM
25
13
0
11 Jul 2024
Foundation Model Engineering: Engineering Foundation Models Just as Engineering Software
Dezhi Ran
Mengzhou Wu
Wei Yang
Tao Xie
AI4CE
19
1
0
11 Jul 2024
Unlocking the Potential of Model Merging for Low-Resource Languages
Mingxu Tao
Chen Zhang
Quzhe Huang
Tianyao Ma
Songfang Huang
Dongyan Zhao
Yansong Feng
CLL
MoMe
20
3
0
04 Jul 2024
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead
Rickard Brüel-Gabrielsson
Jiacheng Zhu
Onkar Bhardwaj
Leshem Choshen
Kristjan Greenewald
Mikhail Yurochkin
Justin Solomon
28
5
0
17 Jun 2024
Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts
Zhaoxuan Tan
Zheyuan Liu
Meng-Long Jiang
27
19
0
15 Jun 2024
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
Benjamin Biggs
Arjun Seshadri
Yang Zou
Achin Jain
Aditya Golatkar
Yusheng Xie
Alessandro Achille
Ashwin Swaminathan
Stefano Soatto
MoMe
DiffM
20
10
0
12 Jun 2024
Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models
Chia-Yi Hsu
Yu-Lin Tsai
Chih-Hsun Lin
Pin-Yu Chen
Chia-Mu Yu
Chun-ying Huang
38
30
0
27 May 2024
Learning More Generalized Experts by Merging Experts in Mixture-of-Experts
Sejik Park
FedML
CLL
MoMe
17
5
0
19 May 2024
A Federated Learning Approach to Privacy Preserving Offensive Language Identification
Marcos Zampieri
Damith Premasiri
Tharindu Ranasinghe
FedML
14
2
0
17 Apr 2024
Lossless and Near-Lossless Compression for Foundation Models
Moshik Hershcovitch
Leshem Choshen
Andrew Wood
Ilias Enmouri
Peter Chin
S. Sundararaman
Danny Harnik
36
5
0
05 Apr 2024
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard
Shamane Siriwardhana
Malikeh Ehghaghi
Luke Meyers
Vladimir Karpukhin
Brian Benedict
Mark McQuade
Jacob Solawetz
MoMe
KELM
66
75
0
20 Mar 2024
FedFisher: Leveraging Fisher Information for One-Shot Federated Learning
Divyansh Jhunjhunwala
Shiqiang Wang
Gauri Joshi
FedML
16
6
0
19 Mar 2024
Fisher Mask Nodes for Language Model Merging
Thennal D K
Ganesh Nathan
Suchithra M S
MoMe
AI4CE
32
3
0
14 Mar 2024
Here's a Free Lunch: Sanitizing Backdoored Models with Model Merge
Ansh Arora
Xuanli He
Maximilian Mozes
Srinibas Swain
Mark Dras
Qiongkai Xu
SILM
MoMe
AAML
39
12
0
29 Feb 2024
Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy?
Nader Asadi
Mahdi Beitollahi
Yasser H. Khalil
Yinchuan Li
Guojun Zhang
Xi Chen
MoMe
25
8
0
23 Feb 2024
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic
Rishabh Bhardwaj
Do Duc Anh
Soujanya Poria
MoMe
45
35
0
19 Feb 2024
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains
Yanis Labrak
Adrien Bazoge
Emmanuel Morin
P. Gourraud
Mickael Rouvier
Richard Dufour
91
188
0
15 Feb 2024
Learning to Route Among Specialized Experts for Zero-Shot Generalization
Mohammed Muqeeth
Haokun Liu
Yufan Liu
Colin Raffel
MoMe
25
14
0
08 Feb 2024
Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification
Anirudh S. Sundar
Chao-Han Huck Yang
David M. Chan
Shalini Ghosh
Venkatesh Ravichandran
P. S. Nidadavolu
MoMe
28
8
0
22 Dec 2023
Weighted Ensemble Models Are Strong Continual Learners
Imad Eddine Marouf
Subhankar Roy
Enzo Tartaglione
Stéphane Lathuilière
CLL
24
16
0
14 Dec 2023
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
Mohammad-Javad Davari
Eugene Belilovsky
MoMe
19
54
0
11 Dec 2023
Merging by Matching Models in Task Parameter Subspaces
Derek Tam
Mohit Bansal
Colin Raffel
MoMe
13
10
0
07 Dec 2023
ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization
Prateek Yadav
Leshem Choshen
Colin Raffel
Mohit Bansal
19
12
0
22 Nov 2023
Fuse to Forget: Bias Reduction and Selective Memorization through Model Fusion
Kerem Zaman
Leshem Choshen
Shashank Srivastava
KELM
MoMe
13
10
0
13 Nov 2023
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding
Haoxiang Wang
Pavan Kumar Anasosalu Vasu
Fartash Faghri
Raviteja Vemulapalli
Mehrdad Farajtabar
Sachin Mehta
Mohammad Rastegari
Oncel Tuzel
Hadi Pouransari
VLM
9
65
0
23 Oct 2023
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
Pingzhi Li
Zhenyu (Allen) Zhang
Prateek Yadav
Yi-Lin Sung
Yu Cheng
Mohit Bansal
Tianlong Chen
MoMe
13
33
0
02 Oct 2023