2204.03044
Cited By
Fusing finetuned models for better pretraining
6 April 2022
Leshem Choshen
Elad Venezian
Noam Slonim
Yoav Katz
FedML
AI4CE
MoMe
Papers citing "Fusing finetuned models for better pretraining"
29 / 79 papers shown
Title
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML
MoMe
15
51
0
27 Sep 2023
Cordyceps@LT-EDI: Patching Language-Specific Homophobia/Transphobia Classifiers with a Multilingual Understanding
Dean Ninalga
17
2
0
24 Sep 2023
UnIVAL: Unified Model for Image, Video, Audio and Language Tasks
Mustafa Shukor
Corentin Dancette
Alexandre Ramé
Matthieu Cord
MoMe
MLLM
27
42
0
30 Jul 2023
Can Model Fusing Help Transformers in Long Document Classification? An Empirical Study
Damith Premasiri
Tharindu Ranasinghe
R. Mitkov
VLM
16
1
0
18 Jul 2023
Tangent Transformers for Composition, Privacy and Removal
Tian Yu Liu
Aditya Golatkar
Stefano Soatto
16
8
0
16 Jul 2023
Tangent Model Composition for Ensembling and Continual Fine-tuning
Tian Yu Liu
Stefano Soatto
LRM
MoMe
CLL
6
15
0
16 Jul 2023
Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging
Max Zimmer
Christoph Spiegel
S. Pokutta
MoMe
28
14
0
29 Jun 2023
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
A. Jaiswal
Shiwei Liu
Tianlong Chen
Ying Ding
Zhangyang Wang
VLM
32
22
0
18 Jun 2023
Git-Theta: A Git Extension for Collaborative Development of Machine Learning Models
Nikhil Kandpal
Brian Lester
Mohammed Muqeeth
Anisha Mascarenhas
Monty Evans
Vishal Baskaran
Tenghao Huang
Haokun Liu
Colin Raffel
VLM
14
10
0
07 Jun 2023
Soft Merging of Experts with Adaptive Routing
Mohammed Muqeeth
Haokun Liu
Colin Raffel
MoMe
MoE
17
44
0
06 Jun 2023
TIES-Merging: Resolving Interference When Merging Models
Prateek Yadav
Derek Tam
Leshem Choshen
Colin Raffel
Mohit Bansal
MoMe
14
244
0
02 Jun 2023
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models
Guillermo Ortiz-Jiménez
Alessandro Favero
P. Frossard
MoMe
30
103
0
22 May 2023
Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks
Alon Jacovi
Avi Caciularu
Omer Goldman
Yoav Goldberg
15
95
0
17 May 2023
ZipIt! Merging Models from Different Tasks without Training
George Stoica
Daniel Bolya
J. Bjorner
Pratik Ramesh
Taylor N. Hearn
Judy Hoffman
VLM
MoMe
36
109
0
04 May 2023
Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies
Daniel Lawson
A. H. Qureshi
MoMe
OffRL
6
13
0
14 Mar 2023
Towards Zero-Shot Functional Compositionality of Language Models
Hangyeol Yu
Myeongho Jeong
Jamin Shin
Hyeongdon Moon
Juneyoung Park
Seungtaek Choi
20
1
0
06 Mar 2023
Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?
Ruisi Cai
Zhenyu (Allen) Zhang
Zhangyang Wang
AAML
OOD
20
12
0
24 Feb 2023
Knowledge is a Region in Weight Space for Fine-tuned Language Models
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
16
49
0
09 Feb 2023
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization
Alexandre Ramé
Kartik Ahuja
Jianyu Zhang
Matthieu Cord
Léon Bottou
David Lopez-Paz
MoMe
OODD
18
80
0
20 Dec 2022
Dataless Knowledge Fusion by Merging Weights of Language Models
Xisen Jin
Xiang Ren
Daniel Preotiuc-Pietro
Pengxiang Cheng
FedML
MoMe
13
210
0
19 Dec 2022
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
28
421
0
08 Dec 2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
14
52
0
02 Dec 2022
Where to start? Analyzing the potential value of intermediate models
Leshem Choshen
Elad Venezian
Shachar Don-Yehiya
Noam Slonim
Yoav Katz
MoMe
17
27
0
31 Oct 2022
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
13
24
0
19 Oct 2022
Patching open-vocabulary models by interpolating weights
Gabriel Ilharco
Mitchell Wortsman
S. Gadre
Shuran Song
Hannaneh Hajishirzi
Simon Kornblith
Ali Farhadi
Ludwig Schmidt
VLM
KELM
14
166
0
10 Aug 2022
Diverse Weight Averaging for Out-of-Distribution Generalization
Alexandre Ramé
Matthieu Kirchmeyer
Thibaud Rahier
A. Rakotomamonjy
Patrick Gallinari
Matthieu Cord
OOD
186
128
0
19 May 2022
On Neurons Invariant to Sentence Structural Changes in Neural Machine Translation
Gal Patel
Leshem Choshen
Omri Abend
20
2
0
06 Oct 2021
Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes
James Lucas
Juhan Bae
Michael Ruogu Zhang
Stanislav Fort
R. Zemel
Roger C. Grosse
MoMe
146
28
0
22 Apr 2021
e-SNLI: Natural Language Inference with Natural Language Explanations
Oana-Maria Camburu
Tim Rocktäschel
Thomas Lukasiewicz
Phil Blunsom
LRM
252
618
0
04 Dec 2018