2210.11948
lo-fi: distributed fine-tuning without communication
19 October 2022
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
Papers citing
"lo-fi: distributed fine-tuning without communication"
23 / 23 papers shown
Title
FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization
Hao Chen
S. Hu
Wayne Luk
Timothy M. Hospedales
Hongxiang Fan
MoMe
67
0
0
16 Mar 2025
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging
Anton Alexandrov
Veselin Raychev
Mark Niklas Muller
Ce Zhang
Martin Vechev
Kristina Toutanova
MoMe
CLL
KELM
27
13
0
11 Jul 2024
Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation Models
Yae Jee Cho
Luyang Liu
Zheng Xu
Aldi Fahrezi
Gauri Joshi
6
45
0
12 Jan 2024
Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification
Anirudh S. Sundar
Chao-Han Huck Yang
David M. Chan
Shalini Ghosh
Venkatesh Ravichandran
P. S. Nidadavolu
MoMe
30
8
0
22 Dec 2023
CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
Kaiyan Zhang
Ning Ding
Biqing Qi
Xuekai Zhu
Xinwei Long
Bowen Zhou
38
3
0
24 Oct 2023
A Quadratic Synchronization Rule for Distributed Deep Learning
Xinran Gu
Kaifeng Lyu
Sanjeev Arora
Jingzhao Zhang
Longbo Huang
33
1
0
22 Oct 2023
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
Joel Jang
Seungone Kim
Bill Yuchen Lin
Yizhong Wang
Jack Hessel
Luke Zettlemoyer
Hannaneh Hajishirzi
Yejin Choi
Prithviraj Ammanabrolu
MoMe
26
130
0
17 Oct 2023
Robot Fleet Learning via Policy Merging
Lirui Wang
Kaiqing Zhang
Allan Zhou
Max Simchowitz
Russ Tedrake
34
4
0
02 Oct 2023
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML
MoMe
18
51
0
27 Sep 2023
Tangent Transformers for Composition, Privacy and Removal
Tian Yu Liu
Aditya Golatkar
Stefano Soatto
16
8
0
16 Jul 2023
Tangent Model Composition for Ensembling and Continual Fine-tuning
Tian Yu Liu
Stefano Soatto
LRM
MoMe
CLL
6
15
0
16 Jul 2023
Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication
A. Jaiswal
Shiwei Liu
Tianlong Chen
Ying Ding
Zhangyang Wang
GNN
31
5
0
18 Jun 2023
Soft Merging of Experts with Adaptive Routing
Mohammed Muqeeth
Haokun Liu
Colin Raffel
MoMe
MoE
17
44
0
06 Jun 2023
Scaling Expert Language Models with Unsupervised Domain Discovery
Suchin Gururangan
Margaret Li
M. Lewis
Weijia Shi
Tim Althoff
Noah A. Smith
Luke Zettlemoyer
MoE
8
46
0
24 Mar 2023
Knowledge is a Region in Weight Space for Fine-tuned Language Models
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
16
49
0
09 Feb 2023
Exploring the Benefits of Training Expert Language Models over Instruction Tuning
Joel Jang
Seungone Kim
Seonghyeon Ye
Doyoung Kim
Lajanugen Logeswaran
Moontae Lee
Kyungjae Lee
Minjoon Seo
LRM
ALM
16
79
0
07 Feb 2023
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization
Alexandre Ramé
Kartik Ahuja
Jianyu Zhang
Matthieu Cord
Léon Bottou
David Lopez-Paz
MoMe
OODD
18
80
0
20 Dec 2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
14
52
0
02 Dec 2022
Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth
J. Hayase
S. Srinivasa
MoMe
239
313
0
11 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
245
1,977
0
31 Dec 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
Large scale distributed neural network training through online distillation
Rohan Anil
Gabriel Pereyra
Alexandre Passos
Róbert Ormándi
George E. Dahl
Geoffrey E. Hinton
FedML
267
402
0
09 Apr 2018