Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts
Neural Information Processing Systems (NeurIPS), 2020 · 10 February 2020 · arXiv:2002.04013
Max Ryabinin, Anton I. Gusev · FedML
Papers citing "Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts" (22 of 22 papers shown)
All is Not Lost: LLM Recovery without Checkpoints
Nikolay Blagoev, Oğuzhan Ersoy, Lydia Yiyu Chen · 18 Jun 2025

TAH-QUANT: Effective Activation Quantization in Pipeline Parallelism over Slow Network
Guangxin He, Yuan Cao, Yutong He, Tianyi Bai, Kun Yuan, Binhang Yuan · MQ · 02 Jun 2025

Protocol Models: Scaling Decentralized Training with Communication-Efficient Model Parallelism
Sameera Ramasinghe, Thalaiyasingam Ajanthan, Gil Avraham, Yan Zuo, Alexander Long · GNN · 02 Jun 2025

Achieving Peak Performance for Large Language Models: A Systematic Review
IEEE Access, 2024
Z. R. K. Rostam, Sándor Szénási, Gábor Kertész · 07 Sep 2024

Lower Bounds and Optimal Algorithms for Non-Smooth Convex Decentralized Optimization over Time-Varying Networks
D. Kovalev, Ekaterina Borodich, Alexander Gasnikov, Dmitrii Feoktistov · 28 May 2024

Video Relationship Detection Using Mixture of Experts
A. Shaabana, Zahra Gharaee, Paul Fieguth · 06 Mar 2024

Social Interpretable Reinforcement Learning
Leonardo Lucio Custode, Giovanni Iacca · OffRL · 27 Jan 2024

Direct Neural Machine Translation with Task-level Mixture of Experts models
Isidora Chara Tourni, Subhajit Naskar · MoE · 18 Oct 2023

Towards Open Federated Learning Platforms: Survey and Vision from Technical and Legal Perspectives
Moming Duan, Qinbin Li, Linshan Jiang, Bingsheng He · FedML · 05 Jul 2023

A Language Model of Java Methods with Train/Test Deduplication
Chia-Yi Su, Aakash Bansal, Vijayanta Jain, S. Ghanavati, Collin McMillan · SyDa, VLM · 15 May 2023

SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
International Conference on Machine Learning (ICML), 2023
Max Ryabinin, Tim Dettmers, Michael Diskin, Alexander Borzunov · MoE · 27 Jan 2023

Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization
International Conference on Machine Learning (ICML), 2022
Alexandre Ramé, Kartik Ahuja, Jianyu Zhang, Matthieu Cord, Léon Bottou, David Lopez-Paz · MoMe, OODD · 20 Dec 2022

Petals: Collaborative Inference and Fine-tuning of Large Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Alexander Borzunov, Dmitry Baranchuk, Tim Dettmers, Max Ryabinin, Younes Belkada, Artem Chumachenko, Pavel Samygin, Colin Raffel · VLM · 02 Sep 2022

Training Transformers Together
Neural Information Processing Systems (NeurIPS), 2022
Alexander Borzunov, Max Ryabinin, Tim Dettmers, Quentin Lhoest, Lucile Saulnier, Michael Diskin, Yacine Jernite, Thomas Wolf · ViT · 07 Jul 2022

Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees
Neural Information Processing Systems (NeurIPS), 2022
Jue Wang, Binhang Yuan, Luka Rimanic, Yongjun He, Tri Dao, Beidi Chen, Christopher Ré, Ce Zhang · AI4CE · 02 Jun 2022

Decentralized Training of Foundation Models in Heterogeneous Environments
Neural Information Processing Systems (NeurIPS), 2022
Binhang Yuan, Yongjun He, Jared Davis, Tianyi Zhang, Tri Dao, Beidi Chen, Abigail Z. Jacobs, Christopher Ré, Ce Zhang · 02 Jun 2022

Machines & Influence: An Information Systems Lens
Shashank Yadav · 26 Nov 2021

Mixed SIGNals: Sign Language Production via a Mixture of Motion Primitives
IEEE International Conference on Computer Vision (ICCV), 2021
Ben Saunders, Necati Cihan Camgöz, Richard Bowden · SLR · 23 Jul 2021

Secure Distributed Training at Scale
International Conference on Machine Learning (ICML), 2021
Eduard A. Gorbunov, Alexander Borzunov, Michael Diskin, Max Ryabinin · FedML · 21 Jun 2021

Distributed Deep Learning in Open Collaborations
Neural Information Processing Systems (NeurIPS), 2021
Michael Diskin, Alexey Bukhtiyarov, Max Ryabinin, Lucile Saulnier, Quentin Lhoest, ..., Denis Mazur, Ilia Kobelev, Yacine Jernite, Thomas Wolf, Gennady Pekhimenko · FedML · 18 Jun 2021

Distributed Deep Learning Using Volunteer Computing-Like Paradigm
IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2021
Medha Atre, B. Jha, Ashwini Rao · 16 Mar 2021

Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
Neural Information Processing Systems (NeurIPS), 2021
Max Ryabinin, Eduard A. Gorbunov, Vsevolod Plokhotnyuk, Gennady Pekhimenko · 04 Mar 2021