ResearchTrend.AI

Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts
Max Ryabinin, Anton I. Gusev
arXiv:2002.04013 · 10 February 2020 · FedML

Papers citing "Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts"

13 papers shown

Nesterov Method for Asynchronous Pipeline Parallel Optimization
Thalaiyasingam Ajanthan, Sameera Ramasinghe, Yan Zuo, Gil Avraham, Alexander Long
02 May 2025

DiPaCo: Distributed Path Composition
Arthur Douillard, Qixuang Feng, Andrei A. Rusu, A. Kuncoro, Yani Donchev, Rachita Chhaparia, Ionel Gog, Marc'Aurelio Ranzato, Jiajun Shen, Arthur Szlam
15 Mar 2024 · MoE

Video Relationship Detection Using Mixture of Experts
A. Shaabana, Zahra Gharaee, Paul Fieguth
06 Mar 2024

Social Interpretable Reinforcement Learning
Leonardo Lucio Custode, Giovanni Iacca
27 Jan 2024 · OffRL

A Language Model of Java Methods with Train/Test Deduplication
Chia-Yi Su, Aakash Bansal, Vijayanta Jain, S. Ghanavati, Collin McMillan
15 May 2023 · SyDa, VLM

SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
Max Ryabinin, Tim Dettmers, Michael Diskin, Alexander Borzunov
27 Jan 2023 · MoE

Decentralized Training of Foundation Models in Heterogeneous Environments
Binhang Yuan, Yongjun He, Jared Davis, Tianyi Zhang, Tri Dao, Beidi Chen, Percy Liang, Christopher Ré, Ce Zhang
02 Jun 2022

Mixed SIGNals: Sign Language Production via a Mixture of Motion Primitives
Ben Saunders, Necati Cihan Camgöz, Richard Bowden
23 Jul 2021 · SLR

Distributed Deep Learning Using Volunteer Computing-Like Paradigm
Medha Atre, B. Jha, Ashwini Rao
16 Mar 2021

Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
Max Ryabinin, Eduard A. Gorbunov, Vsevolod Plokhotnyuk, Gennady Pekhimenko
04 Mar 2021

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
17 Sep 2019 · MoE

Analyzing Federated Learning through an Adversarial Lens
A. Bhagoji, Supriyo Chakraborty, Prateek Mittal, S. Calo
29 Nov 2018 · FedML