2112.10065
Efficient Strong Scaling Through Burst Parallel Training
19 December 2021
S. Park, Joshua Fried, Sunghyun Kim, Mohammad Alizadeh, Adam Belay
GNN
LRM
Papers citing "Efficient Strong Scaling Through Burst Parallel Training"
1 / 1 papers shown
Title
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE
245
1,817
0
17 Sep 2019