arXiv:2003.11666
Pipelined Backpropagation at Scale: Training Large Models without Batches
25 March 2020
Atli Kosson, Vitaliy Chiley, Abhinav Venigalla, Joel Hestness, Urs Koster
Papers citing "Pipelined Backpropagation at Scale: Training Large Models without Batches" (4 papers shown)
Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression
Jaeyong Song, Jinkyu Yim, Jaewon Jung, Hongsun Jang, H. Kim, Youngsok Kim, Jinho Lee
24 Jan 2023
RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network
Vitaliy Chiley, Vithursan Thangarasa, Abhay Gupta, Anshul Samar, Joel Hestness, D. DeCoste
28 Jun 2022
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan, M. Shoeybi, Jared Casper, P. LeGresley, M. Patwary, ..., Prethvi Kashinkunti, J. Bernauer, Bryan Catanzaro, Amar Phanishayee, Matei A. Zaharia
09 Apr 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
17 Sep 2019