2401.07788
Activations and Gradients Compression for Model-Parallel Training
15 January 2024
Mikhail Rudakov
Aleksandr Beznosikov
Yaroslav Kholodov
Alexander Gasnikov
Papers citing "Activations and Gradients Compression for Model-Parallel Training"
2 / 2 papers shown
Title
Two-level overlapping additive Schwarz preconditioner for training scientific machine learning applications
Youngkyu Lee
Alena Kopanicáková
George Karniadakis
AI4CE
41
0
0
16 Jun 2024
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,817
0
17 Sep 2019