arXiv:1903.00045
Cited By
Speeding up Deep Learning with Transient Servers
28 February 2019
Shijian Li
R. Walls
Lijie Xu
Tian Guo
Papers citing "Speeding up Deep Learning with Transient Servers"
3 / 3 papers shown
Title
Taming Resource Heterogeneity In Distributed ML Training With Dynamic Batching
S. Tyagi
Prateek Sharma
16
22
0
20 May 2023
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
Max Ryabinin
Tim Dettmers
Michael Diskin
Alexander Borzunov
MoE
30
31
0
27 Jan 2023
Characterizing and Modeling Distributed Training with Transient Cloud GPU Servers
Shijian Li
R. Walls
Tian Guo
23
23
0
07 Apr 2020