2109.03389
An Optimal Resource Allocator of Elastic Training for Deep Learning Jobs on Cloud
8 September 2021
Liang Hu
Jiangcheng Zhu
Zirui Zhou
Ruiqing Cheng
Xiaolong Bai
Yong Zhang
Papers citing "An Optimal Resource Allocator of Elastic Training for Deep Learning Jobs on Cloud"
3 / 3 papers shown
1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data
Calvin Tan
Jerome Wang
ALM
326
6
0
07 Aug 2024
TRANSOM: An Efficient Fault-Tolerant System for Training LLMs
Baodong Wu
Lei Xia
Qingping Li
Kangyu Li
Xu Chen
Yongqiang Guo
Tieyao Xiang
Yuheng Chen
Shigang Li
379
21
0
16 Oct 2023
End-to-end Adaptive Distributed Training on PaddlePaddle
Yulong Ao
Zhihua Wu
Dianhai Yu
Weibao Gong
Zhiqing Kui
...
Yanjun Ma
Tian Wu
Haifeng Wang
Wei Zeng
Chao Yang
294
14
0
06 Dec 2021