An Optimal Resource Allocator of Elastic Training for Deep Learning Jobs on Cloud

8 September 2021
Liang Hu, Jiangcheng Zhu, Zirui Zhou, Ruiqing Cheng, Xiaolong Bai, Yong Zhang

Papers citing "An Optimal Resource Allocator of Elastic Training for Deep Learning Jobs on Cloud"

3 / 3 papers shown
1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data
Calvin Tan, Jerome Wang
ALM
07 Aug 2024
TRANSOM: An Efficient Fault-Tolerant System for Training LLMs
Baodong Wu, Lei Xia, Qingping Li, Kangyu Li, Xu Chen, Yongqiang Guo, Tieyao Xiang, Yuheng Chen, Shigang Li
16 Oct 2023
End-to-end Adaptive Distributed Training on PaddlePaddle
Yulong Ao, Zhihua Wu, Dianhai Yu, Weibao Gong, Zhiqing Kui, ..., Yanjun Ma, Tian Wu, Haifeng Wang, Wei Zeng, Chao Yang
06 Dec 2021