Deep Learning Workload Scheduling in GPU Datacenters: Taxonomy, Challenges and Vision

24 May 2022
Wei Gao
Qi Hu
Zhisheng Ye
Yang Liu
Xiaolin Wang
Yingwei Luo
Tianwei Zhang
Yonggang Wen
ArXiv (abs) · PDF · HTML · GitHub (292★)

Papers citing "Deep Learning Workload Scheduling in GPU Datacenters: Taxonomy, Challenges and Vision"

6 / 6 papers shown
CARMA: Collocation-Aware Resource Manager
Ehsan Yousefzadeh-Asl-Miandoab
Reza Karimzadeh
Bulat Ibragimov
145
0
0
26 Aug 2025
Accurate GPU Memory Prediction for Deep Learning Jobs through Dynamic Analysis
Jiabo Shi
Yehia Elkhatib
3DH VLM
280
1
0
04 Apr 2025
FlexLLM: Token-Level Co-Serving of LLM Inference and Finetuning with SLO Guarantees
Xupeng Miao
Xinhao Cheng
Vineeth Kada
Mengdi Wu
...
April Yang
Yingcheng Wang
Colin Unger
Zhihao Jia
MoE
643
16
0
29 Feb 2024
Pollen: High-throughput Federated Learning Simulation via Resource-Aware Client Placement
Lorenzo Sani
Pedro Gusmão
Alexandru Iacob
Wanru Zhao
Xinchi Qiu
Yan Gao
Javier Fernandez-Marques
Nicholas D. Lane
273
1
0
30 Jun 2023
Sustainable AIGC Workload Scheduling of Geo-Distributed Data Centers: A Multi-Agent Reinforcement Learning Approach
Global Communications Conference (GLOBECOM), 2023
Siyue Zhang
Minrui Xu
Wei Yang Bryan Lim
Dusit Niyato
118
20
0
17 Apr 2023
Task Placement and Resource Allocation for Edge Machine Learning: A GNN-based Multi-Agent Reinforcement Learning Paradigm
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2023
Yihong Li
Xiaoxi Zhang
Tian Zeng
Jingpu Duan
Chuanxi Wu
Di Wu
Xu Chen
362
54
0
01 Feb 2023