Deep Reinforcement Agent for Scheduling in HPC

IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2021
11 February 2021
Yuping Fan
Z. Lan
T. Childers
Paul M. Rich
W. Allcock
M. Papka

Papers citing "Deep Reinforcement Agent for Scheduling in HPC"

13 / 13 papers shown
Decentralized Distributed Proximal Policy Optimization (DD-PPO) for High Performance Computing Scheduling on Multi-User Systems
Matthew Sgambati
Aleksandar Vakanski
Matthew Anderson
155
1
0
06 May 2025
A Digital Twin Framework for Liquid-cooled Supercomputers as Demonstrated at Exascale
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2024
Wesley Brewer
Matthias Maiterth
Vineet Kumar
Rafal Wojda
Sedrick Bouknight
...
Woong Shin
Scott Greenwood
David Grant
Wesley Williams
Feiyi Wang
131
16
0
07 Oct 2024
Reinforcement Learning-based Adaptive Mitigation of Uncorrected DRAM Errors in the Field
Isaac Boixaderas
Sergi Moré
Javier Bartolome
David Vicente
Petar Radojković
Paul M. Carpenter
Eduard Ayguadé
88
1
0
23 Jul 2024
A Reinforcement Learning Based Backfilling Strategy for HPC Batch Jobs
Elliot Kolker-Hicks
Di Zhang
Dong Dai
117
11
0
14 Apr 2024
Q-adaptive: A Multi-Agent Reinforcement Learning Based Routing on Dragonfly Network
Yao Kang
Xin Wang
Z. Lan
210
21
0
24 Mar 2024
MRSch: Multi-Resource Scheduling for HPC
Boyang Li
Yuping Fan
M. Dearing
Z. Lan
Paul M. Rich
W. Allcock
M. Papka
133
13
0
24 Mar 2024
Deep Back-Filling: a Split Window Technique for Deep Online Cluster Job Scheduling
Lingfei Wang
Aaron Harwood
Maria A. Rodriguez
76
5
0
18 Jan 2024
Job Scheduling in High Performance Computing
Yuping Fan
171
15
0
20 Sep 2021
Hybrid Workload Scheduling on HPC Systems
Yuping Fan
Paul M. Rich
W. Allcock
M. Papka
Z. Lan
147
23
0
12 Sep 2021
On the impact of MDP design for Reinforcement Learning agents in Resource Management
Brazilian Conference on Intelligent Systems (BRACIS), 2021
Renato Luiz de Freitas Cunha
Luiz Chaimowicz
67
3
0
07 Sep 2021
ROME: A Multi-Resource Job Scheduling Framework for Exascale HPC Systems
Yuping Fan
104
7
0
18 Aug 2021
BFTrainer: Low-Cost Training of Neural Networks on Unfillable Supercomputer Nodes
Zhengchun Liu
R. Kettimuthu
M. Papka
Ian Foster
142
3
0
22 Jun 2021
DRAS-CQSim: A Reinforcement Learning based Framework for HPC Cluster Scheduling
Yuping Fan
Z. Lan
92
19
0
16 May 2021