Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning
Neural Information Processing Systems (NeurIPS), 2020
4 December 2020
Woosuk Kwon, Gyeong-In Yu, Eunji Jeong, Byung-Gon Chun

Papers citing "Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning"

17 / 17 papers shown
Poplar: Efficient Scaling of Distributed DNN Training on Heterogeneous GPU Clusters
AAAI Conference on Artificial Intelligence (AAAI), 2024
WenZheng Zhang, Yang Hu, Jing Shi, Xiaoying Bai
22 Aug 2024
Orchestrating Quantum Cloud Environments with Qonductor
Emmanouil Giortamis, Francisco Romao, Nathaniel Tornow, Dmitry Lugovoy, Pramod Bhatotia
08 Aug 2024
Compass: A Decentralized Scheduler for Latency-Sensitive ML Workflows
Yuting Yang, Andrea Merlina, Weijia Song, Tiancheng Yuan, Ken Birman, Roman Vitenberg
27 Feb 2024
A Differentiable Framework for End-to-End Learning of Hybrid Structured Compression
Moonjung Eo, Suhyun Kang, Wonjong Rhee
21 Sep 2023
ED-Batch: Efficient Automatic Batching of Dynamic Neural Networks via Learned Finite State Machines
International Conference on Machine Learning (ICML), 2023
Siyuan Chen, Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, T. Mowry
08 Feb 2023
Baechi: Fast Device Placement of Machine Learning Graphs
ACM Symposium on Cloud Computing (SoCC), 2020
Beomyeol Jeon, L. Cai, Chirag Shetty, P. Srivastava, Jintao Jiang, Xiaolan Ke, Yitao Meng, Cong Xie, Indranil Gupta
20 Jan 2023
PiPAD: Pipelined and Parallel Dynamic GNN Training on GPUs
ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPoPP), 2023
Chunyang Wang, Desen Sun, Yunru Bai
01 Jan 2023
A Fast Post-Training Pruning Framework for Transformers
Neural Information Processing Systems (NeurIPS), 2022
Woosuk Kwon, Sehoon Kim, Michael W. Mahoney, Joseph Hassoun, Kurt Keutzer, A. Gholami
29 Mar 2022
Pathways: Asynchronous Distributed Dataflow for ML
Conference on Machine Learning and Systems (MLSys), 2022
P. Barham, Aakanksha Chowdhery, J. Dean, Sanjay Ghemawat, Steven Hand, ..., Parker Schuh, Ryan Sepassi, Laurent El Shafey, C. A. Thekkath, Yonghui Wu
23 Mar 2022
Optimal channel selection with discrete QCQP
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Yeonwoo Jeong, Deokjae Lee, Gaon An, Changyong Son, Hyun Oh Song
24 Feb 2022
Terra: Imperative-Symbolic Co-Execution of Imperative Deep Learning Programs
Neural Information Processing Systems (NeurIPS), 2022
Taebum Kim, Eunji Jeong, Geonyong Kim, Yunmo Koo, Sehoon Kim, Gyeong-In Yu, Byung-Gon Chun
23 Jan 2022
Safe and Practical GPU Acceleration in TrustZone
Heejin Park, F. Lin
04 Nov 2021
Scheduling Optimization Techniques for Neural Network Training
Hyungjun Oh, Junyeol Lee, HyeongJu Kim, Jiwon Seo
03 Oct 2021
Characterizing Concurrency Mechanisms for NVIDIA GPUs under Deep Learning Workloads
Guin Gilman, R. Walls
01 Oct 2021
Multi-model Machine Learning Inference Serving with GPU Spatial Partitioning
S. Choi, Sunho Lee, Yeonjae Kim, Jongse Park, Youngjin Kwon, Jaehyuk Huh
01 Sep 2021
GPUReplay: A 50-KB GPU Stack for Client ML
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021
Heejin Park, F. Lin
04 May 2021
IOS: Inter-Operator Scheduler for CNN Acceleration
Conference on Machine Learning and Systems (MLSys), 2020
Yaoyao Ding, Ligeng Zhu, Zhihao Jia, Gennady Pekhimenko, Song Han
02 Nov 2020