Miriam: Exploiting Elastic Kernels for Real-time Multi-DNN Inference on Edge GPU. ACM International Conference on Embedded Networked Sensor Systems (SenSys), 2023.
D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs. IEEE Transactions on Cloud Computing (IEEE TCC), 2023.
iGniter: Interference-Aware GPU Resource Provisioning for Predictable DNN Inference in the Cloud. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2022.
Batched matrix operations on distributed GPUs with application in theoretical physics. International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2022.
Boggart: Towards General-Purpose Acceleration of Retrospective Video Analytics. Symposium on Networked Systems Design and Implementation (NSDI), 2021.
Contention-Aware GPU Partitioning and Task-to-Partition Allocation for Real-Time Workloads. International Conference on Real-Time and Network Systems (RTNS), 2021.