ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2407.10098
37
1

Accelerator-as-a-Service in Public Clouds: An Intra-Host Traffic Management View for Performance Isolation in the Wild

14 July 2024
Jiechen Zhao
Ran Shu
Katie Lim
Zewen Fan
Thomas Anderson
Mingyu Gao
Natalie Enright Jerger
    GNN
ArXivPDFHTML
Abstract

I/O devices in public clouds have integrated increasing numbers of hardware accelerators, e.g., AWS Nitro, Azure FPGA and Nvidia BlueField. However, such specialized compute (1) is not explicitly accessible to cloud users with performance guarantee, (2) cannot be leveraged simultaneously by both providers and users, unlike general-purpose compute (e.g., CPUs). Through ten observations, we present that the fundamental difficulty of democratizing accelerators is insufficient performance isolation support. The key obstacles to enforcing accelerator isolation are (1) too many unknown traffic patterns in public clouds and (2) too many possible contention sources in the datapath. In this work, instead of scheduling such complex traffic on-the-fly and augmenting isolation support on each system component, we propose to model traffic as network flows and proactively re-shape the traffic to avoid unpredictable contention. We discuss the implications of our findings on the design of future I/O management stacks and device interfaces.

View on arXiv
Comments on this paper