arXiv:2408.14090

Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects

26 August 2024
Daniele De Sensi
Lorenzo Pichetti
Flavio Vella
T. De Matteis
Zebin Ren
Luigi Fusco
M. Turisini
Daniele Cesarini
Kurt Lust
Animesh Trivedi
Duncan Roweth
Filippo Spiga
Salvatore Di Girolamo
Torsten Hoefler
Abstract

Multi-GPU nodes are increasingly common in the rapidly evolving landscape of exascale supercomputers. On these systems, GPUs on the same node are connected through dedicated networks, with bandwidths of up to a few terabits per second. However, gauging performance expectations and maximizing system efficiency is challenging because of the variety of technologies, design options, and software layers involved. This paper comprehensively characterizes three supercomputers (Alps, Leonardo, and LUMI), each with a unique architecture and design. We evaluate the performance of intra-node and inter-node interconnects on up to 4096 GPUs, using a mix of intra-node and inter-node benchmarks. By analyzing the limitations and opportunities of these interconnects, we aim to offer practical guidance to researchers, system architects, and software developers dealing with multi-GPU supercomputing. Our results show that there is untapped bandwidth and that many opportunities for optimization remain, ranging from the network to the software.
