arXiv:2410.00688

Supercomputer 3D Digital Twin for User Focused Real-Time Monitoring

1 October 2024
William Bergeron
Matthew Hubbell
Daniel Mojica
Albert Reuther
William Arcand
David Bestor
Daniel Burrill
Chansup Byun
V. Gadepally
Michael Houle
Hayden Jananthan
Michael Jeffrey Jones
Piotr Luszczek
Peter Michaleas
Lauren Milechin
Julie Mullen
Andrew Prout
Antonio Rosa
Charles Yee
Jeremy Kepner
Abstract

Real-time supercomputing performance analysis is a critical aspect of evaluating and optimizing computational systems in a dynamic user environment. The operation of supercomputers produces vast quantities of analytic data from multiple sources and of varying types, so compiling this data in an efficient manner is critical to the process. The MIT Lincoln Laboratory Supercomputing Center has used the Unity 3D game engine for several years to create a Digital Twin of our supercomputing systems for system monitoring. Unity offers robust visualization capabilities, making it ideal for creating a sophisticated representation of the computational processes. As we scale the systems to include a diversity of resources, such as accelerators, and add more users, we need to implement new analysis tools for the monitoring system. Research workloads change continuously, as do the capabilities of Unity, which allows us to adapt our monitoring tools to scale and to incorporate features enabling efficient replay of system-wide events, user isolation, and machine-level granularity. Our system takes full advantage of the modern capabilities of the Unity engine in a way that intuitively represents the real-time workload performed on a supercomputer. Its responsive user interface, which scales efficiently with large data sets, allows HPC system engineers to quickly diagnose usage-related errors.
