arXiv:2208.11617

A Scalable and Energy Efficient GPU Thread Map for Standard m-Simplex Domains

24 August 2022
C. Navarro
Felipe A. Quezada
B. Bustos
N. Hitschfeld-Kahler
R. Kindelan
Abstract

This work proposes a new GPU thread map for standard $m$-simplex domains that scales its speedup with dimension and is energy efficient compared to other state-of-the-art approaches. The main contributions of this work are the formulation of the new block-space map $\mathcal{H}: \mathbb{Z}^m \mapsto \mathbb{Z}^m$, which is analyzed in terms of resource usage, and its experimental evaluation in terms of speedup over a bounding-box approach and energy efficiency, measured as elements per second per Watt. Results from the analysis show that $\mathcal{H}$ has a potential speedup of up to $2\times$ and $6\times$ for $2$-simplices and $3$-simplices, respectively. Experimental evaluation shows that $\mathcal{H}$ is competitive for $2$-simplices, reaching speedups of $1.2\times$ to $2.0\times$ across different tests, which is on par with the fastest state-of-the-art approaches. For $3$-simplices, $\mathcal{H}$ reaches speedups of $1.3\times$ to $6.0\times$, making it the fastest of all. The extension of $\mathcal{H}$ to higher-dimensional $m$-simplices is feasible and has a potential speedup that scales as $m!$, given a proper selection of the parameters $r$ and $\beta$, which are the scaling and replication factors, respectively. In terms of energy consumption, although $\mathcal{H}$ is among the highest in power consumption, it compensates with its short duration, making it one of the most energy-efficient approaches. Lastly, further improvements with Tensor Cores and Ray Tracing Cores are analyzed, giving insights into how to leverage each of them. The results obtained in this work show that $\mathcal{H}$ is a scalable and energy-efficient map that can contribute to the efficiency of GPU applications that need to process standard $m$-simplex domains, such as Cellular Automata or PDE simulations.
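The $m!$ scaling quoted in the abstract can be illustrated with a simple counting argument. The sketch below is not the map $\mathcal{H}$ itself. It only compares the number of elements a bounding-box launch would cover against the number of elements that actually belong to a discrete simplex. It assumes the domain is the set of lattice points with non-negative coordinates whose sum is below $n$. That definition and the function names are illustrative assumptions, not taken from the paper.

```python
from itertools import product
from math import comb


def simplex_size(n, m):
    """Closed-form count of points (x_1..x_m), x_i >= 0, with sum < n.

    Assumed definition of a discrete m-simplex with n elements per side.
    """
    return comb(n + m - 1, m)


def simplex_size_bruteforce(n, m):
    """Same count obtained by enumerating the n^m bounding box."""
    return sum(1 for p in product(range(n), repeat=m) if sum(p) < n)


def bounding_box_ratio(n, m):
    """Bounding-box elements divided by simplex elements.

    This is the ideal upper bound on the speedup of any map that
    launches work only for the simplex; it tends to m! as n grows.
    """
    return n**m / simplex_size(n, m)


if __name__ == "__main__":
    for m in (2, 3, 4):
        print(m, round(bounding_box_ratio(4096, m), 3))
```

For large $n$ the ratio approaches $2$ for $m = 2$ and $6$ for $m = 3$, matching the potential speedups stated in the abstract. Measured speedups are lower in practice because of the cost of evaluating the map and other overheads.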
