arXiv:2009.00748
TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training and Inference

1 September 2020
Mostafa Mahmoud
Isak Edo Vivancos
Ali Hadi Zadeh
Omar Mohamed Awad
Gennady Pekhimenko
Jorge Albericio
Andreas Moshovos
Abstract

TensorDash is a hardware-level technique that enables data-parallel MAC units to exploit sparsity in their input operand streams. When used to compose a hardware accelerator for deep learning, TensorDash can speed up the training process while also increasing energy efficiency. TensorDash combines a low-cost sparse input operand interconnect, comprising an 8-input multiplexer per multiplier input, with an area-efficient hardware scheduler. While the interconnect allows only a very limited set of movements per operand, the scheduler effectively extracts sparsity when it is present in the activations, weights, or gradients of neural networks. Over a wide set of models covering various applications, TensorDash accelerates the training process by 1.95× while being 1.89× more energy-efficient, and 1.6× more energy-efficient when on-chip and off-chip memory accesses are taken into account. While TensorDash works with any datatype, we demonstrate it with both single-precision floating-point and bfloat16 units.
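
The abstract describes the mechanism at a high level: each multiplier input is fed through a small (8-input) multiplexer, and a hardware scheduler promotes non-zero operands from nearby positions so that zero-valued multiplications are skipped. The Python sketch below is a minimal, purely illustrative software model of that idea; the lane count, lookahead window, and greedy promotion policy are assumptions chosen for the example and only approximate the limited movement set of the real interconnect.

import numpy as np

def schedule_sparse_operands(stream, lanes=16, lookahead=8):
    # Toy model of a sparsity-aware operand scheduler in the spirit of
    # TensorDash (illustrative only: the real hardware restricts each
    # multiplier's multiplexer to a small, fixed set of source positions,
    # which this greedy software model approximates with a lookahead window).
    idx = 0
    steps = 0
    while idx < len(stream):
        window_end = min(idx + lanes + lookahead, len(stream))
        nonzero = [i for i in range(idx, window_end) if stream[i] != 0]
        promoted = nonzero[:lanes]          # fill up to `lanes` MAC units this step
        if len(nonzero) > lanes:
            idx = promoted[-1] + 1          # leftover non-zeros wait for the next step
        else:
            idx = window_end                # window fully consumed; zeros skipped for free
        steps += 1
    return steps

# Example: a roughly 70%-sparse operand stream fed 16 values per step.
rng = np.random.default_rng(0)
vals = rng.random(4096) * (rng.random(4096) > 0.7)
dense_steps = int(np.ceil(len(vals) / 16))
sparse_steps = schedule_sparse_operands(vals, lanes=16, lookahead=8)
print(f"dense steps: {dense_steps}, sparse-aware steps: {sparse_steps}, "
      f"speedup: {dense_steps / sparse_steps:.2f}x")

Running the example shows how the achievable speedup depends on how much sparsity the limited operand movements can actually reach, which is the trade-off the paper's interconnect/scheduler pairing is designed around.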
