107

GrateTile: Efficient Sparse Tensor Tiling for CNN Processing

IEEE Workshop on Signal Processing Systems (SiPS), 2020
Abstract

We propose GrateTile, an efficient, hardwarefriendly data storage scheme for sparse CNN feature maps (activations). It divides data into uneven-sized subtensors and, with small indexing overhead, stores them in a compressed yet randomly accessible format. This design enables modern CNN accelerators to fetch and decompressed sub-tensors on-the-fly in a tiled processing manner. GrateTile is suitable for architectures that favor aligned, coalesced data access, and only requires minimal changes to the overall architectural design. We simulate GrateTile with state-of-the-art CNNs and show an average of 55% DRAM bandwidth reduction while using only 0.6% of feature map size for indexing storage.

View on arXiv
Comments on this paper