An Auto-tuning Method for Run-time Data Transformation for Sparse Matrix-Vector Multiplication

Abstract

In this paper, we study run-time sparse matrix data transformation from Compressed Row Storage (CRS) to the Coordinate (COO) format and the ELL (ELLPACK/ITPACK) format with OpenMP parallelization for sparse matrix-vector multiplication (SpMV). We propose an auto-tuning (AT) method that uses the $D_{mat}^i$-$R_{ell}^i$ graph, which plots the deviation/average of the number of non-zero elements per row ($D_{mat}^i$) against the ratio of SpMV speedup to the transformation time from CRS to ELL ($R_{ell}^i$). The experimental results show that the ELL format is very effective on the Earth Simulator 2: a speedup factor of 151 is obtained with the ELL-Row inner-parallelized format. The transformation overhead is also very small, ranging from 0.01 to 1.0 times the SpMV time with the CRS format. In addition, the $D_{mat}^i$-$R_{ell}^i$ graph can be used to model the effectiveness of the transformation according to the $D_{mat}^i$ value.
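
To make the transformation concrete, the following is a minimal serial sketch of a CRS-to-ELL conversion, SpMV in both formats, and a per-matrix spread statistic. It is not the authors' implementation: the function names (`crs_to_ell`, `spmv_crs`, `spmv_ell`, `d_mat`) are hypothetical, the OpenMP parallelization is omitted, and `d_mat` assumes that the deviation/average quantity means the population standard deviation of the per-row non-zero counts divided by their mean.

```python
from statistics import mean, pstdev


def crs_to_ell(row_ptr, col_idx, vals):
    """Convert CRS arrays into zero-padded ELL arrays (one list per row)."""
    n = len(row_ptr) - 1
    # ELL width is the largest number of non-zeros found in any row.
    width = max((row_ptr[i + 1] - row_ptr[i] for i in range(n)), default=0)
    ell_vals = [[0.0] * width for _ in range(n)]
    ell_cols = [[0] * width for _ in range(n)]  # padding points at column 0
    for i in range(n):
        for k, j in enumerate(range(row_ptr[i], row_ptr[i + 1])):
            ell_vals[i][k] = vals[j]
            ell_cols[i][k] = col_idx[j]
    return ell_vals, ell_cols


def spmv_crs(row_ptr, col_idx, vals, x):
    """y = A x with A stored in CRS."""
    return [
        sum(vals[j] * x[col_idx[j]] for j in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(len(row_ptr) - 1)
    ]


def spmv_ell(ell_vals, ell_cols, x):
    """y = A x with A stored in ELL; padded entries contribute 0.0 * x[0]."""
    return [
        sum(v * x[c] for v, c in zip(row_v, row_c))
        for row_v, row_c in zip(ell_vals, ell_cols)
    ]


def d_mat(row_ptr):
    """Assumed definition: std. deviation / mean of non-zeros per row."""
    counts = [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]
    return pstdev(counts) / mean(counts)


# 3x3 example: [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

ell_vals, ell_cols = crs_to_ell(row_ptr, col_idx, vals)
y_crs = spmv_crs(row_ptr, col_idx, vals, x)
y_ell = spmv_ell(ell_vals, ell_cols, x)
spread = d_mat(row_ptr)
```

In this sketch every row is padded to the length of the longest row, so a matrix whose row lengths vary widely (a large `d_mat`) stores and multiplies many padding zeros in ELL, while a matrix with nearly uniform rows pads little. The sketch assumes a finite input vector and a matrix with at least one non-zero element.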
