202

Sparse Matrix Multiplication in the Congested Clique model

Abstract

We show how to multiply two n×nn \times n matrices SS and TT over semirings in the \clique model, where nn nodes communicate in a fully connected synchronous network using O(logn)O(\log{n})-bit messages, within O(nz(S)1/3nz(T)1/3/n)O(nz(S)^{1/3} nz(T)^{1/3}/n) rounds of communication, where nz(S)nz(S) and nz(T)nz(T) denote the number of non-zero elements in SS and TT, respectively. By leveraging the sparsity of the input matrices, our algorithm greatly reduces communication costs compared with general multiplication algorithms [Censor-Hillel et al., PODC 2015], and thus improves upon the state-of-the-art for matrices with o(n2)o(n^2) non-zero elements. Moreover, our algorithm exhibits the additional strength of surpassing previous solutions also in the case where only one of the two matrices is such. Particularly, this allows to efficiently raise a sparse matrix to a power greater than 2. Our algorithmic contribution is a new \emph{deterministic} method of restructuring the input matrices in a sparsity-aware manner, which assigns each node with element-wise multiplication tasks that are not necessarily consecutive but guarantee a balanced element distribution, providing for communication-efficient multiplication. As applications, we show how to speed up the computation on non-dense graphs of triangle- and 44-cycle counting, as well as of all-pairs-shortest-paths. Our triangle-counting algorithm completes in O(m2/3/n)O(m^{2/3}/n) rounds, which is a \emph{cubic} improvement over the previously known O(m2/n3)O(m^2/n^3)-round algorithm [Dolev et al., DISC 2012].

View on arXiv
Comments on this paper