
Betweenness Centrality is more Parallelizable than Dense Matrix Multiplication

Abstract

General infrastructure and scalable algorithms for sparse matrix multiplication enable succinct high-performance implementation of numerical methods and graph algorithms. We showcase the theoretical and practical quality of novel sparse matrix multiplication routines in Cyclops Tensor Framework (CTF) via MFBC: a Maximal Frontier Betweenness Centrality algorithm. Our sparse matrix multiplication algorithms, and consequently MFBC, perform asymptotically less communication than previous approaches. For graphs with $n$ vertices and average degree $k$, we show that on $p$ processors, MFBC performs a factor of $p^{1/3}$ less communication than known alternatives when $k = n/p^{2/3}$. If $p$ processors are needed to fit the problem in memory, all costs associated with the algorithm can be reduced by a factor of $s = (n/k)\sqrt{p}$ when using $sp$ processors. For multiplication of square dense matrices, only $s = \sqrt{p}$ is achievable. We formulate and implement MFBC for weighted graphs by leveraging specially-designed monoids and functions, and we prove the correctness of the new formulation. CTF allows a parallelism-oblivious C++ implementation of MFBC to achieve good scalability for both extremely sparse and relatively dense graphs. The library automatically searches a space of distributed data decompositions and sparse matrix multiplication algorithms. The resulting code outperforms the well-known CombBLAS library by factors of up to 8 and shows more robust performance. Our design methodology is general and readily extensible to other graph problems.
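To illustrate the general idea of computing betweenness centrality through frontier-by-frontier matrix products, the sketch below implements a Brandes-style algorithm for unweighted graphs in NumPy. This is a simplified, single-node, dense illustration of the algebraic formulation, not the paper's MFBC algorithm or CTF code: MFBC handles weighted graphs with custom monoids and runs on distributed sparse matrices. All names here (`betweenness`, the dense adjacency representation) are illustrative assumptions.

```python
import numpy as np

def betweenness(A):
    """Unweighted betweenness centrality (unnormalized, directed-pair
    counting) via level-synchronous frontier expansion expressed as
    matrix-vector products. A is a dense 0/1 adjacency matrix; pass a
    symmetric matrix for an undirected graph."""
    n = A.shape[0]
    bc = np.zeros(n)
    for s in range(n):
        # Forward phase: multi-level BFS, counting shortest paths.
        sigma = np.zeros(n); sigma[s] = 1.0      # shortest-path counts
        depth = np.full(n, -1); depth[s] = 0     # BFS level of each vertex
        frontier = sigma.copy()
        levels = [np.array([s])]
        d = 0
        while True:
            nxt = A.T @ frontier                 # path counts reaching neighbors
            nxt[depth >= 0] = 0.0                # keep only unvisited vertices
            idx = np.flatnonzero(nxt)
            if idx.size == 0:
                break
            d += 1
            depth[idx] = d
            sigma[idx] = nxt[idx]
            levels.append(idx)
            frontier = nxt
        # Backward phase: accumulate dependencies level by level.
        delta = np.zeros(n)
        for lvl in reversed(range(1, len(levels))):
            w = np.zeros(n)
            idx = levels[lvl]
            w[idx] = (1.0 + delta[idx]) / sigma[idx]
            t = A @ w                            # push dependencies to predecessors
            prev = levels[lvl - 1]
            delta[prev] += sigma[prev] * t[prev]
        delta[s] = 0.0                           # the source accrues no centrality
        bc += delta
    return bc
```

Each BFS level is one matrix-vector product, so on a distributed sparse matrix the same structure becomes a sequence of sparse matrix multiplications over frontiers, which is the pattern the paper's communication analysis targets.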
