Betweenness Centrality is more Parallelizable than Dense Matrix
Multiplication
General infrastructure and scalable algorithms for sparse matrix multiplication enable succinct high-performance implementation of numerical methods and graph algorithms. We showcase the theoretical and practical quality of novel sparse matrix multiplication routines in Cyclops Tensor Framework (CTF) via MFBC: a Maximal Frontier Betweenness Centrality algorithm. Our sparse matrix multiplication algorithms and consequently MFBC perform asymptotically less communication than previous approaches. For graphs with vertices and average degree , we show that on processors, MFBC performs a factor of less communication than known alternatives when . If processors are needed to fit the problem in memory, all costs associated with the algorithm can be reduced by a factor of when using processors. For multiplication of square dense matrices only is achievable. We formulate and implement MFBC for weighted graphs, by leveraging specially-designed monoids and functions. We prove the correctness of the new formulation. CTF allows a parallelism-oblivious C++ implementation of MFBC to achieve good scalability for both extremely sparse and relatively dense graphs. The library automatically searches a space of distributed data decompositions and sparse matrix multiplication algorithms. The resulting code outperforms the well-known CombBLAS library by factors of up to 8 and shows more robust performance. Our design methodology is general and readily extensible to other graph problems.
View on arXiv