Spinner: Scalable Graph Partitioning for the Cloud

Abstract

Several organizations, such as social networks, store and routinely analyze large graphs as part of their daily operation. Such graphs are typically distributed across multiple servers, and graph partitioning is critical for efficient graph management. Existing partitioning algorithms focus on finding graph partitions with good locality, but they disregard the pragmatic challenges of integrating partitioning into large-scale graph management systems deployed in the cloud. In this paper, we aim for a solution that performs substantially better than hash partitioning, the most practical solution currently in use, while remaining nearly as practical. We propose Spinner, a scalable and adaptive graph partitioning algorithm based on label propagation. Spinner scales to massive graphs, produces partitions with locality and balance comparable to the state of the art, and efficiently adapts the partitioning upon changes. We describe our fully decentralized algorithm and its implementation in the Pregel programming model, which makes it possible to partition billion-vertex graphs. We evaluate Spinner with a variety of synthetic and real graphs and show that it can compute partitions with quality comparable to the state of the art. In fact, by integrating Spinner into the Giraph graph analytics engine, we speed up different applications by a factor of 2 relative to standard hash partitioning.
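To make the contrast between hash partitioning and label propagation concrete, the following is a minimal single-machine sketch, not the Spinner algorithm itself. It assumes integer vertex IDs, a sequential update order, and a simple per-partition vertex capacity as the balance constraint. The function names (`hash_partition`, `locality`, `refine_by_label_propagation`) and the toy graph are hypothetical and chosen only for illustration; the paper's decentralized Pregel implementation may differ in all of these respects.

```python
from collections import Counter, defaultdict


def hash_partition(vertices, k):
    """Assign each integer vertex ID to one of k partitions by modulo hashing."""
    return {v: v % k for v in vertices}


def locality(edges, labels):
    """Fraction of edges whose endpoints share a partition."""
    local = sum(1 for u, v in edges if labels[u] == labels[v])
    return local / len(edges)


def refine_by_label_propagation(adj, labels, k, capacity, max_iterations=10):
    """Illustrative balanced label propagation (an assumption, not Spinner).

    Each vertex moves to the partition holding most of its neighbors,
    provided that partition still has spare capacity.
    """
    labels = dict(labels)
    loads = Counter(labels.values())
    for _ in range(max_iterations):
        changed = False
        for v in sorted(adj):
            counts = Counter(labels[u] for u in adj[v])
            current = labels[v]
            best, best_score = current, counts.get(current, 0)
            for candidate in range(k):
                if candidate == current or loads[candidate] >= capacity:
                    continue  # skip the current label and full partitions
                if counts.get(candidate, 0) > best_score:
                    best, best_score = candidate, counts[candidate]
            if best != current:
                loads[current] -= 1
                loads[best] += 1
                labels[v] = best
                changed = True
        if not changed:
            break  # converged: no vertex wants to move
    return labels


# Toy graph (hypothetical): two 4-cliques joined by a single bridge edge.
clique_a = [(u, v) for u in range(4) for v in range(u + 1, 4)]
clique_b = [(u, v) for u in range(4, 8) for v in range(u + 1, 8)]
edges = clique_a + clique_b + [(3, 4)]

adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

k, capacity = 2, 5  # capacity allows one vertex of slack per partition
initial = hash_partition(adj.keys(), k)
refined = refine_by_label_propagation(adj, initial, k, capacity)

base_locality = locality(edges, initial)
refined_locality = locality(edges, refined)
max_load = max(Counter(refined.values()).values())
```

On this toy graph, comparing `base_locality` with `refined_locality` shows how propagating labels along edges can keep more edges inside a partition than modulo hashing does, while `max_load` stays within the chosen capacity.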
