Communication Efficient Algorithms for Top-k Selection Problems

Abstract
We present scalable parallel algorithms with sublinear communication volume and low latency for several fundamental problems related to finding the most relevant elements in a set: the classical selection problem with unsorted input, its variant with locally sorted input, bulk parallel priority queues, multicriteria selection using threshold algorithms, and finding the most frequent objects. All of these algorithms push the owner-computes rule to extremes. The output of these algorithms might unavoidably be unevenly distributed over the processors. We therefore also explain how to redistribute this data with minimal communication.
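As a rough illustration of the unsorted selection problem named above, the following sketch simulates several processors as plain Python lists and finds the k smallest elements while exchanging only pivots and counts, never the data itself. It is a simple randomized pivot search written for this illustration, not the algorithm of the paper; the function names `distributed_select` and `distributed_top_k`, the use of "smallest" as the relevance order, and the single-process simulation are all assumptions.

```python
import random


def distributed_select(parts, k, seed=0):
    """Return the k-th smallest element (1-indexed) over all lists in `parts`.

    Each list stands for the local data of one simulated processor.
    Per round, only one pivot value and a few counts would cross
    processor boundaries; the elements themselves stay with their owner.
    """
    rng = random.Random(seed)
    total = sum(len(p) for p in parts)
    if not 1 <= k <= total:
        raise ValueError("k must satisfy 1 <= k <= total number of elements")

    active = [list(p) for p in parts]  # local candidate sets
    while True:
        sizes = [len(a) for a in active]  # one integer per processor
        # Pick a pivot uniformly at random among all remaining candidates.
        owner = rng.choices(range(len(active)), weights=sizes)[0]
        pivot = rng.choice(active[owner])

        # Each processor counts locally; only the sums are combined.
        less = sum(sum(1 for x in a if x < pivot) for a in active)
        equal = sum(sum(1 for x in a if x == pivot) for a in active)

        if k <= less:
            active = [[x for x in a if x < pivot] for a in active]
        elif k <= less + equal:
            return pivot
        else:
            k -= less + equal
            active = [[x for x in a if x > pivot] for a in active]


def distributed_top_k(parts, k, seed=0):
    """Return, per processor, its share of the k smallest elements."""
    threshold = distributed_select(parts, k, seed)
    out = [[x for x in p if x < threshold] for p in parts]
    # Hand out just enough copies of the threshold value to reach k.
    missing = k - sum(len(o) for o in out)
    for p, o in zip(parts, out):
        take = min(missing, p.count(threshold))
        o.extend([threshold] * take)
        missing -= take
    return out


if __name__ == "__main__":
    data = [[9, 1, 7], [3, 3, 8, 2], [], [5, 4, 6, 0]]
    print(distributed_top_k(data, 4))
```

In this sketch the selected elements remain on the processor that owns them, so the per-processor result lists can have very different lengths. This is the kind of uneven output distribution the abstract refers to.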