Fast Fault Tolerant Rumor Spreading with Minimum Message Complexity

We study fault-tolerant rumor spreading algorithms in the complete graph topology. Our focus is on algorithms that use minimum communication in both a global and a local sense: they establish the minimum possible number of inter-processor connections in total, and in each round each processor is involved in at most one connection. The challenge is to design such algorithms so that they have an asymptotically optimal, that is, logarithmic, time complexity even in the presence of failed nodes. We first show, via coupling with an intermediate failure model, that if nodes are crashed not adversarially, but independently at random with constant probability less than one, then already the basic GP algorithm of Gasieniec and Pelc (Parallel Computing 22:903--912, 1996) has, with high probability, an asymptotically optimal time complexity. This improves significantly over the worst-case guarantee of given there for crashed nodes. We then show that by adding randomization to the algorithm, these time and communication complexities can also be maintained against adversarial failures. This is easily achieved by running the GP algorithm with randomly permuted node labels, at the price, however, of increasing the communication overhead to an average bits per message, with a few messages requiring bits. To eliminate this overhead, we show that the random permutation can be chosen from a set of only permutations, for an arbitrary function . Consequently, the permutation can be communicated by adding bits to each message, which is an acceptable overhead produced by many communication protocols, including the GP algorithm. Naturally, this requires all processors to know this set of permutations, which needs space at each processor.
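The idea of drawing the permutation from a small set known to all processors, and sending only its index, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the functions make_permutation_family, index_bits, relabel and inverse are hypothetical, and the family is built by plain seeded shuffling, which is not the construction analyzed in the paper and carries none of its guarantees. The sketch only shows the bookkeeping: every processor holds the same family, and naming one member of a family of size k costs about log2(k) bits per message.

```python
import random


def make_permutation_family(n, family_size, seed):
    """Build family_size permutations of range(n) from a shared seed.

    Hypothetical construction for illustration only: every processor that
    runs this with the same arguments obtains the same family.
    """
    rng = random.Random(seed)
    family = []
    for _ in range(family_size):
        perm = list(range(n))
        rng.shuffle(perm)
        family.append(perm)
    return family


def index_bits(family_size):
    """Number of bits needed to name one member of the family."""
    return (family_size - 1).bit_length()


def relabel(perm, node):
    """Label of `node` under the chosen permutation."""
    return perm[node]


def inverse(perm):
    """Inverse permutation, mapping permuted labels back to original ones."""
    inv = [0] * len(perm)
    for original, new in enumerate(perm):
        inv[new] = original
    return inv


if __name__ == "__main__":
    n, family_size = 8, 16
    family = make_permutation_family(n, family_size, seed=2024)
    chosen = random.randrange(family_size)  # only this index travels with a message
    print("index sent:", chosen, "using", index_bits(family_size), "bits")
    print("relabeled nodes:", [relabel(family[chosen], v) for v in range(n)])
```

Storing the family explicitly, as this sketch does, is what makes the per-processor space requirement visible: each processor keeps family_size lists of n labels so that a received index can be turned back into a full relabeling.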