Detection in the stochastic block model with multiple clusters: proof of the achievability conjectures, acyclic BP, and the information-computation gap

30 December 2015

Abstract

In a paper that initiated the modern study of the stochastic block model, Decelle et al., backed up by Mossel et al., made a fascinating conjecture: Denote by $k$ the number of balanced communities, $a/n$ the probability of connecting inside communities and $b/n$ across, and set $\mathrm{SNR}=|a-b|/\sqrt{k(a+(k-1)b)}$; for any $k \geq 2$, it is possible to detect communities efficiently whenever $\mathrm{SNR}>1$ (the KS threshold), whereas for $k\geq 5$, it is possible to detect communities information-theoretically for some $\mathrm{SNR}<1$. Massoulié, Mossel et al. and Bordenave et al. succeeded in proving that the KS threshold is efficiently achievable for $k=2$, while Mossel et al. proved that it cannot be crossed information-theoretically for $k=2$. The above conjecture remained open for $k \geq 3$. This paper proves the conjecture. For the efficient part, an acyclic belief propagation (ABP) algorithm is developed and proved to detect communities for any $k$ down to the KS threshold in time $O(n \log n)$. Achieving this requires showing optimality of BP in the presence of cycles and random initialization, a challenge in the realm of graphical models. The paper also connects ABP to a power iteration method on an $r$-nonbacktracking operator, formalizing the interplay between message passing and spectral methods. Further, it shows that the model can be learned efficiently down to the KS threshold, implying that ABP improves upon the state of the art in terms of both complexity and universality. For the information-theoretic (IT) part, a non-efficient algorithm sampling a typical clustering is shown to cross the KS threshold at $k=5$. The emerging gap is shown to be large in some cases; if $a=0$, the KS threshold reads $b \gtrsim k^2$ whereas the IT bound reads $b \gtrsim k \ln(k)$. This makes the SBM a good case study for information-computation gaps. The results extend to general SBMs.
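
To make the threshold concrete, the SNR defined in the abstract can be evaluated numerically. The following is a minimal sketch and not code from the paper; the function names `snr` and `above_ks` are illustrative assumptions. It only evaluates the formula $\mathrm{SNR}=|a-b|/\sqrt{k(a+(k-1)b)}$ and compares it with 1. For $a=0$ the condition $\mathrm{SNR}>1$ reduces to $b > k(k-1)$, which is consistent with the $b \gtrsim k^2$ scaling quoted above.

```python
import math


def snr(k: int, a: float, b: float) -> float:
    """SNR = |a - b| / sqrt(k * (a + (k - 1) * b)) for a balanced SBM
    with k communities, intra-community edge probability a/n and
    inter-community edge probability b/n (formula from the abstract)."""
    return abs(a - b) / math.sqrt(k * (a + (k - 1) * b))


def above_ks(k: int, a: float, b: float) -> bool:
    """True when the parameters lie strictly above the KS threshold (SNR > 1).
    Hypothetical helper name, introduced here for illustration only."""
    return snr(k, a, b) > 1.0


if __name__ == "__main__":
    k = 5
    # With a = 0, SNR > 1 is equivalent to b > k * (k - 1).
    b_boundary = k * (k - 1)
    for b in (b_boundary - 1, b_boundary, b_boundary + 1):
        print(f"k={k}, a=0, b={b}: SNR={snr(k, 0.0, b):.4f}, "
              f"above KS: {above_ks(k, 0.0, b)}")
```

The check says nothing about the information-theoretic bound; it only locates a parameter triple relative to the KS threshold.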
