Recovering a Hidden Community Beyond the Spectral Limit in $O(|E| \log^*|V|)$ Time

9 October 2015

Jiaming Xu

Abstract

The stochastic block model for one community with parameters $n$, $K$, $p$, and $q$ is considered: $K$ out of $n$ vertices are in the community; two vertices are connected by an edge with probability $p$ if they are both in the community and with probability $q$ otherwise, where $p > q > 0$ and $p/q$ is assumed to be bounded. An estimator based on observation of the graph $G=(V,E)$ is said to achieve weak recovery if the mean number of misclassified vertices is $o(K)$ as $n \to \infty$. A critical role is played by the effective signal-to-noise ratio $\lambda = K^2(p-q)^2/((n-K)q)$. In the regime $K=\Theta(n)$, a naïve degree-thresholding algorithm achieves weak recovery in $O(|E|)$ time if $\lambda \to \infty$, which coincides with the information-theoretic possibility of weak recovery. The main focus of the paper is on weak recovery in the sublinear regime $K=o(n)$ and $np = n^{o(1)}$. It is shown that weak recovery is provided by a belief propagation algorithm running for $\log^* n + O(1)$ iterations if $\lambda > 1/e$, with total time complexity $O(|E| \log^* n)$. Conversely, no local algorithm with radius $t$ of interaction satisfying $t = o\left(\frac{\log n}{\log(2+np)}\right)$ can asymptotically outperform trivial random guessing if $\lambda \leq 1/e$. By analyzing a linear message-passing algorithm that corresponds to applying power iteration to the non-backtracking matrix of the graph, we provide evidence to suggest that spectral methods fail to provide weak recovery if $\lambda \leq 1$.
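To make the two thresholds concrete, the following is a minimal Python sketch that evaluates the effective signal-to-noise ratio $\lambda = K^2(p-q)^2/((n-K)q)$ for given parameters and reports where it falls relative to $1/e$ and $1$. The function names `effective_snr` and `regime` and the label strings are illustrative assumptions, not taken from the paper. The thresholds are asymptotic statements for the sublinear regime $K=o(n)$, $np=n^{o(1)}$, so the labels for a finite $n$ are only indicative.

```python
import math


def effective_snr(n: int, K: int, p: float, q: float) -> float:
    """Effective SNR lambda = K^2 (p - q)^2 / ((n - K) q) from the abstract."""
    if not 0 < K < n:
        raise ValueError("need 0 < K < n")
    if not 0 < q < p <= 1:
        raise ValueError("need 0 < q < p <= 1")
    return K**2 * (p - q) ** 2 / ((n - K) * q)


def regime(lam: float) -> str:
    """Hypothetical labels for the thresholds stated in the abstract."""
    if lam <= 1 / math.e:
        # No local algorithm with small interaction radius beats random guessing.
        return "lambda <= 1/e: local algorithms fail"
    if lam <= 1:
        # Belief propagation succeeds; evidence suggests spectral methods fail.
        return "1/e < lambda <= 1: BP succeeds, spectral methods suggested to fail"
    # The abstract only guarantees belief propagation here.
    return "lambda > 1: BP succeeds"


if __name__ == "__main__":
    # Illustrative parameters only (not from the paper).
    lam = effective_snr(n=10_000, K=100, p=0.08, q=0.01)
    print(f"lambda = {lam:.4f}")
    print(regime(lam))
```

With these illustrative parameters, $\lambda = 49/99$, which lies strictly between $1/e$ and $1$: the window in which belief propagation is shown to achieve weak recovery while the evidence suggests that spectral methods do not.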
