
Multisection in the Stochastic Block Model using Semidefinite Programming

Abstract

We consider the problem of identifying underlying community-like structures in graphs. Towards this end we study the Stochastic Block Model (SBM) on $k$ clusters: a random model on $n=km$ vertices, partitioned into $k$ equal-sized clusters, with edges sampled independently across clusters with probability $q$ and within clusters with probability $p$, where $p>q$. The goal is to recover the initial "hidden" partition of $[n]$. We study semidefinite programming (SDP) based algorithms in this context. In the regime $p = \frac{\alpha \log(m)}{m}$ and $q = \frac{\beta \log(m)}{m}$ we show that a certain natural SDP-based algorithm solves the problem of exact recovery in the $k$-community SBM, with high probability, whenever $\sqrt{\alpha} - \sqrt{\beta} > \sqrt{1}$, as long as $k=o(\log n)$. This threshold is known to be information-theoretically optimal. We also study the case $k=\Theta(\log(n))$. In this case, however, our recovery guarantees no longer match the optimal condition $\sqrt{\alpha} - \sqrt{\beta} > \sqrt{1}$, so achieving optimality in this range remains an open question.
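
As an illustration of the model described above, the following is a minimal sketch that samples a graph from the $k$-cluster SBM and checks the condition $\sqrt{\alpha} - \sqrt{\beta} > 1$. It is not the SDP algorithm studied in the paper; the function names (`sbm_probabilities`, `sample_sbm`, `meets_threshold`), the `seed` parameter, and the convention that vertex `v` belongs to cluster `v // m` are assumptions made only for this example.

```python
import math
import random


def sbm_probabilities(m, alpha, beta):
    """Return (p, q) with p = alpha*log(m)/m and q = beta*log(m)/m."""
    scale = math.log(m) / m
    return alpha * scale, beta * scale


def sample_sbm(k, m, alpha, beta, seed=0):
    """Sample an SBM graph on n = k*m vertices with k equal-sized clusters.

    Assumption for this sketch: vertex v is placed in cluster v // m.
    Returns the hidden labels and a list of undirected edges (u, v), u < v.
    """
    rng = random.Random(seed)
    n = k * m
    p, q = sbm_probabilities(m, alpha, beta)
    labels = [v // m for v in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            # Within-cluster pairs use p, cross-cluster pairs use q.
            prob = p if labels[u] == labels[v] else q
            if rng.random() < prob:
                edges.append((u, v))
    return labels, edges


def meets_threshold(alpha, beta):
    """Check the condition sqrt(alpha) - sqrt(beta) > 1 from the abstract."""
    return math.sqrt(alpha) - math.sqrt(beta) > 1.0


if __name__ == "__main__":
    labels, edges = sample_sbm(k=3, m=50, alpha=9.0, beta=1.0)
    print(len(labels), "vertices,", len(edges), "edges")
    print("threshold condition holds:", meets_threshold(9.0, 1.0))
```

For small $m$ the value $\alpha \log(m)/m$ can exceed 1, in which case this sketch simply connects every within-cluster pair; the asymptotic regime in the abstract assumes $m$ large enough that $p$ and $q$ are valid probabilities.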
