On the computational tractability of statistical estimation on amenable graphs

5 April 2019

Abstract

We consider the problem of estimating a vector of discrete variables $(\theta_1,\cdots,\theta_n)$, based on noisy observations $Y_{uv}$ of the pairs $(\theta_u,\theta_v)$ on the edges of a graph $G=([n],E)$. This setting comprises a broad family of statistical estimation problems, including group synchronization on graphs, community detection, and low-rank matrix estimation. A large body of theoretical work has established sharp thresholds for weak and exact recovery, and sharp characterizations of the optimal reconstruction accuracy in such models, focusing, however, on the special case of Erdős-Rényi-type random graphs. The single most important finding of this line of work is the ubiquity of an information-computation gap. Namely, for many models of interest, a large gap is found between the optimal accuracy achievable by any statistical method and the optimal accuracy achieved by known polynomial-time algorithms. This gap is robust to small amounts of additional side information revealed about the $\theta_i$'s. How does the structure of the graph $G$ affect this picture? Is the information-computation gap a general phenomenon, or does it only apply to specific families of graphs? We prove that the picture is dramatically different for graph sequences converging to transitive amenable graphs (including, for instance, $d$-dimensional grids). We consider a model in which an arbitrarily small fraction of the vertex labels is revealed to the algorithm, and show that a linear-time algorithm can achieve reconstruction accuracy that is arbitrarily close to the information-theoretic optimum. We contrast this with the case of random graphs. Indeed, focusing on group synchronization on random regular graphs, we prove that the information-computation gap persists even if a small amount of additional side information is revealed about the labels $\theta_i$.
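To make the observation model concrete, the following is a minimal sketch of one instance of the setting described above. It rests on assumptions that the abstract does not state: labels in $\{-1,+1\}$ (synchronization over $\mathbb{Z}_2$), a two-dimensional grid as the graph, and observations $Y_{uv}=\theta_u\theta_v$ flipped independently with a fixed probability. The names `grid_edges`, `sample_instance`, `flip_prob`, and `reveal_frac` are hypothetical and chosen only for illustration. The sketch generates data; it does not implement the reconstruction algorithm of the paper.

```python
import random


def grid_edges(side):
    """Edges of a side x side two-dimensional grid, vertices numbered row by row."""
    edges = []
    for r in range(side):
        for c in range(side):
            v = r * side + c
            if c + 1 < side:          # horizontal neighbour
                edges.append((v, v + 1))
            if r + 1 < side:          # vertical neighbour
                edges.append((v, v + side))
    return edges


def sample_instance(side, flip_prob, reveal_frac, seed=0):
    """Sample labels, noisy edge observations, and revealed vertex labels.

    Assumed model (illustrative only): theta_v uniform in {-1, +1};
    Y_uv = theta_u * theta_v, with its sign flipped with probability flip_prob;
    each vertex label is revealed independently with probability reveal_frac.
    """
    rng = random.Random(seed)
    n = side * side
    theta = [rng.choice((-1, 1)) for _ in range(n)]
    observations = {}
    for u, v in grid_edges(side):
        noise = -1 if rng.random() < flip_prob else 1
        observations[(u, v)] = theta[u] * theta[v] * noise
    revealed = {v: theta[v] for v in range(n) if rng.random() < reveal_frac}
    return theta, observations, revealed
```

For example, `sample_instance(side=20, flip_prob=0.2, reveal_frac=0.05)` returns the hidden labels, one noisy observation per grid edge, and a dictionary holding the labels of the revealed vertices, which is the kind of input an estimator in this setting would receive (without the hidden labels).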
