The Hidden Subgraph Problem

Abstract

We introduce a statistical model for the problem of finding a subgraph with specified topology in an otherwise random graph. This task plays an important role in the analysis of social and biological networks. In these types of networks, small subgraphs with a specific structure have important functional roles. Within our model, a single copy of a subgraph is added (`planted') in an Erd\H{o}s-R\'enyi random graph with $n$ vertices and edge probability $q_0$. We ask whether the resulting graph can be distinguished reliably from a pure Erd\H{o}s-R\'enyi random graph, and present two types of results. First, we investigate the question from a purely statistical perspective, and ask whether there is \emph{any} test that can distinguish between the two graph models. We provide necessary and sufficient conditions that are essentially tight for subgraphs of size asymptotically smaller than $n^{2/5}$. Next, we study two polynomial-time algorithms for solving the same problem: a spectral algorithm, and a semidefinite programming (SDP) relaxation. For the spectral algorithm, we establish sufficient conditions under which it distinguishes the two graph models with high probability. Under the same conditions, the spectral algorithm indeed identifies the hidden subgraph. The spectral algorithm is substantially sub-optimal with respect to the optimal test. We show that a similar gap is present for the SDP approach. This points to a large gap between statistical and computational limits for this problem.
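The two graph models compared in the abstract can be illustrated with a short sampling sketch. It is not taken from the paper: the function names `sample_er_graph` and `plant_subgraph` are hypothetical, and it assumes that planting means taking the union of the background edges with one copy of the subgraph placed on uniformly random vertices. The paper's exact definition may differ.

```python
import itertools
import random


def sample_er_graph(n, q0, rng):
    """Edge set of an Erdos-Renyi graph on vertices 0..n-1 with edge probability q0."""
    return {
        (u, v)
        for u, v in itertools.combinations(range(n), 2)
        if rng.random() < q0
    }


def plant_subgraph(n, background_edges, h_num_vertices, h_edges, rng):
    """Overlay one copy of a subgraph H on uniformly random distinct vertices.

    Assumption: planting is the union of background edges and the copy's edges.
    Returns (edges, placement), where placement[i] is the host vertex of H's vertex i.
    """
    placement = rng.sample(range(n), h_num_vertices)
    edges = set(background_edges)
    for a, b in h_edges:
        u, v = placement[a], placement[b]
        edges.add((min(u, v), max(u, v)))  # store each edge as (smaller, larger)
    return edges, placement


# Illustrative parameters only: plant a 5-clique in a graph with n = 200, q0 = 0.05.
rng = random.Random(0)
n, q0 = 200, 0.05
h_num_vertices = 5
h_edges = list(itertools.combinations(range(h_num_vertices), 2))

null_graph = sample_er_graph(n, q0, rng)            # pure random graph
planted_graph, placement = plant_subgraph(
    n, sample_er_graph(n, q0, rng), h_num_vertices, h_edges, rng
)                                                   # random graph plus planted copy
```

A detection test in the sense of the abstract would take one such edge set as input and decide which of the two models produced it.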
