
Modularity and partially observed graphs

Abstract

Suppose that there is an unknown underlying graph $G$ on a large vertex set, and we can test only a proportion of the possible edges to check whether they are present in $G$. If $G$ has high modularity, is the observed graph $G'$ likely to have high modularity? We see that this is indeed the case under a mild condition, in a natural model where we test edges at random. We find that $q^*(G') \geq q^*(G)-\varepsilon$ with probability at least $1-\varepsilon$, as long as the expected number of edges in $G'$ is large enough. Similarly, $q^*(G') \leq q^*(G)+\varepsilon$ with probability at least $1-\varepsilon$, under the stronger condition that the expected average degree in $G'$ is large enough. Further, under this stronger condition, finding a good partition for $G'$ helps us to find a good partition for $G$. We also consider the vertex sampling model for partially observing the underlying graph: we find that for dense underlying graphs we may estimate the modularity by sampling a constant number of vertices and observing the corresponding induced subgraph, but this does not hold for underlying graphs with a subquadratic number of edges. Finally we deduce some related results, for example showing that under-sampling tends to lead to overestimation of modularity.
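
As an illustration only, the edge-sampling model can be sketched in a few lines of Python. The sketch assumes the standard Newman-Girvan modularity score of a fixed partition and assumes each edge of $G$ is kept independently with probability `p`; the names `modularity`, `sample_edges` and `two_cliques` are hypothetical helpers, not taken from the paper. Note that $q^*(G)$ is the maximum of this score over all partitions, so scoring one partition gives only a lower bound on $q^*$.

```python
import random
from collections import defaultdict


def modularity(edges, part):
    """Modularity score of one vertex partition (a lower bound on q*).

    edges: list of undirected (u, v) pairs without self-loops.
    part:  dict mapping every vertex to a community label.
    Returns 0.0 for an edgeless graph (a convention assumed here).
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)  # number of edges inside each community
    volume = defaultdict(int)    # sum of vertex degrees in each community
    for u, v in edges:
        volume[part[u]] += 1
        volume[part[v]] += 1
        if part[u] == part[v]:
            internal[part[u]] += 1
    # Edge contribution minus degree tax, summed over communities.
    return sum(internal[c] / m - (volume[c] / (2 * m)) ** 2 for c in volume)


def sample_edges(edges, p, rng):
    """Observed graph G': keep each edge independently with probability p."""
    return [e for e in edges if rng.random() < p]


def two_cliques(k):
    """Toy underlying graph: two k-cliques joined by a single edge."""
    left = [(i, j) for i in range(k) for j in range(i + 1, k)]
    right = [(i + k, j + k) for i, j in left]
    return left + right + [(0, k)]


k = 20
G = two_cliques(k)
part = {v: v // k for v in range(2 * k)}  # one community per clique
G_obs = sample_edges(G, 0.5, random.Random(0))
print("score on G :", modularity(G, part))
print("score on G':", modularity(G_obs, part))
```

Repeating the last lines for several values of `p` gives a rough feel for how the score of a fixed partition moves as fewer edges are observed; it does not compute $q^*$ itself, which requires maximising over partitions.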
