Community recovery in non-binary and temporal stochastic block models

This article studies the estimation of latent community memberships from pairwise interactions in a network of nodes, where the observed interactions can be of arbitrary type, including binary, categorical, and vector-valued, and not excluding even more general objects such as time series or spatial point patterns. As a generative model for such data, we introduce a stochastic block model with a general measurable interaction space , for which we derive information-theoretic bounds for the minimum achievable error rate. These bounds yield sharp criteria for the existence of consistent and strongly consistent estimators in terms of data sparsity, statistical similarity between intra- and inter-block interaction distributions, and the shape and size of the interaction space. The general framework makes it possible to study temporal and multiplex networks with , in settings where both and , and the temporal interaction patterns are correlated over time. For temporal Markov interactions, we derive sharp consistency thresholds. We also present fast online estimation algorithms which fully utilise the non-binary nature of the observed data. Numerical experiments on synthetic and real data show that these algorithms rapidly produce accurate estimates even for very sparse data arrays.
View on arXiv