A framework for statistical network modeling

Statistical network models are often specified through their finite sampling distributions, which are usually chosen to accommodate known limiting behavior, such as sparsity, and to possess reasonable invariance properties, such as exchangeability. In many cases, it is mathematically impossible to specify a consistent family of finite sampling distributions with the desired invariance and limiting properties; for example, exchangeability and sparsity are incompatible. This creates tension between the logical and empirical aspects of network modeling and limits the scope of statistical inference from network data. To address these issues, we show that every statistical network model fits into a universal framework: a {\em relatively exchangeable data generating process} governs the genesis of the population network, from which a {\em sampling mechanism} generates the observed network data. Under this framework, a statistical network model is specified by a sampling mechanism in addition to its finite sampling distributions. The concept of relative exchangeability is central to the construction, as it permits models that reflect known structural properties and remain valid for inference. The sampling mechanism permits bias, e.g., preferential attachment or size-biased sampling, which is often responsible for heterogeneity in observed network structure. We illustrate the framework with explicit examples from the literature and discuss implications for inference.
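The two-stage structure described above (a population network produced by a data generating process, followed by a sampling mechanism that yields the observed data) can be sketched roughly as follows. This is only an illustrative toy, not the paper's construction: the Erdős–Rényi population, the "degree + 1" size-biased weighting, and all function names (`population_network`, `sample_vertices`, `induced_subgraph`) are assumptions introduced here.

```python
import random


def population_network(n, p, rng):
    """Toy population: an Erdos-Renyi graph on vertices 0..n-1 (assumed for illustration)."""
    adj = {v: set() for v in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def sample_vertices(adj, k, rng, size_biased):
    """Draw k distinct vertices without replacement.

    Uniform sampling gives every remaining vertex weight 1; the size-biased
    variant (an assumed scheme) gives weight degree + 1, so high-degree
    vertices are more likely to be observed.
    """
    remaining = list(adj)
    chosen = []
    for _ in range(k):
        weights = [len(adj[v]) + 1 if size_biased else 1 for v in remaining]
        pick = rng.choices(range(len(remaining)), weights=weights)[0]
        chosen.append(remaining.pop(pick))
    return chosen


def induced_subgraph(adj, vertices):
    """Observed network: the population edges among the sampled vertices."""
    keep = set(vertices)
    return {v: adj[v] & keep for v in vertices}


if __name__ == "__main__":
    rng = random.Random(0)
    population = population_network(200, 0.05, rng)
    for biased in (False, True):
        sampled = sample_vertices(population, 40, rng, size_biased=biased)
        observed = induced_subgraph(population, sampled)
        n_edges = sum(len(nbrs) for nbrs in observed.values()) // 2
        print("size-biased" if biased else "uniform", "sample:", n_edges, "edges")
```

The point of the sketch is that the same population can yield observed networks with different structure depending on the sampling mechanism, which is why the framework treats the mechanism as part of the model specification.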