Influence Maximization with Few Simulations

31 July 2019

Abstract

Influence maximization (IM) is the problem of finding a set of $s$ nodes in a network with maximum influence. We consider models such as the classic Independent Cascade (IC) model of Kempe, Kleinberg, and Tardos \cite{KKT:KDD2003}, where influence is defined as the expectation, over {\em simulations} (random sets of live edges), of the size of the reachability set of the seed nodes. In such models, the average over a set of i.i.d.\ simulations is an unbiased estimate of the influence function. A fundamental question is to bound the IM {\em sample complexity}: the number of simulations needed to determine an approximate maximizer. An upper bound of $O(s n \epsilon^{-2} \ln \frac{n}{\delta})$, where $n$ is the number of nodes in the network and $1-\delta$ is the desired confidence, applies to all models. We provide a sample complexity bound of $O(s \tau \epsilon^{-2} \ln \frac{n}{\delta})$ for a family of models that includes the IC model, where $\tau$ (generally $\tau \ll n$) is the step limit of the diffusion. Algorithmically, we show how to adaptively detect when a smaller number of simulations suffices. We also design an efficient greedy algorithm that computes a $(1-1/e-\epsilon)$-approximate maximizer from simulation averages. Influence maximization from simulation averages is practically appealing because it is robust to dependencies and modeling errors, but it was believed to be less efficient than other methods in terms of required data and computation. Our work shows that we can have both robustness and efficiency.
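To make the notion of a simulation average concrete, the following is a minimal Python sketch, not the efficient algorithm the abstract refers to. It assumes an IC-style model given as directed edges `(u, v, p)` with independent activation probabilities, draws live-edge simulations, estimates influence as the mean size of the set reachable within $\tau$ steps, and runs a naive greedy selection over those averages. All function names and the edge-list representation are illustrative assumptions.

```python
import random
from collections import deque


def sample_live_edges(edges, rng):
    """Draw one simulation: keep each directed edge (u, v, p) independently with probability p."""
    live = {}
    for u, v, p in edges:
        if rng.random() < p:
            live.setdefault(u, []).append(v)
    return live


def reach_size(live, seeds, tau):
    """Count nodes reachable from the seeds within tau steps along live edges (BFS)."""
    seen = set(seeds)
    frontier = deque((node, 0) for node in seen)
    while frontier:
        node, depth = frontier.popleft()
        if depth == tau:
            continue  # step limit reached; do not expand further
        for nxt in live.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return len(seen)


def average_influence(simulations, seeds, tau):
    """Simulation average: mean reachability-set size over the given simulations."""
    return sum(reach_size(live, seeds, tau) for live in simulations) / len(simulations)


def greedy_seeds(nodes, simulations, s, tau):
    """Naive greedy: repeatedly add the node with the largest averaged influence."""
    seeds = []
    for _ in range(s):
        candidates = [v for v in nodes if v not in seeds]
        best = max(candidates,
                   key=lambda v: average_influence(simulations, seeds + [v], tau))
        seeds.append(best)
    return seeds


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy network; probabilities are made up for illustration.
    edges = [(0, 1, 0.5), (1, 2, 0.5), (0, 3, 0.2), (3, 4, 0.9)]
    sims = [sample_live_edges(edges, rng) for _ in range(200)]
    print(greedy_seeds(range(5), sims, s=2, tau=3))
```

This naive version re-evaluates every candidate on every simulation in each round, so it only illustrates what is being estimated; the number of simulations (200 here) is an arbitrary choice, not one derived from the bounds above.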
