33
12

Influence Maximization with Few Simulations

Abstract

Influence maximization (IM) is the problem of finding a set of ss nodes in a network with maximum influence. We consider models such as the classic Independent Cascade (IC) model of Kempe, Kleinberg, and Tardos \cite{KKT:KDD2003} where influence is defined to be the expectation over {\em simulations} (random sets of live edges) of the size of the reachability set of the seed nodes. In such models the influence function is unbiasedly approximated by an average over a set of i.i.d.\ simulations. A fundamental question is to bound the IM {\em sample complexity}: The number of simulations needed to determine an approximate maximizer. An upper bound of O(snϵ2lnnδ)O( s n \epsilon^{-2} \ln \frac{n}{\delta}), where nn is the number of nodes in the network and 1δ1-\delta the desired confidence applies to all models. We provide a sample complexity bound of O(sτϵ2lnnδ)O( s \tau \epsilon^{-2} \ln \frac{n}{\delta}) for a family of models that includes the IC model, where τ\tau (generally τn\tau \ll n) is the step limit of the diffusion. Algorithmically, we show how to adaptively detect when a smaller number of simulations suffices. We also design an efficient greedy algorithm that computes a (11/eϵ)(1-1/e-\epsilon)-approximate maximizer from simulation averages. Influence maximization from simulation-averages is practically appealing as it is robust to dependencies and modeling errors but was believed to be less efficient than other methods in terms of required data and computation. Our work shows that we can have both robustness and efficiency.

View on arXiv
Comments on this paper