Local community detection by seed expansion: from conductance to weighted kernel 1-mean optimization

Abstract

In local community detection by seed expansion, a single cluster concentrated around a few given query nodes in a graph is discovered in a localized way. Conductance is a popular objective function used in many algorithms for local community detection. Algorithms that directly optimize conductance usually add or remove one node at a time to find a local optimum. This amounts to fixing a specific neighborhood structure over clusters. A natural way to avoid choosing a specific neighborhood structure is to use a continuous relaxation of conductance, which this paper studies. We show that in this setting continuous optimization leads to hard clusters. We investigate the relation of conductance to weighted kernel k-means for a single cluster, which leads to the introduction of a weighted kernel 1-mean objective function, called σ-conductance, where σ is a parameter that influences the size of the community. Conductance is obtained by setting σ to 0. We develop two algorithms for local optimization of σ-conductance, called EMc and PGDc, based on expectation maximization and projected gradient descent, respectively. We show that for σ = 0, EMc corresponds to gradient descent with an infinite step size at each iteration. We design a procedure to automatically select a value for σ. We prove a performance guarantee for these algorithms on a class of dense communities centered around the seeds and well separated from the rest of the network. On this class we also prove that our algorithms stay localized. A comparative experimental analysis on networks with ground-truth communities compares our algorithms with state-of-the-art algorithms based on the graph diffusion method. Our experiments indicate that EMc and PGDc stay localized and produce communities most similar to the ground truth.
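
As a rough illustration of the discrete baseline the abstract contrasts itself with, the sketch below computes conductance for a node set and grows a cluster from seed nodes by adding one node at a time. It is not the paper's EMc or PGDc. It assumes an unweighted, undirected graph stored as a dictionary of neighbor sets, uses the common definition of conductance as cut size divided by the smaller of the two volumes, and only adds nodes (it never removes them). The names `conductance` and `greedy_expand` are hypothetical.

```python
def conductance(adj, cluster):
    """Cut size divided by the smaller volume (assumed standard definition).

    adj maps each node to the set of its neighbors (undirected, unweighted).
    """
    cluster = set(cluster)
    vol_in = sum(len(adj[u]) for u in cluster)
    vol_total = sum(len(nbrs) for nbrs in adj.values())
    cut = sum(1 for u in cluster for v in adj[u] if v not in cluster)
    denom = min(vol_in, vol_total - vol_in)
    return cut / denom if denom > 0 else 1.0


def greedy_expand(adj, seeds):
    """Add the single neighboring node that lowers conductance most, until none does."""
    cluster = set(seeds)
    best = conductance(adj, cluster)
    improved = True
    while improved:
        improved = False
        best_node = None
        # Candidate moves: nodes adjacent to the cluster but not yet in it.
        frontier = {v for u in cluster for v in adj[u]} - cluster
        for v in sorted(frontier):
            score = conductance(adj, cluster | {v})
            if score < best:
                best, best_node = score, v
                improved = True
        if improved:
            cluster.add(best_node)
    return cluster, best


# Toy graph (an assumption for illustration): two triangles joined by one edge.
toy_graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
found_cluster, found_score = greedy_expand(toy_graph, seeds=[0])
```

The set of candidate moves considered in each step (here, single-node additions from the frontier) is the fixed neighborhood structure over clusters that a continuous relaxation is meant to avoid.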
