Local community detection by seed expansion: from conductance to weighted kernel 1-mean optimization

In local community detection by seed expansion, a single cluster concentrated around a few given query nodes in a graph is discovered in a localized way. Conductance is a popular objective function used in many algorithms for local community detection. Algorithms that directly optimize conductance usually add or remove one node at a time to find a local optimum. This amounts to fixing a specific neighborhood structure over clusters. A natural way to avoid choosing a specific neighborhood structure is to use a continuous relaxation of conductance, which this paper studies. We show that in this setting continuous optimization leads to hard clusters. We investigate the relation of conductance to weighted kernel k-means for a single cluster, which leads to the introduction of a weighted kernel 1-mean objective function, called $\sigma$-conductance, where $\sigma$ is a parameter that influences the size of the community. Conductance is obtained by setting $\sigma$ to 0. Two algorithms for local optimization of $\sigma$-conductance are developed, called EMc and PGDc, based on expectation maximization and projected gradient descent, respectively. We show that for $\sigma = 0$, EMc corresponds to gradient descent with an infinite step size at each iteration. We design a procedure to automatically select a value for $\sigma$. A performance guarantee for these algorithms is proven for a class of dense communities centered around the seeds and well separated from the rest of the network. On this class we also prove that our algorithms stay localized. A comparative experimental analysis on networks with ground-truth communities is performed against state-of-the-art algorithms based on the graph diffusion method. Our experiments indicate that EMc and PGDc stay localized and produce communities most similar to the ground truth.
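
To make the discrete baseline mentioned in the abstract concrete, the following is a minimal sketch of conductance and of a one-node-at-a-time local search around a seed set. It is not the EMc or PGDc algorithm from the paper. It assumes the common textbook definition of conductance, cut(S) / min(vol(S), vol(V \ S)), on an unweighted undirected graph; the paper's exact definition may differ. The names `conductance`, `greedy_expand`, and `TWO_TRIANGLES` are hypothetical and introduced here only for illustration.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

# Assumed representation: undirected, unweighted graph as an adjacency dict.
Graph = Dict[Hashable, Set[Hashable]]


def conductance(graph: Graph, cluster: Iterable[Hashable]) -> float:
    """Textbook conductance: cut(S) / min(vol(S), vol(V \\ S))."""
    members = set(cluster)
    vol_in = sum(len(graph[u]) for u in members)
    vol_total = sum(len(nbrs) for nbrs in graph.values())
    denom = min(vol_in, vol_total - vol_in)
    if denom == 0:
        return 1.0  # convention for empty or all-covering clusters
    cut = sum(1 for u in members for v in graph[u] if v not in members)
    return cut / denom


def greedy_expand(graph: Graph, seeds: Iterable[Hashable]) -> Tuple[Set[Hashable], float]:
    """Add or remove one node at a time until no single move lowers conductance."""
    seed_set = set(seeds)
    cluster = set(seed_set)
    best = conductance(graph, cluster)
    while True:
        # Neighborhood structure: add a frontier node, or drop a non-seed member.
        frontier = {v for u in cluster for v in graph[u]} - cluster
        moves = [cluster | {v} for v in frontier]
        moves += [cluster - {u} for u in cluster if u not in seed_set]
        best_move = None
        for candidate in moves:
            score = conductance(graph, candidate)
            if score < best - 1e-12:  # accept strict improvements only
                best, best_move = score, candidate
        if best_move is None:
            return cluster, best
        cluster = best_move


# Toy graph (illustrative only): two triangles joined by the edge 2-3.
TWO_TRIANGLES: Graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

if __name__ == "__main__":
    print(greedy_expand(TWO_TRIANGLES, {0}))
```

On this toy graph, starting from seed node 0, the search stops at the triangle {0, 1, 2}, whose conductance is 1/7. The set of allowed moves in `greedy_expand` is exactly the fixed neighborhood structure that the abstract says a continuous relaxation avoids.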