Impact of regularization on Spectral Clustering

Information Theory and Applications Workshop (ITA), 2013
Abstract

The performance of spectral clustering is considerably improved via regularization, as demonstrated empirically in \citet{chen2012fitting}. Here, we attempt to quantify this improvement through theoretical analysis. Under the stochastic block model (SBM) and its extensions, previous results on spectral clustering required the minimum degree of the graph to be sufficiently large. We prove that, for an appropriate choice of the regularization parameter $\tau$, cluster recovery results can be obtained even in scenarios where the minimum degree is small. More importantly, we show the usefulness of regularization in situations where not all nodes belong to well-defined clusters. Our results rely on the analysis of the spectrum of the Laplacian as a function of $\tau$. As a byproduct of our bounds, we propose a data-driven technique, DK-est (standing for estimated Davis-Kahan bounds), for choosing the regularization parameter. This technique is shown to work well through simulations and on a real data set.
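
To make the role of $\tau$ concrete, the following is a minimal sketch of a regularized normalized Laplacian and its leading eigenvectors. The abstract does not state the form of the regularization, so the sketch assumes one common convention: add $\tau/n$ to every entry of the adjacency matrix before degree normalization. The function names `regularized_laplacian` and `top_eigenvectors` are hypothetical and introduced only for illustration; this is not the DK-est procedure.

```python
import numpy as np


def regularized_laplacian(A, tau):
    """Return D_tau^{-1/2} A_tau D_tau^{-1/2}.

    Assumption (not stated in the abstract): A_tau = A + (tau / n) * 1 1^T,
    i.e. tau / n is added to every entry of the adjacency matrix A.
    """
    n = A.shape[0]
    A_tau = A + tau / n                # broadcast adds tau/n to each entry
    d_tau = A_tau.sum(axis=1)          # regularized degrees, all > 0 when tau > 0
    inv_sqrt = 1.0 / np.sqrt(d_tau)
    return inv_sqrt[:, None] * A_tau * inv_sqrt[None, :]


def top_eigenvectors(L, k):
    """Return the k eigenvectors of symmetric L with largest |eigenvalue|."""
    vals, vecs = np.linalg.eigh(L)
    order = np.argsort(np.abs(vals))[::-1][:k]
    return vecs[:, order]


if __name__ == "__main__":
    # Toy graph: two disjoint edges plus one isolated node (degree zero).
    A = np.zeros((5, 5))
    A[0, 1] = A[1, 0] = 1.0
    A[2, 3] = A[3, 2] = 1.0

    L = regularized_laplacian(A, tau=1.0)
    embedding = top_eigenvectors(L, k=2)
    print(embedding.shape)
```

With $\tau > 0$ every regularized degree is strictly positive, so the normalization stays well defined even for the isolated node; with $\tau = 0$ the same graph would produce a division by zero. The rows of the embedding would then typically be passed to a clustering step such as k-means.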
