k-means as a variational EM approximation of Gaussian mixture models

Abstract

We show that k-means (Lloyd's algorithm) is obtained as a special case when truncated variational EM approximations are applied to Gaussian mixture models (GMMs) with isotropic Gaussians. In contrast to the standard way of relating k-means and GMMs, the provided derivation shows that it is not required to consider Gaussians with small variances or the limit case of zero variances. A number of consequences follow directly from our approach: (A) k-means can be shown to increase a free energy associated with truncated distributions, and this free energy can be reformulated directly in terms of the k-means objective; (B) k-means generalizations can be derived directly by considering the 2nd closest, 3rd closest, etc. cluster in addition to just the closest one; and (C) the embedding of k-means into a free energy framework allows for theoretical interpretations of other k-means generalizations in the literature. In general, truncated variational EM provides a natural and rigorous quantitative link between k-means-like clustering and GMM clustering algorithms, which may be very relevant for future theoretical and empirical studies.
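
As a rough illustration of points (A) and (B), the following minimal sketch runs EM on an isotropic GMM whose responsibilities are truncated to the c_prime closest clusters. It is not the authors' implementation: the function names, the fixed shared variance sigma2, and the equal mixing proportions are all assumptions made for this example, and the variance update from the paper is not reproduced. With c_prime = 1 the responsibilities become one-hot for any finite sigma2, so the mean update reduces to Lloyd's algorithm without taking a zero-variance limit.

```python
import math


def sq_dist(x, mu):
    """Squared Euclidean distance between a data point and a cluster mean."""
    return sum((a - b) ** 2 for a, b in zip(x, mu))


def truncated_e_step(X, means, sigma2, c_prime):
    """Responsibilities that are non-zero only for the c_prime closest clusters.

    Assumes equal mixing proportions and one shared isotropic variance sigma2
    (illustrative assumptions, not taken from the paper).
    """
    resp = []
    for x in X:
        d = [sq_dist(x, mu) for mu in means]
        closest = sorted(range(len(means)), key=lambda c: d[c])[:c_prime]
        d_min = d[closest[0]]  # subtract the smallest distance for stability
        w = {c: math.exp(-(d[c] - d_min) / (2.0 * sigma2)) for c in closest}
        z = sum(w.values())
        resp.append({c: w[c] / z for c in closest})
    return resp


def m_step(X, resp, means):
    """Responsibility-weighted mean update; empty clusters keep their old mean."""
    new_means = []
    for c, mu in enumerate(means):
        weight = sum(r.get(c, 0.0) for r in resp)
        if weight == 0.0:
            new_means.append(list(mu))
            continue
        new_means.append([
            sum(r.get(c, 0.0) * x[i] for x, r in zip(X, resp)) / weight
            for i in range(len(mu))
        ])
    return new_means


def kmeans_objective(X, means):
    """Sum of squared distances from each point to its closest mean."""
    return sum(min(sq_dist(x, mu) for mu in means) for x in X)


def fit(X, means, c_prime=1, sigma2=1.0, n_iter=20):
    """Alternate truncated E-steps and M-steps, recording the k-means objective."""
    history = []
    for _ in range(n_iter):
        resp = truncated_e_step(X, means, sigma2, c_prime)
        means = m_step(X, resp, means)
        history.append(kmeans_objective(X, means))
    return means, history


# Toy data: two well-separated groups in 2D.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
     [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
lloyd_means, lloyd_history = fit(X, [[0.0, 0.0], [10.0, 10.0]], c_prime=1)
soft_means, soft_history = fit(X, [[0.0, 0.0], [10.0, 10.0]], c_prime=2)
```

Setting c_prime to the total number of clusters recovers the untruncated E-step of this simplified GMM, so the single parameter interpolates between hard k-means assignments and soft assignments. In the c_prime = 1 case the recorded k-means objective never increases from one iteration to the next, which is the familiar monotonicity of Lloyd's algorithm.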
