
Fixed-sized clusters k-Means

Abstract

We present a k-means-based clustering algorithm that optimizes the mean square error for given cluster sizes. A straightforward application is balanced clustering, where all clusters have the same size. In the k-means assignment phase, the algorithm solves an assignment problem using the Hungarian algorithm. This makes the time complexity of the assignment phase O(n^3), which allows clustering of datasets of more than 5000 points.
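
The assignment phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fixed_size_kmeans`, its parameters, and the random initialization are assumptions made for illustration. Each cluster is expanded into as many "slots" as its prescribed size, and points are matched one-to-one to slots by solving a linear assignment problem on squared distances. The sketch uses SciPy's `linear_sum_assignment`, which solves the same problem as the Hungarian algorithm but, as far as we know, uses a Jonker-Volgenant variant internally.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def fixed_size_kmeans(X, sizes, n_iter=50, seed=0):
    """Sketch of k-means with prescribed, strictly positive cluster sizes.

    `sizes` must sum to the number of points. All names here are
    hypothetical and chosen for illustration only.
    """
    X = np.asarray(X, dtype=float)
    sizes = np.asarray(sizes, dtype=int)
    n, k = len(X), len(sizes)
    if sizes.sum() != n or (sizes <= 0).any():
        raise ValueError("sizes must be positive and sum to the number of points")

    rng = np.random.default_rng(seed)
    # Assumption: initialize centroids from randomly chosen data points.
    centroids = X[rng.choice(n, size=k, replace=False)]
    # slot_cluster[j] is the cluster owning slot j; cluster c owns sizes[c] slots.
    slot_cluster = np.repeat(np.arange(k), sizes)
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid, shape (n, k).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        # Assignment phase: n x n cost matrix with one column per slot.
        cost = d2[:, slot_cluster]
        rows, cols = linear_sum_assignment(cost)
        new_labels = np.empty(n, dtype=int)
        new_labels[rows] = slot_cluster[cols]
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        # Update phase: each centroid becomes the mean of its assigned points.
        centroids = np.array([X[labels == c].mean(axis=0) for c in range(k)])

    return labels, centroids
```

For balanced clustering of `n` points into `k` clusters with `n` divisible by `k`, one would pass `sizes=[n // k] * k`. The n x n cost matrix is what drives the cubic cost of the assignment phase.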

Main: 5 pages, 2 figures; bibliography: 2 pages