
Regularized K-means through hard-thresholding

Abstract

We study a framework of regularized K-means methods based on direct penalization of the size of the cluster centers. Different penalization strategies are considered and compared through simulation and theoretical analysis. Based on the results, we propose HT K-means, which uses an $\ell_0$ penalty to induce sparsity in the variables. Different techniques for selecting the tuning parameter are discussed and compared. The proposed method compares favorably with the most popular regularized K-means methods in an extensive simulation study. Finally, HT K-means is applied to several real data examples. Graphical displays are presented and used in these examples to gain more insight into the datasets.
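
To make the idea of an $\ell_0$ penalty on the cluster centers concrete, the following is a minimal sketch and not the authors' implementation. It assumes column-centered data and a hypothetical objective of the form (1/n) Σ_i ||x_i − μ_{c(i)}||² + λ · #{variables whose centers are not all zero}. Under that assumption, and for fixed assignments, zeroing variable j raises the loss by (1/n) Σ_k n_k m_{kj}², where m_{kj} is the cluster mean, so the variable is kept only when this quantity exceeds λ. The function name `ht_kmeans_sketch` and its parameters are illustrative choices.

```python
import numpy as np


def ht_kmeans_sketch(X, k, lam, n_iter=50, seed=0):
    """Lloyd-style K-means with a hard-thresholding step on the centers.

    Illustrative sketch only. Assumes the columns of X are centered, so a
    zeroed variable means every cluster sits at the overall mean for it.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Initialize centers at k distinct data points.
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Assignment step: nearest center in squared Euclidean distance.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: cluster means (an empty cluster keeps its old center).
        counts = np.bincount(labels, minlength=k)
        new_centers = centers.copy()
        for c in range(k):
            if counts[c] > 0:
                new_centers[c] = X[labels == c].mean(axis=0)
        # Hard-thresholding step: per-variable loss increase if zeroed.
        gain = (counts[:, None] * new_centers ** 2).sum(axis=0) / n
        new_centers[:, gain <= lam] = 0.0
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

In this sketch, variables whose center column is entirely zero no longer influence the assignment step, which is how the penalty acts as variable selection. The value of `lam` plays the role of the tuning parameter mentioned above; how to select it is not addressed here.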
