arXiv:1804.10827 (v3, latest)

On Euclidean $k$-Means Clustering with $\alpha$-Center Proximity

28 April 2018
Amit Deshpande
Anand Louis
A. Singh
Abstract

The $k$-means is a popular clustering objective that is NP-hard in the worst-case but often solved efficiently by simple heuristics in practice. The implicit assumption behind using the $k$-means (or many other objectives) is that an optimal solution would recover the underlying ground truth clustering. In most real-world datasets, the underlying ground-truth clustering is unambiguous and stable under small perturbations of data. As a consequence, the ground-truth clustering satisfies center proximity, that is, every point is closer to the center of its own cluster than the center of any other cluster, by some multiplicative factor $\alpha > 1$. We study the problem of minimizing the Euclidean $k$-means objective only over clusterings that satisfy $\alpha$-center proximity. We give a simple algorithm to find an exact optimal clustering for the above objective with running time exponential in $k$ and $1/(\alpha - 1)$ but linear in the number of points and the dimension. We define an analogous $\alpha$-center proximity condition for outliers, and give similar algorithmic guarantees for $k$-means with outliers and $\alpha$-center proximity. On the hardness side we show that for any $\alpha' > 1$, there exists an $\alpha \leq \alpha'$, $(\alpha > 1)$, and an $\varepsilon_0 > 0$ such that minimizing the $k$-means objective over clusterings that satisfy $\alpha$-center proximity is NP-hard to approximate within a multiplicative $(1+\varepsilon_0)$ factor.
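
To make the condition in the abstract concrete, the following is a minimal Python sketch that evaluates the Euclidean $k$-means objective of a given clustering and checks whether it satisfies $\alpha$-center proximity. It is an illustration only, not the paper's algorithm. The function names (`centroid`, `kmeans_cost`, `satisfies_center_proximity`) are hypothetical, and two choices are assumptions: each cluster's center is taken to be its mean, and the inequality is taken to be strict.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans_cost(clusters):
    """Sum of squared Euclidean distances from each point to its cluster mean."""
    total = 0.0
    for cluster in clusters:
        mu = centroid(cluster)
        total += sum(math.dist(x, mu) ** 2 for x in cluster)
    return total

def satisfies_center_proximity(clusters, alpha):
    """Check alpha * d(x, own center) < d(x, other center) for every point x.

    Assumption: centers are cluster means and the inequality is strict.
    """
    means = [centroid(c) for c in clusters]
    for i, cluster in enumerate(clusters):
        for x in cluster:
            own = math.dist(x, means[i])
            for j, mu in enumerate(means):
                if j != i and not alpha * own < math.dist(x, mu):
                    return False
    return True

# Two well-separated clusters in the plane (toy data, not from the paper).
clusters = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
print(kmeans_cost(clusters))
print(satisfies_center_proximity(clusters, 2.0))
print(satisfies_center_proximity(clusters, 25.0))
```

In this toy instance every point lies at distance 0.5 from its own mean and roughly 10.01 from the other mean, so the clustering passes the check for $\alpha = 2$ but fails for $\alpha = 25$. The check itself takes time linear in the number of points for fixed $k$ and dimension; the hard part addressed by the paper is searching over clusterings, which this sketch does not attempt.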
