Clustering with Same-Cluster Queries

8 June 2016
H. Ashtiani
Shrinu Kushagra
Shai Ben-David
Abstract

We propose a Semi-Supervised Active Clustering (SSAC) framework, in which the learner is allowed to interact with a domain expert, asking whether two given instances belong to the same cluster or not. We study the query and computational complexity of clustering in this framework. We consider a setting where the expert conforms to a center-based clustering with a notion of margin. We show that there is a trade-off between computational complexity and query complexity. We prove that for the case of $k$-means clustering (i.e., when the expert conforms to a solution of $k$-means), having access to relatively few such queries allows efficient solutions to otherwise NP-hard problems. In particular, we provide a probabilistic polynomial-time (BPP) algorithm for clustering in this setting that asks $O(k^2 \log k + k \log n)$ same-cluster queries and runs with time complexity $O(kn \log n)$ (where $k$ is the number of clusters and $n$ is the number of instances). The success of the algorithm is guaranteed for data satisfying margin conditions under which, without queries, we show that the problem is NP-hard. We also prove a lower bound on the number of queries needed to have a computationally efficient clustering algorithm in this setting.
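
To make the same-cluster query model concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it shows a hypothetical oracle (`SameClusterOracle`) and a naive baseline (`naive_query_clustering`) that compares each instance against one representative per discovered cluster, which can cost up to about $nk$ queries. The helper `ssac_query_bound` evaluates the expression $k^2 \log k + k \log n$ from the abstract with the hidden constant assumed to be 1 and natural logarithms, purely to give a sense of scale. All names and the synthetic labels are assumptions made for illustration.

```python
import math
import random


class SameClusterOracle:
    """Hypothetical domain expert answering same-cluster queries."""

    def __init__(self, labels):
        self._labels = labels   # ground-truth clustering, hidden from the learner
        self.num_queries = 0    # how many queries the learner has asked

    def same_cluster(self, i, j):
        self.num_queries += 1
        return self._labels[i] == self._labels[j]


def naive_query_clustering(n, oracle):
    """Naive baseline (not the paper's method): compare each instance
    with one representative of every cluster discovered so far."""
    representatives = []        # one instance index per discovered cluster
    assignment = [None] * n
    for i in range(n):
        for cluster_id, rep in enumerate(representatives):
            if oracle.same_cluster(i, rep):
                assignment[i] = cluster_id
                break
        else:
            # No representative matched, so instance i starts a new cluster.
            representatives.append(i)
            assignment[i] = len(representatives) - 1
    return assignment


def ssac_query_bound(k, n):
    """k^2 log k + k log n, with the constant hidden by O(.) assumed to be 1."""
    return k * k * math.log(k) + k * math.log(n)


if __name__ == "__main__":
    rng = random.Random(0)
    n, k = 10_000, 5
    labels = [rng.randrange(k) for _ in range(n)]   # synthetic ground truth
    oracle = SameClusterOracle(labels)
    naive_query_clustering(n, oracle)
    print("naive baseline queries:", oracle.num_queries)
    print("k^2 log k + k log n   :", round(ssac_query_bound(k, n)))
```

The baseline's query count grows linearly in $n$, while the expression from the abstract grows only logarithmically in $n$; that gap is what "relatively few such queries" refers to.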
