Nearly-Tight and Oblivious Algorithms for Explainable Clustering

30 June 2021
Buddhima Gamlath, Xinrui Jia, Adam Polak, O. Svensson
Abstract

We study the problem of explainable clustering in the setting first formalized by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). A $k$-clustering is said to be explainable if it is given by a decision tree where each internal node splits data points with a threshold cut in a single dimension (feature), and each of the $k$ leaves corresponds to a cluster. We give an algorithm that outputs an explainable clustering that loses at most a factor of $O(\log^2 k)$ compared to an optimal (not necessarily explainable) clustering for the $k$-medians objective, and a factor of $O(k \log^2 k)$ for the $k$-means objective. This improves over the previous best upper bounds of $O(k)$ and $O(k^2)$, respectively, and nearly matches the previous $\Omega(\log k)$ lower bound for $k$-medians and our new $\Omega(k)$ lower bound for $k$-means. The algorithm is remarkably simple. In particular, given an initial (not necessarily explainable) clustering in $\mathbb{R}^d$, it is oblivious to the data points and runs in time $O(dk \log^2 k)$, independent of the number of data points $n$. Our upper and lower bounds also generalize to objectives given by higher $\ell_p$-norms.
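To make the "oblivious" setup concrete, here is a minimal Python sketch of the structure the abstract describes: given $k$ cluster centers in $\mathbb{R}^d$, build a decision tree of axis-aligned threshold cuts that separates the centers without ever reading the $n$ data points. The midpoint cut rule below is a placeholder assumption chosen for simplicity; the paper's algorithm instead draws random threshold cuts, which is what yields the stated approximation guarantees.

```python
import numpy as np

def build_explainable_tree(centers, idx=None):
    """Split the given cluster centers with axis-aligned threshold cuts
    until each leaf contains exactly one center (so the tree has k leaves).

    centers : (k, d) array of distinct cluster centers.
    idx     : indices of the centers handled by the current node.
    Returns a nested dict: internal nodes hold 'dim', 'threshold',
    'left', 'right'; leaves hold the index of their single center.
    """
    if idx is None:
        idx = np.arange(len(centers))
    if len(idx) == 1:
        return {"center": int(idx[0])}  # leaf: one cluster

    # Choose a (dimension, threshold) pair that separates at least two
    # of the remaining centers; here we cut at the midpoint of the
    # dimension with the largest spread.  (Placeholder rule: the paper
    # uses random cuts to get its O(log^2 k) bound for k-medians.)
    sub = centers[idx]
    dim = int(np.argmax(sub.max(axis=0) - sub.min(axis=0)))
    theta = (sub[:, dim].max() + sub[:, dim].min()) / 2.0
    left, right = idx[sub[:, dim] <= theta], idx[sub[:, dim] > theta]
    return {
        "dim": dim,
        "threshold": float(theta),
        "left": build_explainable_tree(centers, left),
        "right": build_explainable_tree(centers, right),
    }

def assign(tree, x):
    """Route a single data point down the tree to its cluster index."""
    while "center" not in tree:
        tree = tree["left"] if x[tree["dim"]] <= tree["threshold"] else tree["right"]
    return tree["center"]
```

Note that only the final `assign` step touches the data: the tree itself is built from the $k$ centers alone, which is why the construction time is independent of $n$.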
