Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces

Abstract

We consider the well-studied Robust $(k,z)$-Clustering problem, which generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a constant $z \ge 1$, the input to Robust $(k,z)$-Clustering is a set $P$ of $n$ weighted points in a metric space $(M,\delta)$ and a positive integer $k$. Further, each point belongs to one (or more) of $m$ different groups $S_1, S_2, \ldots, S_m$. Our goal is to find a set $X$ of $k$ centers such that $\max_{i \in [m]} \sum_{p \in S_i} w(p)\, \delta(p,X)^z$ is minimized. This problem arises in the domains of robust optimization [Anthony, Goyal, Gupta, Nagarajan, Math. Oper. Res. 2010] and algorithmic fairness. For polynomial-time computation, an approximation factor of $O(\log m/\log\log m)$ is known [Makarychev, Vakilian, COLT 2021], which is tight under a plausible complexity assumption even in line metrics. For FPT time, there is a $(3^z+\epsilon)$-approximation algorithm, which is tight under GAP-ETH [Goyal, Jaiswal, Inf. Proc. Letters, 2023]. Motivated by the tight lower bounds for general discrete metrics, we focus on geometric spaces such as the (discrete) high-dimensional Euclidean setting and metrics of low doubling dimension, which play an important role in data analysis applications. First, for a universal constant $\eta_0 > 0.0006$, we devise a $3^z(1-\eta_0)$-factor FPT approximation algorithm for discrete high-dimensional Euclidean spaces, thereby bypassing the lower bound for general metrics. We complement this result by showing that even the special case of $k$-Center in dimension $\Theta(\log n)$ is $(\sqrt{3/2} - o(1))$-hard to approximate for FPT algorithms. Finally, we complete the FPT approximation landscape by designing an FPT $(1+\epsilon)$-approximation scheme (EPAS) for metrics of sub-logarithmic doubling dimension.
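To make the objective concrete, the following minimal Python sketch evaluates the Robust $(k,z)$-Clustering cost $\max_{i \in [m]} \sum_{p \in S_i} w(p)\, \delta(p,X)^z$ for a given set of centers. It is illustrative only and is not an algorithm from the paper: the function name `robust_cost`, the representation of groups as lists of point indices, and the choice of Euclidean distance for $\delta$ are all assumptions made for this example.

```python
import math


def robust_cost(points, weights, groups, centers, z=1):
    """Evaluate max over groups S_i of sum_{p in S_i} w(p) * delta(p, X)^z.

    Assumptions for this sketch (not taken from the paper):
    - points and centers are coordinate tuples, delta is Euclidean distance;
    - each group is a list of indices into `points` (groups may overlap).
    """
    def dist_to_centers(p):
        # delta(p, X): distance from p to its nearest center in X
        return min(math.dist(p, c) for c in centers)

    return max(
        sum(weights[j] * dist_to_centers(points[j]) ** z for j in group)
        for group in groups
    )


# Toy instance: three points on a line, two overlapping groups, k = 2 centers.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
weights = [1.0, 1.0, 1.0]
groups = [[0, 1], [1, 2]]
centers = [(0.0, 0.0), (10.0, 0.0)]

cost_z2 = robust_cost(points, weights, groups, centers, z=2)

# With a single group containing every point and z = 1, the same
# function computes the ordinary k-Median cost.
cost_median = robust_cost(points, weights, [[0, 1, 2]], centers, z=1)
```

With one group containing all points, $z = 1$ gives the $k$-Median cost and $z = 2$ the $k$-Means cost; with one singleton group per point, the maximum over groups with $z = 1$ gives the (weighted) $k$-Center cost.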
