
Private Geometric Median

Abstract

In this paper, we study differentially private (DP) algorithms for computing the geometric median (GM) of a dataset: Given $n$ points $x_1,\dots,x_n$ in $\mathbb{R}^d$, the goal is to find a point $\theta$ that minimizes the sum of the Euclidean distances to these points, i.e., $\sum_{i=1}^{n} \|\theta - x_i\|_2$. Off-the-shelf methods, such as DP-GD, require strong a priori knowledge locating the data within a ball of radius $R$, and the excess risk of the algorithm depends linearly on $R$. In this paper, we ask: can we design an efficient and private algorithm with an excess error guarantee that scales with the (unknown) radius containing the majority of the datapoints? Our main contribution is a pair of polynomial-time DP algorithms for the task of private GM with an excess error guarantee that scales with the effective diameter of the datapoints. Additionally, we propose an inefficient algorithm based on the inverse smooth sensitivity mechanism, which satisfies the more restrictive notion of pure DP. We complement our results with a lower bound and demonstrate the optimality of our polynomial-time algorithms in terms of sample complexity.
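To make the objective concrete, the following is a minimal, non-private sketch of the GM problem. It uses Weiszfeld's classical fixed-point iteration, which is a standard baseline and not one of the algorithms proposed in this paper; it offers no privacy guarantee. The function names `gm_objective` and `weiszfeld`, the iteration count, and the toy dataset are illustrative assumptions.

```python
import math

def gm_objective(theta, points):
    """Sum of Euclidean distances from theta to every point."""
    return sum(math.dist(theta, x) for x in points)

def weiszfeld(points, iters=200, eps=1e-12):
    """Non-private geometric median via Weiszfeld's iteration (illustrative only)."""
    n, d = len(points), len(points[0])
    # Start from the coordinate-wise mean of the data.
    theta = [sum(x[j] for x in points) / n for j in range(d)]
    for _ in range(iters):
        # Weight each point by the inverse of its distance to the current iterate;
        # eps guards against division by zero if theta lands on a data point.
        weights = [1.0 / max(math.dist(theta, x), eps) for x in points]
        total = sum(weights)
        theta = [
            sum(w * x[j] for w, x in zip(weights, points)) / total
            for j in range(d)
        ]
    return theta

# Toy data (assumed): four points in the unit square plus one far outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
mean = [sum(x[j] for x in points) / len(points) for j in range(2)]
theta = weiszfeld(points)
print("mean:", mean, "objective:", gm_objective(mean, points))
print("geometric median:", theta, "objective:", gm_objective(theta, points))
```

On this toy dataset the mean is dragged far toward the outlier, while the geometric median stays near the four clustered points. This is the intuition behind asking for an error guarantee that scales with the radius containing the majority of the datapoints rather than with a worst-case bound $R$.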
