
Tight and Robust Private Mean Estimation with Few Users

International Conference on Machine Learning (ICML), 2021
Abstract

In this work, we study high-dimensional mean estimation under user-level differential privacy, and design an $(\epsilon,\delta)$-differentially private mechanism using as few users as possible. In particular, we provide a nearly optimal trade-off between the number of users and the number of samples per user required for private mean estimation, even when the number of users is as low as $O(\frac{1}{\epsilon}\log\frac{1}{\delta})$. Interestingly, our bound of $O(\frac{1}{\epsilon}\log\frac{1}{\delta})$ on the number of users is independent of the dimension, unlike in previous work, where it depends polynomially on the dimension; this resolves a problem left open by Amin et al. (ICML 2019). Our mechanism is robust in the sense that even if the information of $49\%$ of the users is corrupted, the final estimate remains approximately accurate. Finally, our results also apply to a broader range of problems, such as learning discrete distributions, stochastic convex optimization, empirical risk minimization, and a variant of stochastic gradient descent, via a reduction to differentially private mean estimation.
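For intuition about the problem setup (not the paper's mechanism), below is a minimal sketch of a naive baseline for user-level $(\epsilon,\delta)$-DP mean estimation: average each user's samples, clip the per-user means, and add Gaussian noise calibrated to the sensitivity of replacing one user. The function name user_level_dp_mean and the clipping radius clip_norm are illustrative assumptions, and this baseline generally needs many more users than the bound stated above.

import numpy as np

def user_level_dp_mean(user_samples, clip_norm, epsilon, delta, rng=None):
    """Illustrative baseline, not the paper's mechanism.

    user_samples: list of (m_i, d) arrays, one array per user.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Per-user means, clipped to an L2 ball of radius clip_norm.
    means = []
    for x in user_samples:
        mu = x.mean(axis=0)
        norm = np.linalg.norm(mu)
        if norm > clip_norm:
            mu = mu * (clip_norm / norm)
        means.append(mu)
    means = np.stack(means)
    n = means.shape[0]
    # Replacing one user's data moves the average of the clipped per-user
    # means by at most 2 * clip_norm / n in L2; calibrate Gaussian noise to
    # that sensitivity (standard Gaussian mechanism, valid for epsilon <= 1).
    sensitivity = 2.0 * clip_norm / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return means.mean(axis=0) + rng.normal(scale=sigma, size=means.shape[1])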
