22
5

Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages

Abstract

We study the problem of private vector mean estimation in the shuffle model of privacy where nn users each have a unit vector v(i)Rdv^{(i)} \in\mathbb{R}^d. We propose a new multi-message protocol that achieves the optimal error using O~(min(nε2,d))\tilde{\mathcal{O}}\left(\min(n\varepsilon^2,d)\right) messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error requires each user to send Ω(min(nε2,d)/log(n))\Omega(\min(n\varepsilon^2,d)/\log(n)) messages, demonstrating the optimality of our message complexity up to logarithmic factors. Additionally, we study the single-message setting and design a protocol that achieves mean squared error O(dnd/(d+2)ε4/(d+2))\mathcal{O}(dn^{d/(d+2)}\varepsilon^{-4/(d+2)}). Moreover, we show that any single-message protocol must incur mean squared error Ω(dnd/(d+2))\Omega(dn^{d/(d+2)}), showing that our protocol is optimal in the standard setting where ε=Θ(1)\varepsilon = \Theta(1). Finally, we study robustness to malicious users and show that malicious users can incur large additive error with a single shuffler.

View on arXiv
Comments on this paper