Differentially Private Histograms in the Shuffle Model from Fake Users

IEEE Symposium on Security and Privacy (IEEE S&P), 2021

6 April 2021

Abstract

There has been much recent work in the shuffle model of differential privacy, particularly for approximate $d$ -bin histograms. While these protocols achieve low error, the number of messages sent by each user -- the message complexity -- has so far scaled with $d$ or the privacy parameters. This is problematic for off-the-shelf implementations of the shuffler. Here, we propose a protocol whose message complexity is two when there are sufficiently many users. The protocol essentially pairs each row in the dataset with a fake row and performs a simple randomization on all rows. Experiments on real-world language confirm that the protocol's error is small. We also bound the impact corrupt users have on the estimates and show that it compares favorably to prior state-of-the-art.

View on arXiv

Comments on this paper