LDP (Local Differential Privacy) has recently attracted much attention as a privacy metric in the local model, in which individual users obfuscate their own personal data by themselves and the data collector estimates statistics of the personal data, such as the distribution underlying the data. Although LDP does not require users to trust the data collector, it regards all personal data as equally sensitive, which causes excessive obfuscation and hence a loss of data utility. In this paper, we introduce the notion of RLDP (Restricted Local Differential Privacy), which provides a privacy guarantee equivalent to LDP only for sensitive data. We first consider the setting in which all users use the same obfuscation mechanism, and propose two mechanisms providing RLDP: restricted RR (Randomized Response) and restricted RAPPOR. We then consider the setting in which the distinction between sensitive and non-sensitive data can differ from user to user. For this setting, we propose a personalized restricted mechanism with semantic tags that keeps what is sensitive for each user secret while enabling the data collector to estimate the distribution of personal data with high utility. We prove that the proposed mechanisms provide much higher data utility than existing LDP mechanisms. We also prove that in the low privacy regime, our mechanisms provide almost the same data utility as non-private mechanisms. Finally, we demonstrate on two large-scale datasets that our mechanisms outperform the existing mechanisms by one to two orders of magnitude.
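To illustrate the core idea of restricting the LDP guarantee to sensitive data, the following is a minimal hypothetical sketch, not the paper's actual restricted RR mechanism: non-sensitive inputs are reported truthfully, while a sensitive input is obfuscated with k-ary randomized response over the sensitive set, so that any two sensitive values remain epsilon-indistinguishable. The function name and signature are illustrative assumptions.

```python
import math
import random

def restricted_rr(x, sensitive, epsilon):
    """Hypothetical sketch of a restricted randomized response.

    Non-sensitive inputs are sent as-is (no utility loss there);
    sensitive inputs get k-ary randomized response over the sensitive
    set, giving an epsilon-LDP-style guarantee for sensitive data only.
    """
    if x not in sensitive:
        return x  # non-sensitive: report truthfully
    k = len(sensitive)
    # probability of reporting the true sensitive value
    p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p_true:
        return x
    # otherwise report a uniformly random *other* sensitive value
    others = [v for v in sorted(sensitive) if v != x]
    return random.choice(others)
```

Because non-sensitive values pass through unchanged, the collector's estimation error concentrates only on the sensitive portion of the domain, which is the intuition behind the utility gains claimed above.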