Restricted Local Differential Privacy for Distribution Estimation with High Data Utility

LDP (Local Differential Privacy) has recently attracted much attention as a privacy metric in the local model, in which individual users obfuscate their own personal data by themselves and the data collector estimates statistics of the personal data, such as the distribution underlying the data. Although LDP does not require users to trust the data collector, it regards all personal data as equally sensitive, which causes excessive obfuscation and hence a loss of data utility. In this paper, we introduce the notion of RLDP (Restricted LDP), which provides a privacy guarantee equivalent to LDP only for sensitive data. We first consider the setting in which all users use the same obfuscation mechanism, and propose two mechanisms providing RLDP: restricted RR (Randomized Response) and restricted RAPPOR. We then consider the setting in which the distinction between sensitive and non-sensitive data can differ from user to user. For this setting, we propose a personalized restricted mechanism with semantic tags that keeps secret what is sensitive for each user while enabling the data collector to estimate the distribution of the personal data with high data utility. We show, both theoretically and experimentally, that our mechanisms provide much higher utility than existing LDP mechanisms when a large portion of the data is non-sensitive. We also show that when most of the data are non-sensitive, our mechanisms provide almost the same utility as non-private mechanisms in the low privacy regime.
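As a rough illustration of the core idea, and not the paper's exact construction, the following minimal Python sketch applies standard epsilon-randomized response over the whole domain only when the input value is sensitive, and reports non-sensitive values truthfully. The function and parameter names (restricted_rr, domain, sensitive, epsilon) are hypothetical.

    import math
    import random

    def restricted_rr(value, domain, sensitive, epsilon):
        # Hypothetical sketch of a "restricted" randomized response:
        # sensitive inputs get epsilon-RR over the full domain, so any
        # two sensitive inputs are e^epsilon-indistinguishable, while
        # non-sensitive inputs are reported as-is, avoiding utility loss.
        if value not in sensitive:
            return value
        k = len(domain)
        # Standard k-ary RR: keep the true value with probability
        # e^epsilon / (e^epsilon + k - 1), else report another value
        # uniformly at random.
        p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
        if random.random() < p_true:
            return value
        return random.choice([v for v in domain if v != value])

    # Example: values 0..9, where only {0, 1, 2} are considered sensitive.
    domain = list(range(10))
    sensitive = {0, 1, 2}
    reports = [restricted_rr(x, domain, sensitive, epsilon=1.0) for x in [0, 5, 9]]

In this simplified sketch, the privacy bound holds only between pairs of sensitive inputs, mirroring the restricted guarantee described in the abstract; the paper's restricted RR and restricted RAPPOR mechanisms are designed more carefully than this illustration.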