Some recent papers demonstrate how to reveal aggregates of sensitive data by combining the addition of noise with special cryptographic encodings. These results guarantee security in terms of differential privacy. Such an approach is problematic, since in some scenarios obtaining a noisy result is unacceptable, even if the calibration of the noise is optimal. Moreover, adding noise generated from a complex distribution can be problematic for practical reasons -- in particular when the data is collected in a system of constrained devices. The unwanted noise cannot be avoided when the standard assumptions of differential privacy are applied to the investigated system. On the other hand, the intuition is that in some natural scenarios revealing only the aggregated sum (without adding any noise) to the adversary does not leak any sensitive information about the individuals who contributed to the data. Informally speaking, this observation stems from the fact that if at least a part of the data is randomized from the adversary's point of view, it can effectively mask the other values. In this paper we follow this intuition and formally prove it for some types of data. We investigate when releasing the aggregated data (without adding noise) does not compromise the privacy of the users. This leads to a natural extension of the classic differential privacy definition that allows us to exploit the inherently randomized nature of some data sets. In other words, we show when the data has enough "uncertainty" to be self-secured without any statistical perturbation.
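A minimal sketch of the underlying intuition, under assumptions not taken from the paper: suppose an adversary knows everything except one target participant's bit and the contributions of n other participants, which it can only model as independent fair coin flips. Observing the exact, noiseless sum then yields a bounded likelihood ratio between the two hypotheses about the target's bit on all "typical" outputs, which is the flavor of guarantee the abstract alludes to. The model (n participants, Bernoulli(1/2) contributions, a 3-standard-deviation window of typical sums) is purely illustrative.

```python
import math

def binom_pmf(k, n, p=0.5):
    """Probability that a Binomial(n, p) variable equals k."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical setup: n other participants, each contributing an
# independent fair coin flip (random from the adversary's point of view).
n = 1000

# The adversary sees only the noiseless aggregate S = x_target + (sum of
# the n coin flips).  Compare Pr[S = s | x_target = 0] against
# Pr[S = s | x_target = 1] over typical aggregate values s
# (mean +/- 3 standard deviations of the Binomial(n, 1/2) part).
lo = n // 2 - int(3 * math.sqrt(n) / 2)
hi = n // 2 + int(3 * math.sqrt(n) / 2)

worst_ratio = 1.0
for s in range(lo, hi + 1):
    p0 = binom_pmf(s, n)       # aggregate equals s when the target bit is 0
    p1 = binom_pmf(s - 1, n)   # aggregate equals s when the target bit is 1
    if p0 > 0 and p1 > 0:
        worst_ratio = max(worst_ratio, p0 / p1, p1 / p0)

# The log of the worst-case likelihood ratio over typical outputs plays the
# role of epsilon in a differential-privacy-style bound, with the atypical
# tail outputs absorbed into a small delta.
print(f"epsilon over typical outputs: {math.log(worst_ratio):.3f}")
```

In this toy model the effective epsilon shrinks roughly like 1/sqrt(n), so the more independently randomized contributions the aggregate contains, the better a single value is masked without any added noise; the paper's contribution, as stated, is to make this kind of reasoning formal for certain classes of data.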