
Nebula: Efficient, Private and Accurate Histogram Estimation

Main: 18 pages, 9 figures, 4 tables. Bibliography: 4 pages. Appendix: 4 pages.
Abstract

We present \textit{Nebula}, a system for differentially private histogram estimation on data distributed among clients. \textit{Nebula} allows clients to independently decide whether to participate in the system, and locally encode their data so that an untrusted server only learns data values whose multiplicity exceeds a predefined aggregation threshold, with $(\varepsilon,\delta)$ differential privacy guarantees. Compared to existing systems, \textit{Nebula} uniquely achieves: \textit{i)} a strict upper bound on client privacy leakage; \textit{ii)} significantly higher utility than standard local differential privacy systems; and \textit{iii)} no requirement for trusted third parties, multi-party computation, or trusted hardware. We provide a formal evaluation of \textit{Nebula}'s privacy, utility and efficiency guarantees, along with an empirical assessment on three real-world datasets. On the United States Census dataset, clients can submit their data in just 0.0036 seconds and 0.0016 MB (\textbf{efficient}), under strong $(\varepsilon=1,\delta=10^{-8})$ differential privacy guarantees (\textbf{private}), enabling \textit{Nebula}'s untrusted aggregation server to estimate histograms with over 88\% better utility than existing local differential privacy deployments (\textbf{accurate}). Additionally, we describe a variant that allows clients to submit multi-dimensional data, with similar privacy, utility, and performance. Finally, we provide an implementation of \textit{Nebula}.
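To make the thresholding idea concrete, the following is a minimal plaintext simulation, not \textit{Nebula}'s actual protocol. It only mimics the observable outcome described above: each client independently decides whether to participate, and a value appears in the output only if enough participating clients hold it. The function name, the `participation_prob` parameter, and the "at least `threshold`" comparison are assumptions made for illustration; the real system hides sub-threshold values from the server through local encoding, which this sketch does not attempt.

```python
import random
from collections import Counter


def simulate_threshold_histogram(values, participation_prob, threshold, seed=0):
    """Toy simulation of threshold-based histogram release.

    Assumptions (not taken from the paper): each client participates
    independently with probability `participation_prob`, and a value is
    released only if at least `threshold` participating clients hold it.
    No cryptographic protection is modelled here.
    """
    rng = random.Random(seed)
    # Each client flips its own coin to decide whether to submit.
    submitted = [v for v in values if rng.random() < participation_prob]
    counts = Counter(submitted)
    # Values below the aggregation threshold are suppressed entirely.
    return {v: c for v, c in counts.items() if c >= threshold}


if __name__ == "__main__":
    clients = ["a"] * 50 + ["b"] * 30 + ["c"] * 2
    print(simulate_threshold_histogram(clients, participation_prob=0.8, threshold=10))
```

Rare values such as `"c"` in the usage example never reach the threshold and are therefore absent from the result, while the counts of frequent values are reduced by the random participation step.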
