Locally Private Histograms in All Privacy Regimes

Information Technology Convergence and Services (ITCS), 2024

9 August 2024

Clément L. Canonne

Abigail Gentle

ArXiv (abs)PDF HTML

Main:27 Pages

4 Figures

Bibliography:4 Pages

1 Tables

Appendix:4 Pages

Abstract

Frequency estimation, a.k.a. histograms, is a workhorse of data analysis, and as such has been thoroughly studied under differentially privacy. In particular, computing histograms in the local model of privacy has been the focus of a fruitful recent line of work, and various algorithms have been proposed, achieving the order-optimal $\ell_\infty$ error in the high-privacy (small $\varepsilon$ ) regime while balancing other considerations such as time- and communication-efficiency. However, to the best of our knowledge, the picture is much less clear when it comes to the medium- or low-privacy regime (large $\varepsilon$ ), despite its increased relevance in practice. In this paper, we investigate locally private histograms, and the very related distribution learning task, in this medium-to-low privacy regime, and establish near-tight (and somewhat unexpected) bounds on the $\ell_\infty$ error achievable. Our theoretical findings emerge from a novel analysis, which appears to improve bounds across the board for the locally private histogram problem. We back our theoretical findings by an empirical comparison of existing algorithms in all privacy regimes, to assess their typical performance and behaviour beyond the worst-case setting.

View on arXiv

Comments on this paper