Locally Private Histograms in All Privacy Regimes
Frequency estimation, a.k.a. histograms, is a workhorse of data analysis, and as such has been thoroughly studied under differentially privacy. In particular, computing histograms in the local model of privacy has been the focus of a fruitful recent line of work, and various algorithms have been proposed, achieving the order-optimal error in the high-privacy (small ) regime while balancing other considerations such as time- and communication-efficiency. However, to the best of our knowledge, the picture is much less clear when it comes to the medium- or low-privacy regime (large ), despite its increased relevance in practice. In this paper, we investigate locally private histograms, and the very related distribution learning task, in this medium-to-low privacy regime, and establish near-tight (and somewhat unexpected) bounds on the error achievable. Our theoretical findings emerge from a novel analysis, which appears to improve bounds across the board for the locally private histogram problem. We back our theoretical findings by an empirical comparison of existing algorithms in all privacy regimes, to assess their typical performance and behaviour beyond the worst-case setting.
View on arXiv