v1v2 (latest)

A Statistical Viewpoint on Differential Privacy: Hypothesis Testing, Representation and Blackwell's Theorem

Annual Review of Statistics and Its Application (ARSIA), 2024

14 September 2024

Weijie J. Su

ArXiv (abs)PDF HTML

Main:16 Pages

3 Figures

Bibliography:5 Pages

3 Tables

Abstract

Differential privacy is widely considered the formal privacy for privacy-preserving data analysis due to its robust and rigorous guarantees, with increasingly broad adoption in public services, academia, and industry. Despite originating in the cryptographic context, in this review paper we argue that, fundamentally, differential privacy can be considered a \textit{pure} statistical concept. By leveraging David Blackwell's informativeness theorem, our focus is to demonstrate based on prior work that all definitions of differential privacy can be formally motivated from a hypothesis testing perspective, thereby showing that hypothesis testing is not merely convenient but also the right language for reasoning about differential privacy. This insight leads to the definition of $f$ -differential privacy, which extends other differential privacy definitions through a representation theorem. We review techniques that render $f$ -differential privacy a unified framework for analyzing privacy bounds in data analysis and machine learning. Applications of this differential privacy definition to private deep learning, private convex optimization, shuffled mechanisms, and U.S.\ Census data are discussed to highlight the benefits of analyzing privacy bounds under this framework compared to existing alternatives.

View on arXiv

Comments on this paper