Optimized Tradeoffs for Private Prediction with Majority Ensembling

27 November 2024

Shuli Jiang


Main:13 Pages

10 Figures

Bibliography:2 Pages

10 Tables

Appendix:42 Pages

Abstract

We study a classical problem in private prediction: computing an $(m\epsilon, \delta)$-differentially private majority of $K$ $(\epsilon, \Delta)$-differentially private algorithms, for $1 \leq m \leq K$ and $1 > \delta \geq \Delta \geq 0$. Standard methods such as subsampling and randomized response are widely used, but do they provide optimal privacy-utility tradeoffs? To answer this question, we introduce the Data-dependent Randomized Response Majority (DaRRM) algorithm. DaRRM is parameterized by a data-dependent noise function $\gamma$ and enables efficient utility optimization over the class of all private algorithms, which includes those standard methods. We show that maximizing the utility of an $(m\epsilon, \delta)$-private majority algorithm can be solved tractably as an optimization problem for any $m \leq K$, using a novel structural result that reduces the infinitely many privacy constraints to a polynomially sized set. In some settings, we show that DaRRM provably enjoys a privacy gain of a factor of 2 over common baselines at fixed utility. Lastly, we demonstrate the strong empirical effectiveness of our first-of-its-kind privacy-constrained utility optimization for ensembling labels from private teachers for private prediction in image classification. Notably, our DaRRM framework with an optimized $\gamma$ exhibits substantial utility gains compared with several baselines.
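The abstract does not spell out the DaRRM procedure, so the following is only a minimal sketch of one plausible reading of a "data-dependent randomized response majority": with a probability given by $\gamma$ evaluated on the vote count, report the true majority of the $K$ binary outputs, and otherwise report a fair coin flip. The function name `darrm_sketch`, its signature, and the tie-breaking rule are assumptions made for illustration. The sketch also does not certify any privacy guarantee, which would depend on choosing a $\gamma$ that satisfies the privacy constraints.

```python
import random
from typing import Callable, Sequence


def darrm_sketch(
    votes: Sequence[int],
    gamma: Callable[[int], float],
    rng: random.Random,
) -> int:
    """Hypothetical data-dependent randomized response majority.

    votes: binary outputs (0 or 1) of the K private algorithms.
    gamma: assumed noise function mapping the count of 1-votes to the
           probability of reporting the true majority.
    rng:   source of randomness, passed in so runs can be seeded.
    """
    k = len(votes)
    ones = sum(votes)  # the data-dependent statistic fed to gamma

    p_majority = gamma(ones)
    if not 0.0 <= p_majority <= 1.0:
        raise ValueError("gamma must return a probability in [0, 1]")

    if rng.random() < p_majority:
        # Report the true majority; ties go to 1 (an assumption).
        return 1 if 2 * ones >= k else 0
    # Otherwise report a uniformly random bit.
    return rng.randrange(2)
```

Under this reading, a constant `gamma` gives ordinary randomized response on the majority bit, while a `gamma` that varies with the vote count lets the amount of noise depend on how decisive the vote is.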
