
Reveal-or-Obscure: A Differentially Private Sampling Algorithm for Discrete Distributions

Abstract

We introduce a differentially private (DP) algorithm called reveal-or-obscure (ROO) to generate a single representative sample from a dataset of n observations drawn i.i.d. from an unknown discrete distribution P. Unlike methods that add explicit noise to the estimated empirical distribution, ROO achieves ϵ-differential privacy by randomly choosing whether to "reveal" or "obscure" the empirical distribution. While ROO is structurally identical to Algorithm 1 proposed by Cheu and Nayak (arXiv:2412.10512), we prove a strictly better bound on the sampling complexity than that established in Theorem 12 of arXiv:2412.10512. To further improve the privacy-utility trade-off, we propose a novel generalized sampling algorithm called Data-Specific ROO (DS-ROO), where the probability of obscuring the empirical distribution of the dataset is chosen adaptively. We prove that DS-ROO satisfies ϵ-DP, and provide empirical evidence that DS-ROO can achieve better utility than vanilla ROO under the same privacy budget.
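
The abstract does not specify the obscuring distribution or the probability with which ROO obscures, so the following is only an illustrative sketch of the reveal-or-obscure idea, not the paper's algorithm. It assumes a known finite domain of size k, takes "obscure" to mean drawing uniformly from that domain, and uses a mixing probability q = k / (k + n(e^ϵ - 1)), which is derived here for this simplified mixture under the replace-one-record neighboring relation. The names `obscure_probability`, `roo_sample`, and `output_distribution` are hypothetical.

```python
import math
import random
from collections import Counter


def obscure_probability(n, k, epsilon):
    """Mixing weight q for the uniform-mixture sketch.

    Assumption (not from the paper): q = k / (k + n * (e^eps - 1)) is the
    smallest weight for which the worst-case output ratio between two
    datasets differing in one record is at most e^eps.
    """
    return k / (k + n * (math.exp(epsilon) - 1.0))


def roo_sample(data, domain, epsilon, rng=random):
    """Return one sample: obscure with probability q, otherwise reveal."""
    q = obscure_probability(len(data), len(domain), epsilon)
    if rng.random() < q:
        # Obscure: ignore the data and draw uniformly from the domain.
        return rng.choice(domain)
    # Reveal: a uniformly chosen record is a draw from the empirical distribution.
    return rng.choice(data)


def output_distribution(data, domain, epsilon):
    """Exact output probabilities of roo_sample, useful for checking privacy."""
    n, k = len(data), len(domain)
    q = obscure_probability(n, k, epsilon)
    counts = Counter(data)
    return {x: (1.0 - q) * counts[x] / n + q / k for x in domain}
```

In this sketch, a symbol absent from the dataset is the worst case: adding one record raises its output probability from q/k to (1 - q)/n + q/k, and the choice of q above makes that ratio equal to e^ϵ. A data-dependent choice of q, as in DS-ROO, is outside what this sketch covers.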

@article{tasnim2025_2504.14696,
  title={Reveal-or-Obscure: A Differentially Private Sampling Algorithm for Discrete Distributions},
  author={Naima Tasnim and Atefeh Gilani and Lalitha Sankar and Oliver Kosut},
  journal={arXiv preprint arXiv:2504.14696},
  year={2025}
}