Differentially Private Kernel Density Estimation

Erzhi Liu
Jerry Yao-Chieh Hu
Alex Reneau
Zhao Song
Han Liu
Abstract

We introduce a refined differentially private (DP) data structure for kernel density estimation (KDE), offering not only an improved privacy-utility tradeoff but also better efficiency than prior results. Specifically, we study the following mathematical problem: given a similarity function $f$ (or DP KDE) and a private dataset $X \subset \mathbb{R}^d$, our goal is to preprocess $X$ so that for any query $y \in \mathbb{R}^d$, we approximate $\sum_{x \in X} f(x, y)$ in a differentially private fashion. The best previous algorithm for $f(x,y) = \| x - y \|_1$ is the node-contaminated balanced binary tree of [Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024]. Their algorithm requires $O(nd)$ space and time for preprocessing, where $n = |X|$. For any query point, the query time is $d \log n$, with an error guarantee of $(1+\alpha)$-approximation and additive error $\epsilon^{-1} \alpha^{-0.5} d^{1.5} R \log^{1.5} n$.

In this paper, we improve the best previous result [Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024] in three aspects:

- We reduce query time by a factor of $\alpha^{-1} \log n$.
- We improve the approximation ratio from $(1+\alpha)$ to 1.
- We reduce the error dependence by a factor of $\alpha^{-0.5}$.

From a technical perspective, our method of constructing the search tree differs from previous work [Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024]. In prior work, for each query, the answer is split into $\alpha^{-1} \log n$ numbers, each derived from the summation of $\log n$ values in interval-tree counts. In contrast, we construct the tree differently, splitting the answer into $\log n$ numbers, where each is a smart combination of two distance values, two counting values, and $y$ itself. We believe our tree structure may be of independent interest.
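To make the target quantity concrete, the following is a minimal, non-private sketch of evaluating $\sum_{x \in X} \| x - y \|_1$ exactly. It is not the paper's tree and provides no differential privacy; it only illustrates how, in each coordinate, the sum decomposes into distance sums, point counts, and the query value $y$. The function names (`preprocess`, `l1_sum`, and so on) are hypothetical and chosen for illustration.

```python
import bisect
from itertools import accumulate


def preprocess_1d(points):
    """Sort one coordinate and store prefix sums of the sorted values."""
    xs = sorted(points)
    prefix = [0.0] + list(accumulate(xs))  # prefix[k] = sum of the k smallest values
    return xs, prefix


def l1_sum_1d(index, y):
    """Return sum over x of |x - y| using counts and prefix sums (no privacy)."""
    xs, prefix = index
    n = len(xs)
    k = bisect.bisect_right(xs, y)       # count of points <= y
    left_sum = prefix[k]                 # sum of points <= y
    right_sum = prefix[n] - prefix[k]    # sum of points > y
    # Points to the left contribute y - x, points to the right contribute x - y.
    return (k * y - left_sum) + (right_sum - (n - k) * y)


def preprocess(X):
    """Build one 1-D index per coordinate; the l1 distance is a sum over coordinates."""
    d = len(X[0])
    return [preprocess_1d([x[j] for x in X]) for j in range(d)]


def l1_sum(index, y):
    """Exact sum over x in X of ||x - y||_1, using one binary search per coordinate."""
    return sum(l1_sum_1d(idx, yj) for idx, yj in zip(index, y))


def brute_force(X, y):
    """Reference implementation: direct O(nd) evaluation."""
    return sum(sum(abs(xj - yj) for xj, yj in zip(x, y)) for x in X)


X = [(0, 0), (1, 2), (3, 1)]
index = preprocess(X)
query = (1, 1)
print(l1_sum(index, query), brute_force(X, query))
```

A private data structure would have to release noisy versions of such counts and sums; how the noise is calibrated and combined across the tree is exactly what the paper addresses and is not attempted here.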

@article{liu2025_2409.01688,
  title={Differentially Private Kernel Density Estimation},
  author={Erzhi Liu and Jerry Yao-Chieh Hu and Alex Reneau and Zhao Song and Han Liu},
  journal={arXiv preprint arXiv:2409.01688},
  year={2025}
}