Privately Learning Mixtures of Axis-Aligned Gaussians

Neural Information Processing Systems (NeurIPS), 2021
Abstract

We consider the problem of learning mixtures of Gaussians under the constraint of approximate differential privacy. We prove that $\widetilde{O}(k^2 d \log^{3/2}(1/\delta) / \alpha^2 \varepsilon)$ samples are sufficient to learn a mixture of $k$ axis-aligned Gaussians in $\mathbb{R}^d$ to within total variation distance $\alpha$ while satisfying $(\varepsilon, \delta)$-differential privacy. This is the first result for privately learning mixtures of unbounded axis-aligned (or even unbounded univariate) Gaussians. If the covariance matrices of the Gaussians are all the identity matrix, we show that $\widetilde{O}(kd/\alpha^2 + kd \log(1/\delta) / \alpha \varepsilon)$ samples are sufficient. Recently, the "local covering" technique of Bun, Kamath, Steinke, and Wu was successfully used for privately learning high-dimensional Gaussians with a known covariance matrix, and it was extended by Aden-Ali, Ashtiani, and Kamath to privately learning general high-dimensional Gaussians. Given these positive results, this approach has been proposed as a promising direction for privately learning mixtures of Gaussians. Unfortunately, we show that this is not possible. Instead, we design a new technique for privately learning mixture distributions. A class of distributions $\mathcal{F}$ is said to be list-decodable if there is an algorithm that, given "heavily corrupted" samples from $f \in \mathcal{F}$, outputs a list of distributions $\widehat{\mathcal{F}}$ such that one of the distributions in $\widehat{\mathcal{F}}$ approximates $f$. We show that if $\mathcal{F}$ is privately list-decodable, then we can privately learn mixtures of distributions in $\mathcal{F}$. Finally, we show that axis-aligned Gaussian distributions are privately list-decodable, thereby proving that mixtures of such distributions are privately learnable.
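To make the list-decoding notion concrete, here is a minimal sketch for the simplest case: one-dimensional unit-variance Gaussians, where only an $\alpha$-fraction of the samples come from the true $N(\mu, 1)$ and the rest are arbitrary. This is not the paper's algorithm (in particular, it is not private); the histogram rule and the $\alpha n / 4$ threshold are illustrative assumptions chosen so that the bin containing the good samples survives while isolated noise does not.

```python
import random


def list_decode_means(samples, alpha, bin_width=1.0):
    """Given samples of which only an alpha-fraction come from N(mu, 1),
    return a short list of candidate means, one of which should be near mu.

    Candidates are the centers of histogram bins that hold at least an
    alpha/4 fraction of the data: the true Gaussian concentrates enough
    mass in its bin to pass this threshold, while diffuse corruption
    spread over many bins does not.
    """
    counts = {}
    for x in samples:
        b = int(x // bin_width)
        counts[b] = counts.get(b, 0) + 1
    threshold = (alpha / 4) * len(samples)
    return [(b + 0.5) * bin_width for b, c in counts.items() if c >= threshold]


random.seed(0)
n = 10_000
alpha = 0.3
# 30% "good" samples from N(5, 1); 70% adversarial noise spread over [-50, 50].
samples = [random.gauss(5.0, 1.0) if random.random() < alpha
           else random.uniform(-50.0, 50.0) for _ in range(n)]
candidates = list_decode_means(samples, alpha)
```

The list is short (a handful of bin centers) and is only guaranteed to *contain* a good candidate, not to identify it; that weaker guarantee is what makes list-decoding achievable under heavy corruption, and the paper's contribution is showing how a private version of such a routine yields private learning of mixtures.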
