
DIFF2: Differential Private Optimization via Gradient Differences for Nonconvex Distributed Learning

International Conference on Machine Learning (ICML), 2023
Abstract

Differentially private optimization for nonconvex smooth objectives is considered. In previous work, the best known utility bound is $\widetilde O(\sqrt{d}/(n\varepsilon_\mathrm{DP}))$ in terms of the squared full gradient norm, which is achieved by, for instance, Differential Private Gradient Descent (DP-GD), where $n$ is the sample size, $d$ is the problem dimensionality, and $\varepsilon_\mathrm{DP}$ is the differential privacy parameter. To improve on this bound, we propose a new differentially private optimization framework called \emph{DIFF2 (DIFFerential private optimization via gradient DIFFerences)} that constructs a differentially private global gradient estimator with possibly quite small variance based on communicated \emph{gradient differences} rather than gradients themselves. It is shown that DIFF2 with a gradient descent subroutine achieves a utility of $\widetilde O(d^{2/3}/(n\varepsilon_\mathrm{DP})^{4/3})$, which can be significantly better than the previous bound in terms of the dependence on the sample size $n$. To the best of our knowledge, this is the first fundamental result to improve the standard utility $\widetilde O(\sqrt{d}/(n\varepsilon_\mathrm{DP}))$ for nonconvex objectives. Additionally, a more computation- and communication-efficient subroutine is combined with DIFF2, and its theoretical analysis is also given. Numerical experiments are conducted to validate the superiority of the DIFF2 framework.
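To make the comparison of the two rates concrete, the following minimal sketch evaluates both expressions with logarithmic factors and constants dropped. That simplification is an assumption of the sketch, not something the abstract states. The function names `dp_gd_rate`, `diff2_rate`, and `rate_ratio` are hypothetical and chosen only for illustration. Under this simplification, the ratio of the new rate to the previous one equals $(\sqrt{d}/(n\varepsilon_\mathrm{DP}))^{1/3}$, so the new rate is smaller exactly when $n\varepsilon_\mathrm{DP} > \sqrt{d}$.

```python
import math


def dp_gd_rate(n: int, d: int, eps: float) -> float:
    """Previous rate sqrt(d) / (n * eps), with log factors and constants dropped."""
    return math.sqrt(d) / (n * eps)


def diff2_rate(n: int, d: int, eps: float) -> float:
    """DIFF2 rate d^(2/3) / (n * eps)^(4/3), with log factors and constants dropped."""
    return d ** (2.0 / 3.0) / (n * eps) ** (4.0 / 3.0)


def rate_ratio(n: int, d: int, eps: float) -> float:
    """DIFF2 rate divided by the previous rate; values below 1 favor DIFF2."""
    return diff2_rate(n, d, eps) / dp_gd_rate(n, d, eps)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper's experiments.
    d, eps = 10_000, 1.0
    for n in (10, 100, 1_000, 100_000):
        print(
            f"n={n:>7}  previous={dp_gd_rate(n, d, eps):.3e}  "
            f"diff2={diff2_rate(n, d, eps):.3e}  ratio={rate_ratio(n, d, eps):.3f}"
        )
```

In this simplified picture the two rates coincide at $n\varepsilon_\mathrm{DP} = \sqrt{d}$, and the gap then widens as $n$ grows, which is the sense in which the dependence on the sample size is better.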
