
ReSQueing Parallel and Private Stochastic Convex Optimization

Abstract

We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJJLST20, ACJJS21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For a SCO objective constrained to the unit ball in $\mathbb{R}^d$, we obtain the following results (up to polylogarithmic factors). We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3}\epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}\epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator. For $\epsilon_{\text{opt}} \in [d^{-1}, d^{-1/4}]$, our algorithm matches the state-of-the-art oracle depth of [BJLLS19] while maintaining the optimal total work of stochastic gradient descent. Given $n$ samples of Lipschitz loss functions, prior works [BFTT19, BFGT20, AFKT21, KLL21] established that if $n \gtrsim d \epsilon_{\text{dp}}^{-2}$, $(\epsilon_{\text{dp}}, \delta)$-differential privacy is attained at no asymptotic cost to the SCO utility. However, these prior works all required a superlinear number of gradient queries. We close this gap for sufficiently large $n \gtrsim d^2 \epsilon_{\text{dp}}^{-3}$, by using ReSQue to design an algorithm with near-linear gradient query complexity in this regime.
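To make the reweighting idea concrete, below is a minimal Python sketch of an importance-weighted estimator for the gradient of a Gaussian-smoothed objective: samples are drawn around a fixed reference point and reweighted by a ratio of Gaussian densities so the same samples can serve many query points near that reference. This is a hedged illustration of the general reweighting principle, not the paper's exact ReSQue construction; the names `resque_gradient_estimate`, `grad_oracle`, and the smoothing radius `rho` are assumptions introduced here for illustration.

```python
import numpy as np

def resque_gradient_estimate(grad_oracle, x, x_bar, rho, num_samples=1, rng=None):
    """Estimate the gradient of the Gaussian-smoothed objective
    f_rho(x) = E_{xi ~ N(0, rho^2 I)}[f(x + xi)].

    Samples z are drawn from N(x_bar, rho^2 I) around a reference point x_bar
    and reweighted by the density ratio N(z; x, rho^2 I) / N(z; x_bar, rho^2 I),
    which makes the estimate unbiased for the smoothed gradient at x while
    reusing queries made around x_bar.  `grad_oracle(z)` is an assumed
    stochastic gradient oracle with E[grad_oracle(z)] = grad f(z).
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    est = np.zeros(d)
    for _ in range(num_samples):
        z = x_bar + rho * rng.standard_normal(d)  # z ~ N(x_bar, rho^2 I)
        # log of the Gaussian density ratio centered at x versus x_bar
        log_w = (np.dot(z - x_bar, z - x_bar) - np.dot(z - x, z - x)) / (2 * rho**2)
        est += np.exp(log_w) * grad_oracle(z)
    return est / num_samples
```

Roughly speaking, sharing the samples drawn around `x_bar` across all query points in a small ball is what lets many reweighted gradient estimates be formed from the same underlying queries, which is the kind of reuse the abstract pairs with ball oracle acceleration; the exact parallel and private algorithms are developed in the paper itself.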
