
Complexity of Minimizing Projected-Gradient-Dominated Functions with Stochastic First-order Oracles

Abstract

This work investigates the performance limits of projected stochastic first-order methods for minimizing functions that satisfy the $(\alpha,\tau,\mathcal{X})$-projected-gradient-dominance property, which asserts that the sub-optimality gap $F(\mathbf{x})-\min_{\mathbf{x}'\in\mathcal{X}}F(\mathbf{x}')$ is upper-bounded by $\tau\cdot\|\mathcal{G}_{\eta,\mathcal{X}}(\mathbf{x})\|^{\alpha}$ for some $\alpha\in[1,2)$ and $\tau>0$, where $\mathcal{G}_{\eta,\mathcal{X}}(\mathbf{x})$ is the projected-gradient mapping with parameter $\eta>0$. For non-convex functions, we show that the complexity lower bound for querying a batch smooth first-order stochastic oracle to obtain an $\epsilon$-global-optimum point is $\Omega(\epsilon^{-2/\alpha})$. Furthermore, we show that a projected variance-reduced first-order algorithm attains the matching upper complexity bound $\mathcal{O}(\epsilon^{-2/\alpha})$. For convex functions, we establish a complexity lower bound of $\Omega(\log(1/\epsilon)\cdot\epsilon^{-2/\alpha})$ for minimizing functions under a local version of the gradient-dominance property, which also matches the upper complexity bound of accelerated stochastic subgradient methods.
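For context, the projected-gradient mapping in the abstract is standardly defined through the Euclidean projection $\Pi_{\mathcal{X}}$ onto the feasible set. The display below follows this common convention and is an assumption about the paper's notation rather than a quotation from it:

\[
\mathcal{G}_{\eta,\mathcal{X}}(\mathbf{x}) \;=\; \frac{1}{\eta}\Bigl(\mathbf{x}-\Pi_{\mathcal{X}}\bigl(\mathbf{x}-\eta\nabla F(\mathbf{x})\bigr)\Bigr),
\qquad
\Pi_{\mathcal{X}}(\mathbf{y}) \;=\; \operatorname*{arg\,min}_{\mathbf{z}\in\mathcal{X}}\|\mathbf{z}-\mathbf{y}\|.
\]

When $\mathcal{X}$ is the whole space, $\mathcal{G}_{\eta,\mathcal{X}}(\mathbf{x})$ reduces to $\nabla F(\mathbf{x})$, so the property generalizes the classical unconstrained gradient-dominance (Łojasiewicz-type) condition.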
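The abstract does not spell out the variance-reduced method, so the following is only a minimal Python sketch of a generic projected SVRG-style loop, included to illustrate what "projected variance-reduced first-order" means. All names (projected_svrg, grad_i, proj, eta, epochs, inner) are hypothetical, and this is not claimed to be the paper's algorithm.

import numpy as np

def projected_svrg(grad_i, proj, x0, n, eta=0.01, epochs=50, inner=100, seed=0):
    # grad_i(x, i): stochastic gradient of component i at x (assumed oracle).
    # proj(y): Euclidean projection of y onto the feasible set X.
    rng = np.random.default_rng(seed)
    x_ref = np.asarray(x0, dtype=float)
    for _ in range(epochs):
        # Snapshot: full gradient at the reference point.
        full_grad = np.mean([grad_i(x_ref, i) for i in range(n)], axis=0)
        x = x_ref.copy()
        for _ in range(inner):
            i = int(rng.integers(n))
            # Variance-reduced estimate, unbiased for the full gradient at x.
            v = grad_i(x, i) - grad_i(x_ref, i) + full_grad
            # Projected step x <- Pi_X(x - eta * v); eta plays the role of the
            # step-size parameter in the projected-gradient mapping.
            x = proj(x - eta * v)
        x_ref = x
    return x_ref

# Example usage: least squares over the unit Euclidean ball.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(200, 5)), rng.normal(size=200)
grad_i = lambda x, i: 2.0 * A[i] * (A[i] @ x - b[i])
proj = lambda y: y / max(1.0, float(np.linalg.norm(y)))
x_hat = projected_svrg(grad_i, proj, np.zeros(5), n=200)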
