Finding Second-Order Stationary Points in Nonconvex-Strongly-Concave Minimax Optimization

Abstract

We study the smooth minimax optimization problem $\min_{\bf x}\max_{\bf y} f({\bf x},{\bf y})$, where $f$ is $\ell$-smooth, strongly concave in ${\bf y}$ but possibly nonconvex in ${\bf x}$. Most existing works focus on finding first-order stationary points of $f({\bf x},{\bf y})$ or of its primal function $P({\bf x})\triangleq \max_{\bf y} f({\bf x},{\bf y})$; few address second-order stationary points. In this paper, we propose a novel approach for minimax optimization, called Minimax Cubic Newton (MCN), which finds an $\big(\varepsilon,\kappa^{1.5}\sqrt{\rho\varepsilon}\,\big)$-second-order stationary point of $P({\bf x})$ with ${\mathcal O}\big(\kappa^{1.5}\sqrt{\rho}\,\varepsilon^{-1.5}\big)$ calls to a second-order oracle and $\tilde{\mathcal O}\big(\kappa^{2}\sqrt{\rho}\,\varepsilon^{-1.5}\big)$ calls to a first-order oracle, where $\kappa$ is the condition number and $\rho$ is the Lipschitz constant of the Hessian of $f({\bf x},{\bf y})$. In addition, we propose an inexact variant of MCN for high-dimensional problems that avoids calling the expensive second-order oracle; instead, it solves the cubic sub-problem inexactly via gradient descent and matrix Chebyshev expansion. This strategy still attains the desired approximate second-order stationary point with high probability, but requires only $\tilde{\mathcal O}\big(\kappa^{1.5}\ell\varepsilon^{-2}\big)$ Hessian-vector oracle calls and $\tilde{\mathcal O}\big(\kappa^{2}\sqrt{\rho}\,\varepsilon^{-1.5}\big)$ first-order oracle calls. To the best of our knowledge, this is the first work that considers the non-asymptotic convergence behavior of finding second-order stationary points for minimax problems without convex-concave assumptions.
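The abstract describes MCN only at a high level, so the sketch below illustrates the two-stage structure it outlines: inner gradient ascent on the strongly concave variable ${\bf y}$, followed by a cubic-regularized Newton step on ${\bf x}$ using the Schur complement of the Hessian of $f$ as a surrogate for $\nabla^2 P({\bf x})$. (Here an $(\varepsilon,\delta)$-second-order stationary point of $P$ means $\|\nabla P({\bf x})\|\le\varepsilon$ and $\lambda_{\min}(\nabla^2 P({\bf x}))\ge-\delta$.) Everything in the sketch is an illustrative assumption rather than the authors' implementation: the toy objective, the names (`mcn`, `grad_x`, etc.), the step sizes, and the use of a generic solver for the cubic sub-problem in place of the paper's gradient-descent/Chebyshev subroutine.

```python
import numpy as np
from scipy.optimize import minimize

# Toy nonconvex-strongly-concave instance (an assumed example, not from the paper):
#   f(x, y) = sum(cos(x)) + x^T A y - (mu/2) ||y||^2
# sum(cos(x)) makes f nonconvex in x; -(mu/2)||y||^2 makes it strongly concave in y.
rng = np.random.default_rng(0)
d, m, mu = 3, 2, 1.0
A = rng.standard_normal((d, m))

grad_x = lambda x, y: -np.sin(x) + A @ y        # grad_x f
grad_y = lambda x, y: A.T @ x - mu * y          # grad_y f
hess_xx = lambda x: np.diag(-np.cos(x))         # Hessian block in x
H_xy, H_yy = A, -mu * np.eye(m)                 # cross and y-blocks (constant here)

def mcn(x, y, M=5.0, eta=0.5, T_inner=50, T_outer=20):
    """Minimal sketch of the Minimax Cubic Newton (MCN) outer loop."""
    for _ in range(T_outer):
        # Stage 1: gradient ascent drives y toward y*(x) = argmax_y f(x, y);
        # by Danskin's theorem, grad_x f(x, y*(x)) equals the primal gradient.
        for _ in range(T_inner):
            y = y + eta * grad_y(x, y)
        g = grad_x(x, y)
        # Stage 2: Schur complement of the Hessian of f at (x, y) serves as a
        # surrogate for the primal Hessian of P(x) = max_y f(x, y).
        H = hess_xx(x) - H_xy @ np.linalg.solve(H_yy, H_xy.T)
        # Stage 3: cubic-regularized Newton step on x,
        #   s = argmin_s  g^T s + (1/2) s^T H s + (M/6) ||s||^3;
        # a generic solver stands in for the paper's inexact subroutine.
        model = lambda s: g @ s + 0.5 * s @ H @ s + (M / 6) * np.linalg.norm(s) ** 3
        x = x + minimize(model, np.zeros(d)).x
    for _ in range(T_inner):                    # refresh y so grad_x(x, y) tracks P
        y = y + eta * grad_y(x, y)
    return x, y

x, y = mcn(rng.standard_normal(d), np.zeros(m))
print("approx primal gradient norm:", np.linalg.norm(grad_x(x, y)))
```

The inner ascent loop runs before each Newton step because the surrogate gradient and Hessian only approximate those of $P$ once ${\bf y}$ is close to $\arg\max_{\bf y} f({\bf x},{\bf y})$; the cubic term with weight $M$ keeps each step trustworthy where the quadratic model degrades, which is what enables escape from saddle points of $P$.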
