
Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization

Main: 16 pages
4 figures
Bibliography: 7 pages
Appendix: 21 pages
Abstract

We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems of the form $\min_{\textbf{x}} \max_{\textbf{y} \in Y} f(\textbf{x}, \textbf{y})$, where the objective function $f(\textbf{x}, \textbf{y})$ is nonconvex in $\textbf{x}$ and concave in $\textbf{y}$, and the constraint set $Y \subseteq \mathbb{R}^n$ is convex and bounded. In the convex-concave setting, single-timescale GDA achieves strong convergence guarantees and has been used to solve application problems arising in operations research and computer science. However, it can fail to converge in more general settings. Our contribution in this paper is to design simple deterministic and stochastic TTGDA algorithms that efficiently find a stationary point of the function $\Phi(\cdot) := \max_{\textbf{y} \in Y} f(\cdot, \textbf{y})$. Specifically, we prove theoretical bounds on the complexity of solving both smooth and nonsmooth nonconvex-concave minimax optimization problems. To our knowledge, this is the first systematic analysis of TTGDA for nonconvex minimax optimization, shedding light on its superior performance in training generative adversarial networks (GANs) and in solving other real-world application problems.
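
The abstract describes TTGDA as gradient descent on the nonconvex variable x paired with projected gradient ascent on the concave variable y, run with a much larger stepsize on y. The following is a minimal sketch of that update rule, not the paper's exact algorithm or stepsize schedule; the function names (`ttgda`, `project_Y`) and the toy objective in the usage example are hypothetical choices for illustration.

```python
import numpy as np

def ttgda(grad_x, grad_y, project_Y, x0, y0, eta_x, eta_y, num_iters):
    """Sketch of two-timescale gradient descent ascent.

    grad_x, grad_y : callables returning the partial gradients of f(x, y)
    project_Y      : Euclidean projection onto the convex, bounded set Y
    eta_x, eta_y   : stepsizes; the two-timescale regime uses eta_x << eta_y
    """
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    for _ in range(num_iters):
        gx = grad_x(x, y)                    # gradient w.r.t. the nonconvex variable x
        gy = grad_y(x, y)                    # gradient w.r.t. the concave variable y
        x = x - eta_x * gx                   # descent step on x
        y = project_Y(y + eta_y * gy)        # projected ascent step keeps y in Y
    return x, y

# Toy usage (hypothetical): f(x, y) = x . y - 0.5 * ||y||^2 with Y the unit ball,
# so grad_x f = y and grad_y f = x - y.
proj_unit_ball = lambda y: y / max(1.0, np.linalg.norm(y))
x, y = ttgda(lambda x, y: y,
             lambda x, y: x - y,
             proj_unit_ball,
             x0=np.ones(3), y0=np.zeros(3),
             eta_x=1e-3, eta_y=1e-1, num_iters=5000)
```

The key design choice reflected here is the stepsize separation eta_x << eta_y, which lets the ascent iterate track the best response in y while x moves slowly, in contrast to single-timescale GDA, which uses equal stepsizes and can fail to converge outside the convex-concave setting.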
