AdaGDA: Faster Adaptive Gradient Descent Ascent Methods for Minimax Optimization
In the paper, we propose a class of faster adaptive Gradient Descent Ascent (GDA) methods for solving nonconvex-strongly-concave minimax problems based on unified adaptive matrices, which include almost all existing coordinate-wise and global adaptive learning rates. Specifically, we propose a fast Adaptive Gradient Descent Ascent (AdaGDA) method based on the basic momentum technique, which reaches a lower sample complexity of $\tilde{O}(\kappa^4\epsilon^{-4})$ for finding an $\epsilon$-stationary point without large batches, improving on the existing adaptive GDA methods by a factor of $O(\sqrt{\kappa})$, where $\kappa$ denotes the condition number. At the same time, we present an accelerated version of AdaGDA (VR-AdaGDA) based on the momentum-based variance-reduction technique, which achieves a lower sample complexity of $\tilde{O}(\kappa^{4.5}\epsilon^{-3})$ for finding an $\epsilon$-stationary point without large batches, improving on the existing adaptive GDA methods by a factor of $O(\epsilon^{-1})$. Moreover, we prove that our VR-AdaGDA method reaches the best known sample complexity of $\tilde{O}(\kappa^3\epsilon^{-3})$ with mini-batch size $O(\kappa^3)$. In particular, we provide an effective convergence analysis framework for our adaptive GDA methods. Experimental results on fair classifier and policy evaluation tasks demonstrate the efficiency of our algorithms.
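The abstract does not spell out the update rules, so the following is only a minimal sketch, under our own assumptions, of how a momentum-based stochastic GDA step with a global adaptive matrix (and a STORM-style variance-reduced variant, in the spirit of VR-AdaGDA) might look. The function names, the step sizes `gamma` and `lam`, the momentum weight `alpha`, the adaptive-matrix choice, and the toy problem are all illustrative placeholders, not the paper's notation or its exact algorithms.

```python
import numpy as np

def adagda_sketch(grad_x, grad_y, x0, y0, sample, T=1000,
                  gamma=0.01, lam=0.05, alpha=0.1, rho=1e-3,
                  variance_reduced=False):
    """Momentum-based stochastic GDA with a global adaptive step size.

    A rough sketch of the ideas in the abstract, not the paper's exact
    AdaGDA/VR-AdaGDA algorithms; all parameters are placeholders.
    """
    x = np.asarray(x0, dtype=float).copy()
    y = np.asarray(y0, dtype=float).copy()
    xi = sample()
    v, w = grad_x(x, y, xi), grad_y(x, y, xi)  # gradient estimators
    for _ in range(T):
        # One member of the "unified adaptive matrices" family: a global
        # scalar a_t built from the current estimator, so A_t = a_t * I.
        # Coordinate-wise (Adam/AdaGrad-style) choices fit the same slot.
        a = np.sqrt(np.linalg.norm(v)) + rho
        x_old, y_old = x.copy(), y.copy()
        x = x - gamma * v / a          # adaptive descent step on x
        y = y + lam * w                # ascent step on y (concave side)
        xi = sample()                  # single fresh sample, no large batch
        gx, gy = grad_x(x, y, xi), grad_y(x, y, xi)
        if variance_reduced:
            # STORM-style momentum-based variance reduction (VR-AdaGDA
            # spirit): reuse the same sample at old and new iterates.
            v = gx + (1 - alpha) * (v - grad_x(x_old, y_old, xi))
            w = gy + (1 - alpha) * (w - grad_y(x_old, y_old, xi))
        else:
            # Basic momentum (moving-average) estimator (AdaGDA spirit).
            v = (1 - alpha) * v + alpha * gx
            w = (1 - alpha) * w + alpha * gy
    return x, y

if __name__ == "__main__":
    # Toy problem, strongly concave in y: f(x, y) = 0.5*x^2 + x*y - 0.5*y^2,
    # with additive Gaussian noise playing the role of stochastic gradients.
    rng = np.random.default_rng(0)
    sample = lambda: rng.normal(0.0, 0.1, size=2)
    gx = lambda x, y, xi: x + y + xi[0]   # noisy df/dx
    gy = lambda x, y, xi: x - y + xi[1]   # noisy df/dy
    x, y = adagda_sketch(gx, gy, np.array([2.0]), np.array([-1.0]), sample)
    print("approximate stationary point:", x, y)
```

Note that the single-sample loop above matches the "without large batches" regime described in the abstract; the best-known $\tilde{O}(\kappa^3\epsilon^{-3})$ complexity would instead draw a mini-batch of size $O(\kappa^3)$ at each step.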