
Second-Order Min-Max Optimization with Lazy Hessians

International Conference on Learning Representations (ICLR), 2024
Main: 9 pages · 2 figures · 1 table · Bibliography: 5 pages · Appendix: 9 pages
Abstract

This paper studies second-order methods for convex-concave minimax optimization. Monteiro and Svaiter (2012) proposed a method that solves the problem with an optimal iteration complexity of $\mathcal{O}(\epsilon^{-2/3})$ to find an $\epsilon$-saddle point. However, it is unclear whether its computational complexity, $\mathcal{O}((N + d^2) d \epsilon^{-2/3})$, can be improved. Here, following Doikov et al. (2023), we assume that a call to the first-order oracle costs $N$ and a call to the second-order oracle costs $dN$. In this paper, we show that the computational cost can be reduced by reusing the Hessian across iterations. Our methods achieve an overall computational complexity of $\tilde{\mathcal{O}}((N + d^2)(d + d^{2/3}\epsilon^{-2/3}))$, improving on previous methods by a factor of $d^{1/3}$. Furthermore, we generalize our method to strongly-convex-strongly-concave minimax problems and establish a complexity of $\tilde{\mathcal{O}}((N + d^2)(d + d^{2/3}\kappa^{2/3}))$ when the condition number of the problem is $\kappa$, enjoying a similar speedup over the state-of-the-art method. Numerical experiments on both real and synthetic datasets also verify the efficiency of our method.
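To make the "lazy Hessian" idea concrete, here is a minimal sketch, not the paper's algorithm: it applies Newton-type steps to the saddle-point optimality system $F(z) = 0$ of an illustrative convex-concave problem, but refreshes (and refactorizes) the Jacobian only once every $m$ iterations, so most steps pay only the first-order oracle cost. The problem, the operator `F`, and the refresh period `m=5` are all assumptions made for illustration.

```python
import numpy as np

# Bilinear saddle part of the toy objective f(x, y) = x^2/2 + x*y - y^2/2,
# written as the monotone operator matrix for z = (x, y).
M = np.array([[1.0, 1.0],
              [-1.0, 1.0]])

def F(z):
    # First-order oracle: gradient operator (grad_x f, -grad_y f)
    # of a mildly nonlinear convex-concave problem (tanh term is illustrative).
    return M @ z + 0.1 * np.tanh(z)

def J(z):
    # Second-order oracle: Jacobian of F (the expensive object to reuse).
    return M + 0.1 * np.diag(1.0 - np.tanh(z) ** 2)

def lazy_newton(z0, m=5, iters=60):
    z = z0.astype(float)
    hess_evals = 0
    J_snap = J(z)  # initial snapshot of the Jacobian
    for k in range(iters):
        if k % m == 0 and k > 0:   # lazy: refresh the Jacobian every m steps
            J_snap = J(z)
        if k % m == 0:
            hess_evals += 1
        z = z - np.linalg.solve(J_snap, F(z))  # Newton-type step, stale Jacobian
        if np.linalg.norm(F(z)) < 1e-10:
            break
    return z, hess_evals

z, hess_evals = lazy_newton(np.array([2.0, -1.5]), m=5)
```

With the Jacobian refreshed every 5 steps, the iteration still drives the residual $\|F(z)\|$ to zero while calling the second-order oracle far fewer times than a full Newton method would; this cost trade-off is the intuition behind the $d^{1/3}$ saving analyzed in the paper.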
