
Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games

International Conference on Learning Representations (ICLR), 2023
Main: 10 pages
11 figures
Bibliography: 3 pages
Appendix: 15 pages
Abstract

Algorithms based on regret matching, specifically regret matching+ (RM+) and its variants, are the most popular approaches for solving large-scale two-player zero-sum games in practice. Unlike algorithms such as optimistic gradient descent ascent, which have strong last-iterate and ergodic convergence properties for zero-sum games, virtually nothing is known about the last-iterate properties of regret-matching algorithms. Given the importance of last-iterate convergence for numerical optimization and its relevance for modeling real-world learning in games, in this paper we study the last-iterate convergence properties of various popular variants of RM+. First, we show numerically that several practical variants, such as simultaneous RM+, alternating RM+, and simultaneous predictive RM+, all lack last-iterate convergence guarantees even on a simple 3×3 game. We then prove that recent variants of these algorithms based on a smoothing technique do enjoy last-iterate convergence: we prove that Extragradient RM+ and Smooth Predictive RM+ enjoy asymptotic last-iterate convergence (without a rate) and 1/√t best-iterate convergence. Finally, we introduce restarted variants of these algorithms and show that they enjoy linear-rate last-iterate convergence.
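To make the setting concrete, the following is a minimal sketch of simultaneous RM+ self-play on a zero-sum matrix game. The specific 3×3 counterexample game is not given in the abstract, so a weighted rock-paper-scissors payoff matrix is used here purely as an illustrative stand-in; it shows the standard contrast the abstract alludes to, namely that RM+ guarantees apply to the averaged strategies, while the last iterates carry no such guarantee.

```python
import numpy as np

def rm_plus(A, iters=10000):
    """Simultaneous RM+ self-play on a two-player zero-sum matrix game.

    The row player maximizes x^T A y; the column player minimizes it.
    Returns the last iterates and the uniform averages of the strategies.
    """
    m, n = A.shape
    Qx = np.zeros(m)  # clipped cumulative regrets, row player
    Qy = np.zeros(n)  # clipped cumulative regrets, column player
    x_sum, y_sum = np.zeros(m), np.zeros(n)
    for _ in range(iters):
        # Regret matching: play regrets normalized to a distribution,
        # or uniform when all clipped regrets are zero.
        x = Qx / Qx.sum() if Qx.sum() > 0 else np.full(m, 1.0 / m)
        y = Qy / Qy.sum() if Qy.sum() > 0 else np.full(n, 1.0 / n)
        u = A @ y          # row player's payoff per action
        v = -(A.T @ x)     # column player's payoff per action (minimizer)
        # RM+ update: accumulate instantaneous regrets, clip at zero.
        Qx = np.maximum(Qx + (u - x @ u), 0.0)
        Qy = np.maximum(Qy + (v - y @ v), 0.0)
        x_sum += x
        y_sum += y
    return x, y, x_sum / iters, y_sum / iters

def exploitability(A, x, y):
    """Sum of both players' best-response gains; zero exactly at equilibrium."""
    return (A @ y).max() - (A.T @ x).min()

# Weighted rock-paper-scissors: an illustrative 3x3 zero-sum game
# (not the paper's counterexample, which the abstract does not specify).
A = np.array([[0.0, -1.0, 2.0],
              [1.0, 0.0, -1.0],
              [-2.0, 1.0, 0.0]])
x_last, y_last, x_avg, y_avg = rm_plus(A)
print("average-iterate exploitability:", exploitability(A, x_avg, y_avg))
print("last-iterate exploitability:   ", exploitability(A, x_last, y_last))
```

The average iterates drive exploitability toward zero at the usual regret-matching rate, whereas the final strategy pair at any given step need not be close to equilibrium, which is the gap the paper's smoothed and restarted variants address.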
