Last-Iterate Convergence Properties of Regret-Matching Algorithms in
Games
Regret matching (RM) and its variants are the most popular approaches for solving large-scale two-player zero-sum games in practice. Unlike algorithms such as optimistic gradient descent-ascent, which enjoy strong last-iterate and ergodic convergence properties in zero-sum games, virtually nothing is known about the last-iterate properties of regret-matching algorithms. Given the importance of last-iterate convergence, both for numerical optimization and for modeling real-world learning in games, we study in this paper the last-iterate convergence properties of several popular variants of RM. First, we show numerically that practical variants such as simultaneous RM, alternating RM, and simultaneous predictive RM all lack last-iterate convergence guarantees, even on a simple game. We then prove that recent variants of these algorithms based on a smoothing technique do enjoy last-iterate convergence: extragradient RM and smooth predictive RM achieve asymptotic last-iterate convergence (without a rate) as well as best-iterate convergence. Finally, we introduce restarted variants of these algorithms and show that they enjoy linear-rate last-iterate convergence.
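To make the distinction between average-iterate and last-iterate behavior concrete, here is a minimal sketch of simultaneous regret matching on a 2x2 zero-sum game. The game (Matching Pennies, with a small asymmetric initialization to break the trivial fixed point) and all parameter values are illustrative assumptions, not taken from the paper; the sketch only demonstrates the generic phenomenon the abstract describes: the time-averaged strategies approach the equilibrium while the last iterates keep cycling.

```python
import numpy as np

# Row player's payoff matrix for Matching Pennies (illustrative choice,
# not a game from the paper); the column player receives the negative.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def rm_strategy(regrets):
    """Regret matching: play each action with probability proportional
    to its positive cumulative regret; uniform if none is positive."""
    pos = np.maximum(regrets, 0.0)
    total = pos.sum()
    if total <= 0.0:
        return np.full_like(regrets, 1.0 / len(regrets))
    return pos / total

# Cumulative regrets. The small asymmetric initialization for the row
# player is an assumption made to avoid the degenerate uniform fixed point.
Rx = np.array([1.0, 0.0])
Ry = np.zeros(2)
avg_x = np.zeros(2)
avg_y = np.zeros(2)

T = 10000
for t in range(T):
    # Simultaneous updates: both players act from current regrets.
    x = rm_strategy(Rx)
    y = rm_strategy(Ry)
    avg_x += x
    avg_y += y
    # Instantaneous regret of each pure action vs. the realized payoff.
    ux = A @ y           # row player's per-action payoffs
    uy = -(A.T @ x)      # column player's per-action payoffs (zero-sum)
    Rx += ux - x @ ux
    Ry += uy - y @ uy

avg_x /= T
avg_y /= T
# The averages approach the (0.5, 0.5) equilibrium; the last iterates
# (x, y) continue to cycle and need not converge.
print("average strategies:", avg_x, avg_y)
print("last iterates:     ", rm_strategy(Rx), rm_strategy(Ry))
```

Running this shows the averaged strategies near the mixed equilibrium, whereas printing the last iterates across runs of different lengths reveals the cycling behavior that the paper's smoothed and restarted variants are designed to eliminate.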