On Separation Between Best-Iterate, Random-Iterate, and Last-Iterate Convergence of Learning in Games

Abstract

Non-ergodic convergence of learning dynamics in games has been widely studied in recent years because of its importance in both theory and practice. Recent work (Cai et al., 2024) showed that a broad class of learning dynamics, including Optimistic Multiplicative Weights Update (OMWU), can exhibit arbitrarily slow last-iterate convergence even in simple 2×2 matrix games, despite many of these dynamics being known to converge asymptotically in the last iterate. It remains unclear, however, whether these algorithms achieve fast non-ergodic convergence under weaker criteria, such as best-iterate convergence. We show that for 2×2 matrix games, OMWU achieves an O(T^{-1/6}) best-iterate convergence rate, in stark contrast to its slow last-iterate convergence in the same class of games. Furthermore, we establish a lower bound showing that OMWU does not achieve any polynomial random-iterate convergence rate, measured by the expected duality gap across all iterates. This result challenges the conventional wisdom that random-iterate convergence is essentially equivalent to best-iterate convergence, with the former often used as a proxy for establishing the latter. Our analysis uncovers a new connection to dynamic regret and presents a novel two-phase approach to best-iterate convergence, which may be of independent interest.
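To make the convergence criteria concrete, the following is a minimal sketch of OMWU on a 2×2 zero-sum matrix game (matching pennies), tracking the duality gap of every iterate. The best-iterate gap is the minimum over iterates, while the last-iterate gap is that of the final iterate. The payoff matrix, step size `eta`, horizon `T`, and initial strategies are illustrative choices, not taken from the paper.

```python
import numpy as np

# Matching pennies: the row player minimizes x^T A y, the column player maximizes.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def duality_gap(x, y):
    # gap(x, y) = max_{y'} x^T A y' - min_{x'} x'^T A y
    return (x @ A).max() - (A @ y).min()

eta, T = 0.1, 2000              # illustrative step size and horizon
x = np.array([0.9, 0.1])        # start away from the uniform equilibrium
y = np.array([0.2, 0.8])
gx_prev = A @ y                 # previous loss vector for x
gy_prev = -A.T @ x              # previous loss vector for y

gaps = []
for t in range(T):
    gx, gy = A @ y, -A.T @ x
    # Optimistic MWU: multiplicative update with the prediction 2*g_t - g_{t-1}.
    x = x * np.exp(-eta * (2 * gx - gx_prev))
    x /= x.sum()
    y = y * np.exp(-eta * (2 * gy - gy_prev))
    y /= y.sum()
    gx_prev, gy_prev = gx, gy
    gaps.append(duality_gap(x, y))

best_iterate_gap = min(gaps)    # criterion studied in the O(T^{-1/6}) upper bound
last_iterate_gap = gaps[-1]     # criterion with arbitrarily slow rates (Cai et al., 2024)
print(best_iterate_gap, last_iterate_gap)
```

On this particular game the unique equilibrium is interior and OMWU converges, so both gaps shrink; the separations in the paper arise on carefully constructed instances where the last-iterate (and random-iterate) gaps decay far more slowly than the best-iterate gap.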

@article{cai2025_2503.02825,
  title={On Separation Between Best-Iterate, Random-Iterate, and Last-Iterate Convergence of Learning in Games},
  author={Yang Cai and Gabriele Farina and Julien Grand-Clément and Christian Kroer and Chung-Wei Lee and Haipeng Luo and Weiqiang Zheng},
  journal={arXiv preprint arXiv:2503.02825},
  year={2025}
}