We establish the first uncoupled learning algorithm that attains $O(n \log^2 d \log T)$ per-player regret in multi-player general-sum games, where $n$ is the number of players, $d$ is the number of actions available to each player, and $T$ is the number of repetitions of the game. Our results exponentially improve the dependence on $d$ compared to the $O(n\, d \log T)$ regret attainable by Log-Regularized Lifted Optimistic FTRL [Far+22c], and also reduce the dependence on the number of iterations $T$ from $\log^4 T$ to $\log T$ compared to Optimistic Hedge, the previously well-studied algorithm with $O(n \log d \log^4 T)$ regret [DFG21]. Our algorithm is obtained by combining the classic Optimistic Multiplicative Weights Update (OMWU) with an adaptive, non-monotonic learning rate that paces the learning process of the players, making them more cautious when their regret becomes too negative.
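To make the last sentence concrete, the following is a minimal Python sketch of a single player running OMWU with a learning rate that depends on the player's current regret. The OMWU step is the standard optimistic-FTRL form with the previous utility vector as the prediction. The pacing rule `cautious_eta`, its `threshold` parameter, and the helper names are hypothetical placeholders chosen for illustration; the abstract does not state the paper's actual learning-rate rule, and this sketch should not be read as that rule.

```python
import math


def omwu_step(cum_utils, last_util, eta):
    """Standard OMWU step: softmax of eta * (cumulative utilities + prediction).

    The prediction for the next round is the most recent utility vector.
    """
    scores = [eta * (c + p) for c, p in zip(cum_utils, last_util)]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def cautious_eta(base_eta, max_regret, threshold):
    """HYPOTHETICAL pacing rule, not the paper's.

    Shrinks the learning rate once regret falls below -threshold and
    restores it when regret rises again, so the rate is non-monotonic.
    """
    if max_regret < -threshold:
        return base_eta * threshold / (-max_regret)
    return base_eta


def run_player(utility_stream, d, base_eta=0.1, threshold=1.0):
    """Play against a stream of utility vectors of length d.

    Returns the list of mixed strategies and the final regret against
    the best fixed action in hindsight.
    """
    cum = [0.0] * d    # cumulative utility of each action
    last = [0.0] * d   # most recent utility vector (the optimistic prediction)
    realized = 0.0     # cumulative expected utility actually obtained
    strategies = []
    for u in utility_stream:
        max_regret = max(cum) - realized
        eta = cautious_eta(base_eta, max_regret, threshold)
        x = omwu_step(cum, last, eta)
        strategies.append(x)
        realized += sum(xi * ui for xi, ui in zip(x, u))
        cum = [c + ui for c, ui in zip(cum, u)]
        last = list(u)
    return strategies, max(cum) - realized


if __name__ == "__main__":
    # Toy stream: action 0 always pays 1, action 1 always pays 0.
    stream = [[1.0, 0.0] for _ in range(50)]
    strategies, regret = run_player(stream, d=2)
    print(strategies[-1], regret)
```

In a multi-player game each player would run such a loop independently, with `utility_stream` determined by the other players' strategies; that independence is what "uncoupled" refers to.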
@article{soleymani2025_2503.24340,
  title={Faster Rates for No-Regret Learning in General Games via Cautious Optimism},
  author={Ashkan Soleymani and Georgios Piliouras and Gabriele Farina},
  journal={arXiv preprint arXiv:2503.24340},
  year={2025}
}