338
v1v2 (latest)

Faster Game Solving via Asymmetry of Step Sizes

Main:6 Pages
13 Figures
Bibliography:3 Pages
6 Tables
Appendix:13 Pages
Abstract

Counterfactual Regret Minimization (CFR) algorithms are widely used to compute a Nash equilibrium (NE) in two-player zero-sum imperfect-information extensive-form games (IIGs). Among them, Predictive CFR+^+ (PCFR+^+) is particularly powerful, achieving an exceptionally fast empirical convergence rate via the prediction in manythis http URL, the empirical convergence rate of PCFR+^+ would significantly degrade if the prediction is inaccurate, leading to unstable performance on certain IIGs. To enhance the robustness of PCFR+^+, we propose Asymmetric PCFR+^+ (APCFR+^+), which employs an adaptive asymmetry of step sizes between the updates of implicit and explicit accumulated counterfactual regrets to mitigate the impact of the prediction inaccuracy on convergence. We present a theoretical analysis demonstrating why APCFR+^+ can enhance the robustness. To the best of our knowledge, we are the first to propose the asymmetry of step sizes, a simple yet novel technique that effectively improves the robustness of PCFR+^+. Then, to reduce the difficulty of implementing APCFR+^+ caused by the adaptive asymmetry, we propose a simplified version of APCFR+^+ called Simple APCFR+^+ (SAPCFR+^+), which uses a fixed asymmetry of step sizes to enable only a single-line modification compared to original PCFR+^+.Experimental results on five standard IIG benchmarks and two heads-up no-limit Texas Hold' em (HUNL) Subagems show that (i) both APCFR+^+ and SAPCFR+^+ outperform PCFR+^+ in most of the tested games, (ii) SAPCFR+^+ achieves a comparable empirical convergence rate with APCFR+^+,and (iii) our approach can be generalized to improve other CFR algorithms, e.g., Discount CFR (DCFR).

View on arXiv
Comments on this paper