Faster Game Solving via Asymmetry of Step Sizes
Counterfactual Regret Minimization (CFR) algorithms are widely used to compute a Nash equilibrium (NE) in two-player zero-sum imperfect-information extensive-form games (IIGs). Among them, Predictive CFR (PCFR) is particularly powerful, achieving an exceptionally fast empirical convergence rate via the prediction in manythis http URL, the empirical convergence rate of PCFR would significantly degrade if the prediction is inaccurate, leading to unstable performance on certain IIGs. To enhance the robustness of PCFR, we propose Asymmetric PCFR (APCFR), which employs an adaptive asymmetry of step sizes between the updates of implicit and explicit accumulated counterfactual regrets to mitigate the impact of the prediction inaccuracy on convergence. We present a theoretical analysis demonstrating why APCFR can enhance the robustness. To the best of our knowledge, we are the first to propose the asymmetry of step sizes, a simple yet novel technique that effectively improves the robustness of PCFR. Then, to reduce the difficulty of implementing APCFR caused by the adaptive asymmetry, we propose a simplified version of APCFR called Simple APCFR (SAPCFR), which uses a fixed asymmetry of step sizes to enable only a single-line modification compared to original PCFR.Experimental results on five standard IIG benchmarks and two heads-up no-limit Texas Hold' em (HUNL) Subagems show that (i) both APCFR and SAPCFR outperform PCFR in most of the tested games, (ii) SAPCFR achieves a comparable empirical convergence rate with APCFR,and (iii) our approach can be generalized to improve other CFR algorithms, e.g., Discount CFR (DCFR).
View on arXiv