Value Functions for Depth-Limited Solving in Imperfect-Information Games

Artificial Intelligence (AI), 2019

31 May 2019

Abstract

We provide a formal definition of depth-limited games together with an accessible and rigorous explanation of the underlying concepts, both of which were previously missing in imperfect-information games. The definition works for an arbitrary extensive-form game and is not tied to any specific game-solving algorithm. Moreover, this framework unifies and significantly extends three approaches to depth-limited solving that previously existed in extensive-form games and multiagent reinforcement learning but were not known to be compatible. Value functions are a key ingredient of these depth-limited games. Focusing on two-player zero-sum imperfect-information games, we show how to obtain optimal value functions and prove that public information provides both necessary and sufficient context for computing them. We provide a domain-independent encoding of the domain that allows value functions to be approximated even by simple feed-forward neural networks. We use the resulting value network to implement a depth-limited version of counterfactual regret minimization. In three distinct domains, we show that the algorithm produces a low-exploitability strategy if and only if it is paired with a near-optimal value network. We show that the value network is capable of generalizing to unseen game situations and that the resulting algorithm performs on par with CFR-D despite being trained on randomly generated game situations.
