Stochastic Games with Limited Public Memory

We study the memory resources required for near-optimal play in two-player zero-sum stochastic games with the long-run average payoff. Although optimal strategies may not exist in such games, near-optimal strategies always do.

Mertens and Neyman (1981) proved that in any stochastic game, for any $\varepsilon > 0$, there exist uniform $\varepsilon$-optimal memory-based strategies -- i.e., strategies that are $\varepsilon$-optimal in all sufficiently long $n$-stage games -- that use at most $O(n)$ memory states within the first $n$ stages. We improve this bound on the number of memory states by proving that in any stochastic game, for any $\varepsilon > 0$, there exist uniform $\varepsilon$-optimal memory-based strategies that use at most $O(\log n)$ memory states in the first $n$ stages. Moreover, we establish the existence of uniform $\varepsilon$-optimal memory-based strategies whose memory updating and action selection are time-independent and such that, with probability close to 1, for all $n$, the number of memory states used up to stage $n$ is at most $O(\log n)$.

This result cannot be extended to strategies with bounded public memory -- even if time-dependent memory updating and action selection are allowed. This impossibility is illustrated in the Big Match -- a well-known stochastic game where the stage payoffs to Player 1 are 0 or 1. Although for any $\varepsilon > 0$ there exist strategies of Player 1 that guarantee a payoff exceeding $1/2 - \varepsilon$ in all sufficiently long $n$-stage games, we show that any strategy of Player 1 that uses a finite public memory fails to guarantee a payoff greater than $\varepsilon$ in any sufficiently long $n$-stage game.
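As a rough illustration of why small memory can be a handicap in the Big Match, the sketch below looks only at the simplest special case: a stationary strategy of Player 1, i.e. a single memory state. It is not the paper's argument. The payoff convention is an assumption, since the abstract only states that stage payoffs are 0 or 1: Player 1 either continues or absorbs; continuing pays 0 against Left and 1 against Right; absorbing freezes the payoff forever at 1 against Left and 0 against Right. The function name `stationary_guarantee` and the parameter `p` (the per-stage absorption probability) are hypothetical.

```python
def stationary_guarantee(p: float, n: int) -> float:
    """Upper bound on the expected n-stage average payoff secured by a
    stationary Player 1 strategy that absorbs with probability p each stage.

    Only two replies of Player 2 are checked (always Left, always Right),
    under the assumed payoff convention described in the lead-in.
    """
    survive = 1.0          # probability the game is still unabsorbed
    total_vs_left = 0.0    # expected total payoff against "always Left"
    total_vs_right = 0.0   # expected total payoff against "always Right"
    for _ in range(n):
        survive *= 1.0 - p  # Player 1 continues at this stage
        # Against Left: the stage payoff is 1 iff absorption has already
        # happened (at this stage or earlier), else 0.
        total_vs_left += 1.0 - survive
        # Against Right: the stage payoff is 1 iff the game is unabsorbed
        # and Player 1 continues at this stage, else 0.
        total_vs_right += survive
    return min(total_vs_left, total_vs_right) / n


if __name__ == "__main__":
    for p in (0.0, 0.01, 0.1, 0.5):
        print(p, [stationary_guarantee(p, n) for n in (10, 1_000, 100_000)])
```

Under these assumptions the two totals always sum to $n$, so the bound never exceeds 1/2, and for every fixed `p` it tends to 0 as $n$ grows: with `p = 0` constant Left already holds Player 1 to 0, and with `p > 0` constant Right leads to absorption at payoff 0 with probability approaching 1.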
@article{hansen2025_2505.02623,
  title={Stochastic Games with Limited Public Memory},
  author={Kristoffer Arnsfelt Hansen and Rasmus Ibsen-Jensen and Abraham Neyman},
  journal={arXiv preprint arXiv:2505.02623},
  year={2025}
}