Constant-Memory Strategies in Stochastic Games: Best Responses and Equilibria

11 May 2025

Fengming Zhu

Fangzhen Lin

Main: 8 pages, Bibliography: 2 pages, Appendix: 11 pages, 9 figures, 2 tables

Abstract

Stochastic games have become a prevalent framework for studying long-term multi-agent interactions, especially in the context of multi-agent reinforcement learning. In this work, we comprehensively investigate the concept of constant-memory strategies in stochastic games. We first establish results on best responses and Nash equilibria for behavioral constant-memory strategies, and then discuss the computational hardness of best responding to mixed constant-memory strategies. These theoretical insights are then verified on several sequential decision-making testbeds, including the $\textit{Iterated Prisoner's Dilemma}$, the $\textit{Iterated Traveler's Dilemma}$, and the $\textit{Pursuit}$ domain. This work aims to enhance the understanding of theoretical issues in single-agent planning within multi-agent systems, and to uncover the connection between decision models in single-agent and multi-agent contexts. The code is available at $\texttt{this https URL}$.
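To make the idea of best responding to a behavioral constant-memory strategy concrete, here is a minimal sketch for the Iterated Prisoner's Dilemma. It is an illustration only and not the authors' code: the payoff values (T=5, R=3, P=1, S=0), the discount factor, and the names `tit_for_tat` and `best_response` are all assumptions made for this example. Against a fixed memory-1 opponent, the last joint action can serve as the state of a single-agent decision process, so plain value iteration is enough.

```python
import itertools

C, D = 0, 1  # cooperate, defect

# Assumed payoffs for the planning agent: PAYOFF[my_action][opp_action],
# using the common values R=3, S=0, T=5, P=1 (not taken from the paper).
PAYOFF = [[3.0, 0.0], [5.0, 1.0]]


def tit_for_tat(state):
    """Hypothetical memory-1 opponent: probability that it cooperates,
    given the last joint action (my_prev, opp_prev)."""
    my_prev, _opp_prev = state
    return 1.0 if my_prev == C else 0.0


def best_response(opp_coop_prob, gamma=0.9, tol=1e-10):
    """Value iteration over the four last-joint-action states."""
    states = list(itertools.product((C, D), repeat=2))
    V = {s: 0.0 for s in states}

    def q(s, a):
        # Expected discounted return of playing a in state s, then following V.
        p = opp_coop_prob(s)
        return sum(
            prob * (PAYOFF[a][b] + gamma * V[(a, b)])
            for b, prob in ((C, p), (D, 1.0 - p))
        )

    while True:
        new_V = {s: max(q(s, a) for a in (C, D)) for s in states}
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            break

    policy = {s: max((C, D), key=lambda a: q(s, a)) for s in states}
    return V, policy


V, policy = best_response(tit_for_tat)
```

Under these assumed payoffs and a discount factor of 0.9, the computed best response to tit-for-tat cooperates in every state, since sustained mutual cooperation is worth 3 / (1 - 0.9) = 30 from a cooperative state. Any other memory-1 behavioral opponent can be plugged in by passing a different function that maps the last joint action to a cooperation probability.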
