
Multi-agent Markov Entanglement

Main: 26 pages; Bibliography: 3 pages; Appendix: 19 pages
Abstract

Value decomposition has long been a fundamental technique in multi-agent dynamic programming and reinforcement learning (RL). Specifically, the value function of a global state $(s_1,s_2,\ldots,s_N)$ is often approximated as the sum of local functions: $V(s_1,s_2,\ldots,s_N)\approx\sum_{i=1}^N V_i(s_i)$. This approach traces back to the index policy in restless multi-armed bandit problems and has found various applications in modern RL systems. However, the theoretical justification for why this decomposition works so effectively remains underexplored.
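As a minimal illustration of the decomposition described above, the sketch below fits local value tables $V_1, V_2$ to a toy two-agent global value function by least squares. The global value function, state space, and fitting procedure are all hypothetical choices for this example, not constructions from the paper; the toy value function is chosen to be exactly additive so the decomposition is lossless.

```python
import itertools
import numpy as np

# Hypothetical toy setting: two agents, each with a binary local state.
# This global value function is exactly additive, so the approximation
# V(s1, s2) ~= V1(s1) + V2(s2) incurs zero error here.
def global_value(s1, s2):
    return 1.0 * s1 + 2.0 * s2

states = list(itertools.product([0, 1], repeat=2))
y = np.array([global_value(s1, s2) for s1, s2 in states])

# Design matrix: one indicator column per (agent, local state) pair.
X = np.zeros((len(states), 4))
for row, (s1, s2) in enumerate(states):
    X[row, s1] = 1.0        # columns 0-1 encode V1(0), V1(1)
    X[row, 2 + s2] = 1.0    # columns 2-3 encode V2(0), V2(1)

# Least-squares fit of the local value tables.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
V1, V2 = theta[:2], theta[2:]

# Reconstruction error of the additive approximation.
approx = np.array([V1[s1] + V2[s2] for s1, s2 in states])
max_err = float(np.abs(approx - y).max())
print(max_err)  # near zero, since the toy value function is additive
```

For a genuinely coupled value function (one with cross terms between agents), the same fit would leave a nonzero residual, which is exactly the approximation gap the paper's analysis concerns.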

@article{chen2025_2506.02385,
  title={Multi-agent Markov Entanglement},
  author={Shuze Chen and Tianyi Peng},
  journal={arXiv preprint arXiv:2506.02385},
  year={2025}
}