
Multi-agent Markov Entanglement

Main: 26 pages; Bibliography: 3 pages; Appendix: 19 pages
Abstract

Value decomposition has long been a fundamental technique in multi-agent dynamic programming and reinforcement learning (RL). Specifically, the value function of a global state $(s_1,s_2,\ldots,s_N)$ is often approximated as the sum of local functions: $V(s_1,s_2,\ldots,s_N)\approx\sum_{i=1}^N V_i(s_i)$. This approach traces back to the index policy in restless multi-armed bandit problems and has found various applications in modern RL systems. However, the theoretical justification for why this decomposition works so effectively remains underexplored.
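As a minimal illustration of the decomposition described above, the sketch below fits local value tables $V_1, V_2$ to a toy two-agent global value function by least squares. The global value function, state space, and fitting procedure are all hypothetical choices for this example, not constructions from the paper; the toy value function is chosen to be exactly additive so the decomposition is lossless.

```python
import itertools
import numpy as np

# Hypothetical toy setting: two agents, each with a binary local state.
# This global value function is exactly additive, so the approximation
# V(s1, s2) ~= V1(s1) + V2(s2) incurs zero error here.
def global_value(s1, s2):
    return 1.0 * s1 + 2.0 * s2

states = list(itertools.product([0, 1], repeat=2))
y = np.array([global_value(s1, s2) for s1, s2 in states])

# Design matrix: one indicator column per (agent, local state) pair.
X = np.zeros((len(states), 4))
for row, (s1, s2) in enumerate(states):
    X[row, s1] = 1.0        # columns 0-1 encode V1(0), V1(1)
    X[row, 2 + s2] = 1.0    # columns 2-3 encode V2(0), V2(1)

# Least-squares fit of the local value tables.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
V1, V2 = theta[:2], theta[2:]

# Reconstruction error of the additive approximation.
approx = np.array([V1[s1] + V2[s2] for s1, s2 in states])
max_err = float(np.abs(approx - y).max())
print(max_err)  # near zero, since the toy value function is additive
```

For a genuinely coupled value function (one with cross terms between agents), the same fit would leave a nonzero residual, which is exactly the approximation gap the paper's analysis concerns.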

@article{chen2025_2506.02385,
  title={Multi-agent Markov Entanglement},
  author={Shuze Chen and Tianyi Peng},
  journal={arXiv preprint arXiv:2506.02385},
  year={2025}
}