Non-monotonic Value Function Factorization for Deep Multi-Agent
Reinforcement Learning
Abstract
In this paper, we propose actor-critic approaches that introduce an actor policy on top of QMIX [9], which removes the monotonicity constraint of QMIX and enables a non-monotonic value function factorization of the joint action-value function. We evaluate our actor-critic methods on StarCraft II micromanagement tasks and show that they achieve stronger performance on maps with heterogeneous agent types.
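To illustrate the constraint being removed, here is a minimal NumPy sketch of a single-layer mixing step. QMIX enforces monotonicity (each ∂Q_tot/∂Q_i ≥ 0) by taking the absolute value of the hypernetwork-generated mixing weights; dropping that absolute value allows sign-unrestricted, non-monotonic mixing. The weight values and function names below are hypothetical, not from the paper:

```python
import numpy as np

def mix(agent_qs, w, b):
    """Single-layer mixing: Q_tot = w . Q_agents + b."""
    return float(w @ agent_qs + b)

agent_qs = np.array([1.0, 2.0])        # per-agent Q-values
raw_w = np.array([-0.5, 1.0])          # hypothetical hypernetwork output
b = 0.1                                # state-dependent bias (fixed here)

# QMIX: monotonicity enforced via |w|, so dQ_tot/dQ_i >= 0 for all agents
q_tot_monotone = mix(agent_qs, np.abs(raw_w), b)      # 2.6

# Non-monotonic factorization: raw (sign-unrestricted) weights,
# so Q_tot may decrease as an individual agent's Q-value increases
q_tot_free = mix(agent_qs, raw_w, b)                  # 1.6
```

With the negative weight kept, increasing the first agent's Q-value lowers Q_tot, a representational case the monotone mixer cannot express.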
