Reinforcement Learning for Game-Theoretic Resource Allocation on Graphs

Abstract

Game-theoretic resource allocation on graphs (GRAG) involves two players competing over multiple steps to control nodes of interest on a graph, a problem modeled as a multi-step Colonel Blotto Game (MCBG). Finding optimal strategies is challenging due to the dynamic action space and structural constraints imposed by the graph. To address this, we formulate the MCBG as a Markov Decision Process (MDP) and apply Reinforcement Learning (RL) methods, specifically Deep Q-Network (DQN) and Proximal Policy Optimization (PPO). To enforce graph constraints, we introduce an action-displacement adjacency matrix that dynamically generates valid action sets at each step. We evaluate RL performance across a variety of graph structures and initial resource distributions, comparing against random, greedy, and learned RL policies. Experimental results show that both DQN and PPO consistently outperform the baseline strategies and converge to a balanced 50% win rate when competing against the learned RL policy. In particular, on asymmetric graphs, RL agents successfully exploit structural advantages and adapt their allocation strategies, even under disadvantageous initial resource distributions.
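The action-masking idea behind the action-displacement adjacency matrix can be illustrated with a minimal sketch (not the authors' code). Assuming resources may either stay at their node or move along an edge each step, the set of valid moves follows from the graph's adjacency matrix plus self-loops, restricted to nodes where the player actually holds resources; the function name and interface below are illustrative.

```python
import numpy as np

def valid_moves(adj: np.ndarray, resources: np.ndarray) -> np.ndarray:
    """Boolean mask of valid (source, destination) moves on the graph.

    A move from node i to node j is valid if the player holds resources
    at i and either j == i (stay) or (i, j) is an edge. This mirrors the
    idea of deriving the per-step action set from an adjacency-based
    displacement matrix; the exact construction in the paper may differ.
    """
    n = adj.shape[0]
    displacement = adj.astype(bool) | np.eye(n, dtype=bool)  # edges plus self-loops
    has_resources = resources > 0                            # nodes we can move from
    return displacement & has_resources[:, None]

# Example: 3-node path graph with resources at nodes 0 and 2
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
resources = np.array([5, 0, 3])
print(valid_moves(adj, resources))
```

In an RL loop, a mask like this can be applied to the agent's output (e.g., by setting invalid action logits to negative infinity) so that only graph-feasible allocations are ever sampled.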

@article{an2025_2505.06319,
  title={Reinforcement Learning for Game-Theoretic Resource Allocation on Graphs},
  author={Zijian An and Lifeng Zhou},
  journal={arXiv preprint arXiv:2505.06319},
  year={2025}
}