Policy Optimization for Continuous-time Linear-Quadratic Graphon Mean Field Games

6 June 2025
Philipp Plank
Yufei Zhang
arXiv (abs) · PDF · HTML
Main: 40 pages
Figures: 5
Bibliography: 5 pages
Appendix: 1 page
Abstract

Multi-agent reinforcement learning, despite its popularity and empirical success, faces significant scalability challenges in large-population dynamic games. Graphon mean field games (GMFGs) offer a principled framework for approximating such games while capturing heterogeneity among players. In this paper, we propose and analyze a policy optimization framework for continuous-time, finite-horizon linear-quadratic GMFGs. Exploiting the structural properties of GMFGs, we design an efficient policy parameterization in which each player's policy is represented as an affine function of their private state, with a shared slope function and player-specific intercepts. We develop a bilevel optimization algorithm that alternates between policy gradient updates for best-response computation under a fixed population distribution, and distribution updates using the resulting policies. We prove linear convergence of the policy gradient steps to best-response policies and establish global convergence of the overall algorithm to the Nash equilibrium. The analysis relies on novel landscape characterizations over infinite-dimensional policy spaces. Numerical experiments demonstrate the convergence and robustness of the proposed algorithm under varying graphon structures, noise levels, and action frequencies.
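To make the bilevel structure concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of the alternating scheme for a time-discretized scalar LQ graphon game. Each player's policy is affine in its private state with a shared slope and a player-specific intercept; the inner loop runs a simple zeroth-order policy-gradient best-response update under a frozen mean-field flow, and the outer loop refreshes the graphon-weighted population aggregate from the updated policies. All constants, the min graphon, and the finite-difference gradient estimator are illustrative assumptions, and the continuous-time setting of the paper is replaced by a crude Euler-style discretization.

```python
# Hypothetical sketch of the bilevel policy optimization loop described in the abstract,
# for a time-discretized scalar LQ graphon mean field game. Constants and the graphon
# choice are illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

# --- assumed problem data: K representative players, T time steps, scalar states/actions
K, T = 20, 10
A, B, Q, R = 1.0, 1.0, 1.0, 0.1
u = (np.arange(K) + 0.5) / K          # player labels on [0, 1]
W = np.minimum.outer(u, u)            # min graphon W(u, v) = min(u, v)

def simulate(theta, c, z):
    """Average cost of the affine policy a_t = -theta[t] * x_t + c[t, :]
    under a *fixed* population aggregate z[t, :]."""
    x = rng.normal(size=K)
    cost = np.zeros(K)
    for t in range(T):
        a = -theta[t] * x + c[t]
        cost += Q * (x - z[t]) ** 2 + R * a ** 2
        x = A * x + B * a + 0.1 * rng.normal(size=K)
    return cost.mean()

def population_aggregate(theta, c, n_mc=200):
    """Distribution update: graphon-weighted mean of player states under the current policy."""
    z = np.zeros((T, K))
    for _ in range(n_mc):
        x = rng.normal(size=K)
        for t in range(T):
            z[t] += (W @ x) / K
            x = A * x + B * (-theta[t] * x + c[t]) + 0.1 * rng.normal(size=K)
    return z / n_mc

# --- bilevel loop: inner policy-gradient best response, outer mean-field update
theta = np.zeros(T)        # shared slope, one value per time step
c = np.zeros((T, K))       # player-specific intercepts
z = np.zeros((T, K))       # current guess of the mean-field flow

for outer in range(5):
    for inner in range(50):
        # Two-point zeroth-order estimate of the policy gradient, standing in for the
        # exact policy gradient analyzed in the paper.
        eps_t, eps_c = rng.normal(size=T), rng.normal(size=(T, K))
        sigma, lr = 0.05, 0.02
        g = (simulate(theta + sigma * eps_t, c + sigma * eps_c, z)
             - simulate(theta - sigma * eps_t, c - sigma * eps_c, z)) / (2 * sigma)
        theta -= lr * g * eps_t
        c -= lr * g * eps_c
    z = population_aggregate(theta, c)   # distribution update using the new policies
    print(f"outer {outer}: average cost {simulate(theta, c, z):.3f}")
```

The outer/inner split mirrors the abstract's description: best-response computation against a frozen population distribution, followed by a distribution update from the resulting policies.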

View on arXiv
@article{plank2025_2506.05894,
  title={Policy Optimization for Continuous-time Linear-Quadratic Graphon Mean Field Games},
  author={Philipp Plank and Yufei Zhang},
  journal={arXiv preprint arXiv:2506.05894},
  year={2025}
}