Heterogeneous Multi-Agent Proximal Policy Optimization for Power Distribution System Restoration
Restoring power distribution systems (PDSs) after large-scale outages requires sequential switching actions that reconfigure feeder topology and coordinate distributed energy resources (DERs) under nonlinear constraints, including power balance, voltage limits, and thermal ratings. These challenges limit the scalability of conventional optimization and value-based reinforcement learning (RL) approaches. This paper applies the Heterogeneous-Agent Reinforcement Learning (HARL) framework, instantiated as Heterogeneous-Agent Proximal Policy Optimization (HAPPO), to enable coordinated restoration across interconnected microgrids. Each agent controls a distinct microgrid with its own loads, DER capacities, and switch counts. Decentralized actors are trained with a centralized critic for stable on-policy learning, while a physics-informed OpenDSS environment enforces electrical feasibility. Experiments on the IEEE 123-bus and 8500-node feeders show that HAPPO outperforms PPO, QMIX, Mean-Field RL, and other baselines in restored power, convergence stability, and multi-seed reproducibility. Under a 2400 kW generation cap, the framework restores over 95% of available load on both systems with low-latency execution, supporting practical real-time PDS restoration.
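To make the HAPPO setup concrete, the sketch below illustrates the algorithm's core idea: agents are updated sequentially, and each agent's clipped PPO surrogate uses an advantage scaled by the product of the policy ratios of agents already updated in the current round. This is a minimal NumPy illustration of the general HAPPO objective, not the paper's implementation; the function names and the use of pre-computed ratio arrays are assumptions for exposition.

```python
import numpy as np

def happo_agent_loss(ratio, m_advantage, eps=0.2):
    """PPO-style clipped surrogate loss for one agent.

    ratio:        new_policy_prob / old_policy_prob per sample (hypothetical
                  pre-computed array; in practice this comes from the actor).
    m_advantage:  advantage already multiplied by the ratios of agents
                  updated earlier in this round (the HAPPO "M" term).
    """
    unclipped = ratio * m_advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * m_advantage
    # Negative because optimizers minimize; the surrogate is maximized.
    return -np.minimum(unclipped, clipped).mean()

def happo_sequential_update(ratios_per_agent, advantages, eps=0.2, rng=None):
    """Process agents in a random order, compounding each updated agent's
    ratio into the advantage seen by the agents that follow."""
    rng = rng or np.random.default_rng(0)
    m = advantages.copy()
    losses = {}
    for i in rng.permutation(len(ratios_per_agent)):
        r = ratios_per_agent[i]
        losses[int(i)] = happo_agent_loss(r, m, eps)
        m = m * r  # later agents see this agent's updated ratio in M
    return losses
```

With all ratios equal to 1 (no policy change yet), every agent's loss reduces to the negated mean advantage, which is a quick sanity check on the compounding logic.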