Stability Enhancement in Reinforcement Learning via Adaptive Control Lyapunov Function

Reinforcement Learning (RL) has shown promise in control tasks but faces significant challenges in real-world applications, primarily due to the absence of safety guarantees during the learning process. Existing methods often struggle to ensure safe exploration, leading to potential system failures and largely restricting applications to simulated environments. Traditional approaches such as reward shaping and constrained policy optimization can fail to guarantee safety during the initial learning stages, while model-based methods using Control Lyapunov Functions (CLFs) or Control Barrier Functions (CBFs) may hinder efficient exploration and performance. To address these limitations, this paper introduces Soft Actor-Critic with Control Lyapunov Function (SAC-CLF), a framework that enhances stability and safety through three key innovations: (1) a task-specific CLF design method for safe and optimal performance; (2) dynamic adjustment of constraints to maintain robustness under unmodeled dynamics; and (3) improved control-input smoothness while ensuring safety. Experimental results on a classical nonlinear system and on satellite attitude control demonstrate the effectiveness of SAC-CLF in overcoming the shortcomings of existing methods.
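The abstract does not spell out how the CLF constraint is imposed on the policy's actions. As an illustration only, the sketch below shows one common way such a constraint can be enforced: a minimal-norm projection of the RL action onto the CLF decrease condition for a control-affine system. All function names, the closed-form projection, and the example dynamics are assumptions for exposition, not the authors' SAC-CLF implementation (which additionally adapts the constraint and smooths the control input).

```python
import numpy as np

def clf_filter(u_rl, x, f, g, V, gradV, alpha=1.0, slack=0.0):
    """
    Hypothetical CLF safety filter (not the paper's exact formulation).

    For a control-affine system  x_dot = f(x) + g(x) u, enforce the decrease
    condition  LfV(x) + LgV(x) u <= -alpha * V(x) + slack  by solving
        min_u ||u - u_rl||^2  s.t.  LfV + LgV u <= -alpha V + slack,
    which has a closed-form solution when the single affine constraint is active.
    """
    grad = gradV(x)                      # dV/dx, shape (n,)
    LfV = grad @ f(x)                    # Lie derivative of V along the drift f
    LgV = grad @ g(x)                    # Lie derivative of V along the input channels, shape (m,)
    violation = LfV + LgV @ u_rl + alpha * V(x) - slack
    if violation <= 0.0 or np.allclose(LgV, 0.0):
        return u_rl                      # RL action already satisfies the CLF condition
    # Project u_rl onto the half-space {u : LgV u <= -LfV - alpha V + slack}
    return u_rl - (violation / (LgV @ LgV)) * LgV

# Example: double integrator with an illustrative quadratic V(x) = 0.5 ||x||^2
f = lambda x: np.array([x[1], 0.0])
g = lambda x: np.array([[0.0], [1.0]])
V = lambda x: 0.5 * x @ x
gradV = lambda x: x
u_safe = clf_filter(np.array([0.5]), np.array([1.0, 0.2]), f, g, V, gradV)
```

In such a scheme the filtered action u_safe, rather than the raw policy output, is applied to the system and stored in the replay buffer; making the slack or decay rate alpha state-dependent is one way a constraint could be adjusted dynamically to retain robustness under unmodeled dynamics, in the spirit of the adaptation described above.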
@article{chen2025_2504.19473,
  title   = {Stability Enhancement in Reinforcement Learning via Adaptive Control Lyapunov Function},
  author  = {Donghe Chen and Han Wang and Lin Cheng and Shengping Gong},
  journal = {arXiv preprint arXiv:2504.19473},
  year    = {2025}
}