Optimal Control of Nonlinear Systems with Unknown Dynamics
This paper presents a data-driven method for finding a closed-loop optimal controller that minimizes a specified infinite-horizon cost function, from any initial state, for systems with unknown dynamics. The optimal controller is assumed to be parameterizable by a given class of functions, hereafter referred to as the policy. The proposed method introduces a novel gradient estimation framework that approximates the gradient of the cost function with respect to the policy parameters by integrating the Koopman operator with the classical actor-critic concept. By exploiting the linearity of the Koopman operator, this framework allows the policy parameters to be tuned iteratively by gradient descent toward an optimal controller. A convergence analysis of the proposed framework is provided. The effectiveness of the method is demonstrated through comparisons with a model-free reinforcement learning approach, and its control performance is further evaluated in simulation against model-based optimal control methods that solve the same optimal control problem using the exact system dynamics.
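To make the role of Koopman linearity concrete, the following is a minimal sketch and not the paper's algorithm. It fits a finite-dimensional Koopman approximation from sampled closed-loop transitions by least squares (in the style of extended dynamic mode decomposition) and then evaluates a cost with a single linear solve. The dictionary `lift`, the toy scalar system, the linear policy u = -theta x, the discount factor `gamma`, and all function names are assumptions introduced for illustration; the abstract does not specify any of them, and the gradient estimation step itself is not shown.

```python
import numpy as np

def lift(x):
    """Hypothetical dictionary of observables: psi(x) = [x, x^2]."""
    x = np.asarray(x, dtype=float)
    return np.stack([x, x**2], axis=-1)

def fit_koopman(x, x_next):
    """Least-squares fit of K such that psi(x_next) is close to K @ psi(x)."""
    # lstsq solves lift(x) @ A = lift(x_next), so A is the transpose of K.
    A, *_ = np.linalg.lstsq(lift(x), lift(x_next), rcond=None)
    return A.T

def koopman_cost(K, w, x0, gamma):
    """Discounted cost sum_t gamma^t * w^T K^t psi(x0), computed by a linear solve."""
    d = K.shape[0]
    return float(w @ np.linalg.solve(np.eye(d) - gamma * K, lift(x0)))

# Toy closed loop (assumed): x_next = a*x + b*u with u = -theta*x.
# The fit only sees sampled transitions, never a, b, or theta directly.
a, b, theta, q, r, gamma = 0.9, 0.5, 0.4, 1.0, 0.1, 0.95
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
x_next = (a - b * theta) * x

K = fit_koopman(x, x_next)

# Stage cost q*x^2 + r*u^2 = (q + r*theta^2) * x^2 lies in the span of the dictionary.
w = np.array([0.0, q + r * theta**2])
J_est = koopman_cost(K, w, x0=1.0, gamma=gamma)

# Closed-form discounted cost for this toy problem, used only as a check.
J_true = (q + r * theta**2) / (1.0 - gamma * (a - b * theta) ** 2)
print(J_est, J_true)
```

Because the lifted dynamics are linear, the infinite sum over time collapses into the matrix inverse of (I - gamma K). In this toy case the dictionary is closed under the dynamics, so the estimate matches the closed form up to numerical precision; for a general nonlinear system a finite dictionary gives only an approximation.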