Synthesis of Model Predictive Control and Reinforcement Learning: Survey and Classification

Abstract

Model predictive control (MPC) and reinforcement learning (RL) are two successful control techniques for Markov decision processes. Both approaches are derived from similar fundamental principles, and both are widely used in practical applications, including robotics, process control, energy systems, and autonomous driving. Despite these similarities, MPC and RL follow distinct paradigms that emerged from different communities with different requirements. Several technical differences, particularly the role of an environment model within the algorithm, lead to methodologies with nearly complementary advantages. Because of these orthogonal benefits, research interest in combining the two has increased significantly in recent years, producing a large and growing body of work that leverages both MPC and RL. This survey illuminates the differences, similarities, and fundamentals that enable such combinations and categorizes the existing literature accordingly. In particular, we build our categorization on the versatile actor-critic RL framework and examine how the online optimization approach of MPC can be used to improve the overall closed-loop performance of a policy.
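To make the actor-critic viewpoint concrete, the sketch below (illustrative only, not code from the paper) shows one combination pattern of the kind the survey categorizes: an MPC controller acting as the policy (actor), with a quadratic stand-in for a learned critic serving as the terminal cost of the finite-horizon problem. The dynamics, horizon, and critic weights are hypothetical assumptions chosen for a self-contained example.

# Minimal sketch: MPC as the actor, a (stand-in) critic as terminal cost.
# All names and numbers below are assumptions, not taken from the paper.
import numpy as np
from scipy.optimize import minimize

# Assumed double-integrator dynamics x_{k+1} = A x_k + B u_k.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)          # stage state cost
R = np.array([[0.1]])  # stage input cost
P = 10.0 * np.eye(2)   # stand-in for a learned critic: V(x) ~ x^T P x
H = 10                 # prediction horizon

def rollout_cost(u_flat, x0):
    """Sum of stage costs over the horizon plus the critic as terminal cost."""
    u = u_flat.reshape(H, 1)
    x, cost = x0, 0.0
    for k in range(H):
        cost += x @ Q @ x + u[k] @ R @ u[k]
        x = A @ x + B @ u[k]
    return cost + x @ P @ x  # critic value approximates the cost-to-go

def mpc_policy(x0):
    """The 'actor': solve the finite-horizon problem, apply the first input."""
    res = minimize(rollout_cost, np.zeros(H), args=(x0,), method="L-BFGS-B")
    return res.x[0]

# Closed-loop simulation: the MPC policy steers the state toward the origin.
x = np.array([1.0, 0.0])
for t in range(50):
    u = mpc_policy(x)
    x = A @ x + B @ np.array([u])
print("final state:", x)

Replacing the fixed matrix P with a trained value-function approximator, and adapting it from closed-loop data, yields the MPC-as-actor schemes discussed in the survey.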

@article{reiter2025_2502.02133,
  title={Synthesis of Model Predictive Control and Reinforcement Learning: Survey and Classification},
  author={Rudolf Reiter and Jasper Hoffmann and Dirk Reinhardt and Florian Messerer and Katrin Baumgärtner and Shamburaj Sawant and Joschka Boedecker and Moritz Diehl and Sebastien Gros},
  journal={arXiv preprint arXiv:2502.02133},
  year={2025}
}