Offline reinforcement learning (RL) enables policy optimization from static datasets, avoiding the risks and costs of real-world exploration. However, without environmental interaction, it is prone to learning suboptimal behaviors and estimating values inaccurately. In this paper, we present Video-Enhanced Offline RL (VeoRL), a model-based approach that constructs an interactive world model from diverse, unlabeled video data readily available online. Leveraging model-based behavior guidance, VeoRL transfers commonsense knowledge of control policies and physical dynamics from natural videos to the RL agent within the target domain. Our method achieves substantial performance gains (exceeding 100% in some cases) across visuomotor control tasks in robotic manipulation, autonomous driving, and open-world video games.
@article{pan2025_2505.06482,
  title={Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach},
  author={Minting Pan and Yitao Zheng and Jiajian Li and Yunbo Wang and Xiaokang Yang},
  journal={arXiv preprint arXiv:2505.06482},
  year={2025}
}