Learning to Drive from a World Model

Most self-driving systems rely on hand-coded perception outputs and engineered driving rules. Learning directly from human driving data with an end-to-end method can yield a training architecture that is simpler and scales well with compute and data. In this work, we propose an end-to-end training architecture that uses real driving data to train a driving policy in an on-policy simulator. We demonstrate two simulation methods: reprojective simulation and a learned world model. We show that both can be used to train a policy that learns driving behavior without any hand-coded driving rules. We evaluate the performance of these policies in closed-loop simulation and when deployed in a real-world advanced driver-assistance system.
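To make the idea of training a policy "on-policy inside a learned simulator" concrete, here is a minimal toy sketch. It is not the paper's method: the `WorldModel`, `Policy`, loss, and finite-difference update below are all illustrative stand-ins, showing only the loop structure of rolling a policy out through a learned dynamics model and optimizing against the simulated trajectory.

```python
import numpy as np

class WorldModel:
    """Toy stand-in for a learned simulator: next state = f(state, action)."""
    def __init__(self, dim=4, seed=0):
        rng = np.random.default_rng(seed)
        self.A = np.eye(dim) + rng.normal(scale=0.05, size=(dim, dim))
        self.B = rng.normal(scale=0.1, size=dim)

    def step(self, state, action):
        return self.A @ state + self.B * action

class Policy:
    """Linear policy; weights are the parameters we train."""
    def __init__(self, dim=4):
        self.w = np.zeros(dim)

    def act(self, state):
        return float(self.w @ state)

def rollout_loss(policy, model, state, horizon=20):
    """Roll the policy out on-policy through the model; penalize deviation."""
    loss = 0.0
    for _ in range(horizon):
        state = model.step(state, policy.act(state))
        loss += float(state[0] ** 2)  # e.g. lane offset in the first state dim
    return loss

def train(policy, model, init_state, iters=50, lr=0.05, eps=1e-3):
    """Finite-difference gradient descent, accepting only improving steps."""
    best = rollout_loss(policy, model, init_state)
    for _ in range(iters):
        base = rollout_loss(policy, model, init_state)
        grad = np.zeros_like(policy.w)
        for i in range(len(policy.w)):
            policy.w[i] += eps
            grad[i] = (rollout_loss(policy, model, init_state) - base) / eps
            policy.w[i] -= eps
        old_w = policy.w.copy()
        policy.w = policy.w - lr * grad
        new_loss = rollout_loss(policy, model, init_state)
        if new_loss < best:
            best = new_loss
        else:
            policy.w = old_w  # reject step, shrink learning rate
            lr *= 0.5
    return best

model = WorldModel()
policy = Policy()
s0 = np.ones(4)
before = rollout_loss(policy, model, s0)
after = train(policy, model, s0)
print(before, after)
```

The key property this sketch shares with the approach described above is that the policy's own actions generate the states it is trained on, so the training distribution matches the deployment distribution, unlike open-loop behavior cloning.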
@article{goff2025_2504.19077,
  title   = {Learning to Drive from a World Model},
  author  = {Mitchell Goff and Greg Hogan and George Hotz and Armand du Parc Locmaria and Kacper Raczy and Harald Schäfer and Adeeb Shihadeh and Weixing Zhang and Yassine Yousfi},
  journal = {arXiv preprint arXiv:2504.19077},
  year    = {2025}
}