An Efficient On-Policy Deep Learning Framework for Stochastic Optimal Control

Abstract

We present a novel on-policy algorithm for solving stochastic optimal control (SOC) problems. By leveraging the Girsanov theorem, our method directly computes on-policy gradients of the SOC objective without expensive backpropagation through the stochastic differential equations or the solution of adjoint problems. This approach significantly accelerates the optimization of neural network control policies while scaling efficiently to high-dimensional problems and long time horizons. We evaluate our method on classical SOC benchmarks as well as on applications to sampling from unnormalized distributions via Schrödinger-Föllmer processes and to fine-tuning pre-trained diffusion models. Experimental results demonstrate substantial improvements in both computational speed and memory efficiency compared to existing approaches.
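
The central computational idea described in the abstract (simulate the controlled SDE with the trajectory detached from autograd, then recover the policy gradient from a Girsanov / likelihood-ratio reweighting evaluated on the frozen on-policy path) can be illustrated with a short PyTorch sketch. This is a hypothetical toy instance, not the authors' code or their exact estimator: the drift b, the running and terminal costs f and g, the network architecture, the Euler-Maruyama discretization, and the particular surrogate loss below are illustrative assumptions, chosen only to show how a gradient can be obtained without backpropagating through the SDE solver.

# Hypothetical sketch (not the paper's exact estimator) for the controlled SDE
#   dX_t = (b(X_t) + sigma * u_theta(X_t, t)) dt + sigma dW_t
# with cost  J = E[ \int_0^T (f(X_t) + 0.5*|u_theta|^2) dt + g(X_T) ].
# The trajectory is simulated with gradients detached; a Girsanov / likelihood-ratio
# style reweighting supplies the policy gradient, so no backprop through the solver.
import torch
import torch.nn as nn

d, T, n_steps, sigma = 2, 1.0, 50, 1.0
dt = T / n_steps

policy = nn.Sequential(nn.Linear(d + 1, 64), nn.Tanh(), nn.Linear(64, d))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def b(x):            # illustrative drift (pull toward the origin)
    return -x

def f(x):            # illustrative running state cost
    return 0.5 * (x ** 2).sum(-1)

def g(x):            # illustrative terminal cost
    return (x ** 2).sum(-1)

def u(x, t):         # control network u_theta(x, t)
    t_col = torch.full_like(x[..., :1], t)
    return policy(torch.cat([x, t_col], dim=-1))

for step in range(200):
    batch = 256
    x = torch.zeros(batch, d)
    xs, dWs = [], []
    run_cost = torch.zeros(batch)

    # 1) Simulate on-policy trajectories with the SDE fully detached from autograd.
    with torch.no_grad():
        for k in range(n_steps):
            uk = u(x, k * dt)
            dW = torch.randn_like(x) * dt ** 0.5
            xs.append(x)
            dWs.append(dW)
            run_cost += (f(x) + 0.5 * (uk ** 2).sum(-1)) * dt
            x = x + (b(x) + sigma * uk) * dt + sigma * dW
        total_cost = run_cost + g(x)   # pathwise cost, used as a constant weight below

    # 2) Re-evaluate the control on the frozen states and build a surrogate loss whose
    #    gradient is a likelihood-ratio (Girsanov-weighted) estimate of dJ/dtheta.
    quad = torch.zeros(batch)
    score = torch.zeros(batch)
    for k in range(n_steps):
        uk = u(xs[k], k * dt)          # gradients flow only through u_theta here
        quad = quad + 0.5 * (uk ** 2).sum(-1) * dt      # explicit theta-dependence of the cost
        score = score + (uk * dWs[k]).sum(-1) / sigma   # its theta-gradient is grad log p_theta(path)
    surrogate = (quad + total_cost.detach() * score).mean()

    opt.zero_grad()
    surrogate.backward()
    opt.step()

Because gradients flow only through the control re-evaluated on stored states, memory grows with the number of stored states rather than with the depth of a differentiated SDE solver, which is the kind of speed and memory saving the abstract refers to.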

@article{hua2025_2410.05163,
  title={An Efficient On-Policy Deep Learning Framework for Stochastic Optimal Control},
  author={Mengjian Hua and Mathieu Laurière and Eric Vanden-Eijnden},
  journal={arXiv preprint arXiv:2410.05163},
  year={2025}
}