
Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward. Computer Vision and Pattern Recognition (CVPR), 2024
Training Efficient Controllers via Analytic Policy Gradient. IEEE International Conference on Robotics and Automation (ICRA), 2022