arXiv:2305.13301 (v4, latest)
Training Diffusion Models with Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
22 May 2023
Kevin Black
Michael Janner
Yilun Du
Ilya Kostrikov
Sergey Levine
EGVM
HuggingFace (4 upvotes)
Papers citing "Training Diffusion Models with Reinforcement Learning"
50 / 270 papers shown
Fine-Grained GRPO for Precise Preference Alignment in Flow Models
Yujie Zhou
Pengyang Ling
Jiazi Bu
Yibin Wang
Yuhang Zang
Jiaqi Wang
Li Niu
Guangtao Zhai
DiffM
225
3
0
02 Oct 2025
Plug-and-Play Prompt Refinement via Latent Feedback for Diffusion Model Alignment
Suhyeon Lee
Jong Chul Ye
167
0
0
01 Oct 2025
DisCo: Reinforcement with Diversity Constraints for Multi-Human Generation
Shubhankar Borse
Farzad Farhadzadeh
Munawar Hayat
Fatih Porikli
EGVM
257
2
0
01 Oct 2025
PCPO: Proportionate Credit Policy Optimization for Aligning Image Generation Models
J. Lee
Jong Chul Ye
111
0
0
30 Sep 2025
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance
Jiayi Guo
Chuanhao Yan
Xingqian Xu
Yulin Wang
Kai Wang
Gao Huang
Humphrey Shi
143
1
0
30 Sep 2025
Training-Free Reward-Guided Image Editing via Trajectory Optimal Control
J. Chang
Jaemin Kim
Jong Chul Ye
183
0
0
30 Sep 2025
TraceDet: Hallucination Detection from the Decoding Trace of Diffusion Large Language Models
Shenxu Chang
Junchi Yu
Weixing Wang
Yongqiang Chen
Jialin Yu
Philip Torr
Jindong Gu
HILM
156
0
0
30 Sep 2025
Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models
Shuchen Xue
Chongjian Ge
Shilong Zhang
Yichen Li
Zhi-Ming Ma
142
3
0
29 Sep 2025
Enhancing Blind Face Restoration through Online Reinforcement Learning
Bin Wu
Yahui Liu
Chi Zhang
Yao-Min Zhao
Wei Wang
CVBM
OffRL
CLL
OnRL
432
0
0
27 Sep 2025
Follow-Your-Preference: Towards Preference-Aligned Image Inpainting
Yutao Shen
Junkun Yuan
Toru Aonishi
Hideki Nakayama
Yue Ma
EGVM
192
3
0
27 Sep 2025
RAPID^3: Tri-Level Reinforced Acceleration Policies for Diffusion Transformer
Wangbo Zhao
Yizeng Han
Zhiwei Tang
Jiasheng Tang
Pengfei Zhou
Kai Wang
Bohan Zhuang
Zinan Lin
Fan Wang
Yang You
184
1
0
26 Sep 2025
MultiCrafter: High-Fidelity Multi-Subject Generation via Disentangled Attention and Identity-Aware Preference Alignment
Tao Wu
Yibo Jiang
Yehao Lu
Zhizhong Wang
Longxiang Zhang
Zequn Qin
Xi Li
212
1
0
26 Sep 2025
d2: Improved Techniques for Training Reasoning Diffusion Language Models
Guanghan Wang
Yair Schiff
Gilad Turok
Volodymyr Kuleshov
DiffM
OffRL
LRM
192
6
0
25 Sep 2025
DriftLite: Lightweight Drift Control for Inference-Time Scaling of Diffusion Models
Yinuo Ren
Wenhao Gao
Lexing Ying
Grant M. Rotskoff
Jiequn Han
194
4
0
25 Sep 2025
PIRF: Physics-Informed Reward Fine-Tuning for Diffusion Models
Mingze Yuan
Pengfei Jin
Na Li
Shijie Zhao
AI4CE
141
0
0
24 Sep 2025
ComposableNav: Instruction-Following Navigation in Dynamic Environments via Composable Diffusion
Zichao Hu
Chen Tang
M. Munje
Yifeng Zhu
Alex Liu
Shuijing Liu
Garrett A. Warnell
Peter Stone
Joydeep Biswas
146
1
0
22 Sep 2025
RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation
Tianyi Yan
Wencheng Han
Xia Zhou
Xueyang Zhang
Kun Zhan
Cheng-Zhong Xu
Jianbing Shen
EGVM
VGen
282
4
0
20 Sep 2025
DiffusionNFT: Online Diffusion Reinforcement with Forward Process
Kaiwen Zheng
Huayu Chen
Haotian Ye
Haoxiang Wang
Qinsheng Zhang
Kai Jiang
Hang Su
Stefano Ermon
Jun Zhu
Ming-Yu Liu
241
14
0
19 Sep 2025
Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search
Zhiyu Mou
Yiqin Lv
Miao Xu
Cheems Wang
Yixiu Mao
...
Chao Li
Rongquan Bai
Chuan Yu
Jian Xu
Bo Zheng
OffRL
213
0
0
19 Sep 2025
What Makes a Good Generated Image? Investigating Human and Multimodal LLM Image Preference Alignment
Rishab Parthasarathy
Jasmine Collins
Cory Stephenson
EGVM
187
0
0
16 Sep 2025
RewardDance: Reward Scaling in Visual Generation
Jie Wu
Yu Gao
Zilyu Ye
Ming Li
Liang Li
...
Zeyue Xue
Xiaoxia Hou
Wei Liu
Yan Zeng
Weilin Huang
EGVM
218
20
0
10 Sep 2025
UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward
Yufeng Cheng
Wenxu Wu
Shaojin Wu
Mengqi Huang
Fei Ding
Qian He
111
6
0
08 Sep 2025
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
Xiangwei Shen
Zhimin Li
Zhantao Yang
Shiyi Zhang
Yingfang Zhang
Donghao Li
Chunyu Wang
Qinglin Lu
Yansong Tang
317
15
0
08 Sep 2025
Coefficients-Preserving Sampling for Reinforcement Learning with Flow Matching
Feng Wang
Zihao Yu
DiffM
256
12
0
07 Sep 2025
Moment- and Power-Spectrum-Based Gaussianity Regularization for Text-to-Image Models
Jisung Hwang
Jaihoon Kim
Minhyuk Sung
137
0
0
07 Sep 2025
BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models
Yuming Li
Y. Wang
Yuying Zhu
Zhongyu Zhao
Ming Lu
Qi She
Shanghang Zhang
281
17
0
07 Sep 2025
Diffusion Generative Models Meet Compressed Sensing, with Applications to Imaging and Finance
Zhengyi Guo
Jiatu Li
Wenpin Tang
D. Yao
DiffM
MedIm
237
0
0
04 Sep 2025
Skywork UniPic 2.0: Building Kontext Model with Online RL for Unified Multimodal Model
Hongyang Wei
Baixin Xu
Hongbo Liu
Cyrus Wu
J. Liu
...
Ying He
Yang Liu
Xuchen Song
Eric Li
Y. Zhou
182
13
0
04 Sep 2025
MEPG: Multi-Expert Planning and Generation for Compositionally-Rich Image Generation
Yuan Zhao
Lin Liu
DiffM
MoE
206
0
0
04 Sep 2025
Connections between reinforcement learning with feedback, test-time scaling, and diffusion guidance: An anthology
Yuchen Jiao
Yuxin Chen
Gen Li
OffRL
129
0
0
04 Sep 2025
Relative Trajectory Balance is equivalent to Trust-PCL
T. Deleu
Padideh Nouri
Yoshua Bengio
Doina Precup
OffRL
157
1
0
01 Sep 2025
FocusDPO: Dynamic Preference Optimization for Multi-Subject Personalized Image Generation via Adaptive Focus
Qiaoqiao Jin
Siming Fu
D. She
Weinan Jia
Hualiang Wang
Mu Liu
Jidong Jiang
144
0
0
01 Sep 2025
The Mind's Eye: A Multi-Faceted Reward Framework for Guiding Visual Metaphor Generation
Girish A. Koushik
Fatemeh Nazarieh
Katherine Birch
Shenbin Qian
Diptesh Kanojia
EGVM
118
0
0
26 Aug 2025
Composition and Alignment of Diffusion Models using Constrained Learning
Shervin Khalafi
Ignacio Hounie
Dongsheng Ding
Alejandro Ribeiro
163
2
0
26 Aug 2025
Constraints-Guided Diffusion Reasoner for Neuro-Symbolic Learning
Xuan Zhang
Zhijian Zhou
Weidi Xu
Yanting Miao
Chao Qu
Yuan Qi
NAI
176
0
0
22 Aug 2025
Guiding Diffusion Models with Reinforcement Learning for Stable Molecule Generation
Zhijian Zhou
Junyi An
Zongkai Liu
Yunfei Shi
Xuan Zhang
Fenglei Cao
Chao Qu
Yuan Qi
210
0
0
22 Aug 2025
Cognitive Structure Generation: From Educational Priors to Policy Optimization
Hengnian Gu
Zhifu Chen
Yuxin Chen
Jin Peng Zhou
Dongdai Zhou
DiffM
155
0
0
18 Aug 2025
MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models
Haoyu He
Katrin Renz
Yong Cao
Andreas Geiger
DiffM
179
5
0
18 Aug 2025
Integrating Reinforcement Learning with Visual Generative Models: Foundations and Advances
Yuanzhi Liang
Yijie Fang
Rui Li
Ziqi Ni
Ruijie Su
Chi Zhang
Xuelong Li
EGVM
315
2
0
14 Aug 2025
Object Fidelity Diffusion for Remote Sensing Image Generation
Ziqi Ye
Shuran Ma
Jie Yang
Xiaoyi Yang
Ziyang Gong
Xue Yang
Haipeng Wang
DiffM
211
1
0
14 Aug 2025
TempFlow-GRPO: When Timing Matters for GRPO in Flow Models
Xiaoxuan He
Siming Fu
Yuke Zhao
W. Li
Zhiqiang Wang
Dacheng Yin
Fengyun Rao
Bo Zhang
AI4CE
342
25
0
06 Aug 2025
ORVIT: Near-Optimal Online Distributionally Robust Reinforcement Learning
Debamita Ghosh
George Atia
Yue Wang
OffRL
OOD
302
3
0
05 Aug 2025
Uni-Layout: Integrating Human Feedback in Unified Layout Generation and Evaluation
Shuo Lu
Yanyin Chen
Wei Feng
Jiahao Fan
F. Li
Zheng Zhang
Jingjing Lv
Junjie Shen
Ching Law
Jian Liang
OffRL
149
7
0
04 Aug 2025
The Promise of RL for Autoregressive Image Editing
Saba Ahmadi
Rabiul Awal
Ankur Sikarwar
Amirhossein Kazemnejad
Ge Ya Luo
...
Sai Rajeswar
Siva Reddy
C. Pal
Benno Krojer
Aishwarya Agrawal
OffRL
KELM
271
2
0
01 Aug 2025
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
Junzhe Li
Yutao Cui
Tao Huang
Yinping Ma
Chun-Kai Fan
Miles Yang
Zhao Zhong
266
47
0
29 Jul 2025
X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again
Zigang Geng
Y. Wang
Yeyao Ma
Chen Li
Yongming Rao
...
Han Hu
Xiaosong Zhang
Linus
Di Wang
Jie Jiang
177
30
0
29 Jul 2025
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
Shijie Zhou
Ruiyi Zhang
Huaisheng Zhu
Branislav Kveton
Jiuxiang Gu
J. Gu
Jian Chen
Changyou Chen
MLLM
VLM
LRM
372
6
0
28 Jul 2025
Flow Matching Policy Gradients
David McAllister
Songwei Ge
Brent Yi
Chung Min Kim
Ethan Weber
Hongsuk Choi
Haiwen Feng
Angjoo Kanazawa
265
15
0
28 Jul 2025
TTS-VAR: A Test-Time Scaling Framework for Visual Auto-Regressive Generation
Zhekai Chen
Ruihang Chu
Yukang Chen
Shiwei Zhang
Yujie Wei
Yingya Zhang
Xihui Liu
260
8
0
24 Jul 2025
Inversion-DPO: Precise and Efficient Post-Training for Diffusion Models
Zejian Li
Yize Li
Chenye Meng
Zhongni Liu
Yang Ling
Shengyuan Zhang
Guang Yang
Changyuan Yang
Zhiyuan Yang
Lingyun Sun
371
5
0
14 Jul 2025