2305.13301
Training Diffusion Models with Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
22 May 2023
Kevin Black
Michael Janner
Yilun Du
Ilya Kostrikov
Sergey Levine
EGVM
Papers citing "Training Diffusion Models with Reinforcement Learning"
50 / 270 papers shown
Reinforcement Learning for Large Model: A Survey
Weijia Wu
Chen Gao
Joya Chen
Kevin Lin
Qingwei Meng
Yiming Zhang
Yuke Qiu
Hong Zhou
Mike Zheng Shou
317
2
0
24 Dec 2025
Diffusion Fine-Tuning via Reparameterized Policy Gradient of the Soft Q-Function
Hyeongyu Kang
Jaewoo Lee
Woocheol Shin
Kiyoung Om
Jinkyoo Park
101
0
0
04 Dec 2025
PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
Bowen Ping
Chengyou Jia
Minnan Luo
Changliang Xia
Xin Shen
Zhuohang Dang
Hangwei Qian
EGVM
70
0
0
02 Dec 2025
Multi-GRPO: Multi-Group Advantage Estimation for Text-to-Image Generation with Tree-Based Trajectories and Multiple Rewards
Qiang Lyu
Z. Chen
C. Wang
Haolin Shi
Shibo Gao
...
Jianlou Si
Fei Ding
Jing Li
Chun Pong Lau
Weiqiang Wang
EGVM
128
1
0
30 Nov 2025
Ar2Can: An Architect and an Artist Leveraging a Canvas for Multi-Human Generation
Shubhankar Borse
Phuc Pham
Farzad Farhadzadeh
Seokeon Choi
P. Nguyen
Anh Tran
Sungrack Yun
Munawar Hayat
Fatih Porikli
78
0
0
27 Nov 2025
Designing Instance-Level Sampling Schedules via REINFORCE with James-Stein Shrinkage
Peiyu Yu
Suraj Kothawade
Sirui Xie
Ying Nian Wu
Hongliang Fei
114
0
0
27 Nov 2025
SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition
Peiran Xu
Sudong Wang
Yao Zhu
Jianing Li
Yunjian Zhang
LRM
342
1
0
26 Nov 2025
Test-Time Alignment of Text-to-Image Diffusion Models via Null-Text Embedding Optimisation
Taehoon Kim
Henry Gouk
Timothy M. Hospedales
198
0
0
25 Nov 2025
HiCoGen: Hierarchical Compositional Text-to-Image Generation in Diffusion Models via Reinforcement Learning
Hongji Yang
Yucheng Zhou
Wencheng Han
Runzhou Tao
Zhongying Qiu
Jianfei Yang
Jianbing Shen
DiffM
EGVM
349
0
0
25 Nov 2025
The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation
Weijia Mao
Hao Chen
Zhenheng Yang
Mike Zheng Shou
EGVM
273
0
0
25 Nov 2025
Seeing What Matters: Visual Preference Policy Optimization for Visual Generation
Ziqi Ni
Yuanzhi Liang
Rui Li
Yi Zhou
H. Huang
Chi Zhang
Xuelong Li
117
0
0
24 Nov 2025
ProxT2I: Efficient Reward-Guided Text-to-Image Generation via Proximal Diffusion
Zhenghan Fang
Jian Zheng
Qiaozi Gao
Xiaofeng Gao
Jeremias Sulam
213
0
0
24 Nov 2025
Synthetic Curriculum Reinforces Compositional Text-to-Image Generation
Shijian Wang
Runhao Fu
Siyi Zhao
Qingqin Zhan
Xingjian Wang
Jiarui Jin
Yuan Lu
Hanqian Wu
Cunjian Chen
EGVM
226
0
0
23 Nov 2025
SceneDesigner: Controllable Multi-Object Image Generation with 9-DoF Pose Manipulation
Zhenyuan Qin
Xincheng Shuai
Henghui Ding
DiffM
197
1
0
20 Nov 2025
BD-Net: Has Depth-Wise Convolution Ever Been Applied in Binary Neural Networks?
DoYoung Kim
Jin-Seop Lee
Noo-Ri Kim
SungJoon Lee
Jee-Hyong Lee
MQ
153
3
0
19 Nov 2025
Masked Auto-Regressive Variational Acceleration: Fast Inference Makes Practical Reinforcement Learning
Yuxuan Gu
Weimin Bai
Yifei Wang
Weijian Luo
H. Sun
DiffM
OffRL
247
0
0
19 Nov 2025
Distribution Matching Distillation Meets Reinforcement Learning
Dengyang Jiang
Dongyang Liu
Zanyi Wang
Qilong Wu
Liuzhuozheng Li
...
Bo Zhang
Mengmeng Wang
Steven Hoi
Peng Gao
H. Yang
407
0
0
17 Nov 2025
Generative AI Meets 6G and Beyond: Diffusion Models for Semantic Communications
Hai-Long Qin
Jincheng Dai
Guo Lu
Shuo Shao
Sixian Wang
Tongda Xu
Wenjun Zhang
Ping Zhang
Khaled B. Letaief
DiffM
VLM
417
0
0
11 Nov 2025
PC-Diffusion: Aligning Diffusion Models with Human Preferences via Preference Classifier
S. Wang
He Wang
X. Wei
Longquan Dai
Jinhui Tang
203
0
0
11 Nov 2025
EVLP: Learning Unified Embodied Vision-Language Planner with Reinforced Supervised Fine-Tuning
Xinyan Cai
Shiguang Wu
Dafeng Chi
Yuzheng Zhuang
Xingyue Quan
Jianye Hao
Qiang Guan
100
0
0
03 Nov 2025
Reg-DPO: SFT-Regularized Direct Preference Optimization with GT-Pair for Improving Video Generation
Jie Du
Xinyu Gong
Qingshan Tan
W. Li
Yangming Cheng
Weitao Wang
Chenlu Zhan
Suhui Wu
H. Zhang
J. Zhang
VGen
365
0
0
03 Nov 2025
MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency
Nicolas Dufour
Lucas Degeorge
Arijit Ghosh
Vicky Kalogeiton
David Picard
EGVM
376
1
0
29 Oct 2025
Diffusion Adaptive Text Embedding for Text-to-Image Diffusion Models
Byeonghu Na
Minsang Park
Gyuwon Sim
DongHyeok Shin
Heesun Bae
Mina Kang
Se Jung Kwon
Wanmo Kang
Il-Chul Moon
230
1
0
28 Oct 2025
GRPO-Guard: Mitigating Implicit Over-Optimization in Flow Matching via Regulated Clipping
Jing Wang
Jiajun Liang
Jie Liu
Henglin Liu
Gongye Liu
...
Zhenyu Xie
Xintao Wang
Meng Wang
Pengfei Wan
Xiaodan Liang
161
1
0
25 Oct 2025
Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation
Yifu Luo
Penghui Du
Bo Li
Sinan Du
Tiantian Zhang
Yongzhe Chang
Kai Wu
Kun Gai
Xueqian Wang
151
5
0
24 Oct 2025
StableSketcher: Enhancing Diffusion Model for Pixel-based Sketch Generation via Visual Question Answering Feedback
Jiho Park
Sieun Choi
Jaeyoon Seo
Jihie Kim
DiffM
124
0
0
23 Oct 2025
From Competition to Synergy: Unlocking Reinforcement Learning for Subject-Driven Image Generation
Ziwei Huang
Ying Shu
Hao Fang
Quanyu Long
Wenya Wang
Qiushi Guo
Tiezheng Ge
Yaoyao Yu
EGVM
190
0
0
21 Oct 2025
Adaptive Divergence Regularized Policy Optimization for Fine-tuning Generative Models
Jiajun Fan
Tong Wei
Chaoran Cheng
Yuxin Chen
Ge Liu
100
1
0
20 Oct 2025
UniRL-Zero: Reinforcement Learning on Unified Models with Joint Language Model and Diffusion Model Experts
Fu-Yun Wang
Han Zhang
Michael Gharbi
Hongsheng Li
Taesung Park
150
0
0
20 Oct 2025
Fine-tuning Flow Matching Generative Models with Intermediate Feedback
Jiajun Fan
Chaoran Cheng
Shuaike Shen
Xiangxin Zhou
Ge Liu
EGVM
161
1
0
20 Oct 2025
Soft-Masked Diffusion Language Models
Michael Hersche
Samuel Moor-Smith
Thomas Hofmann
Abbas Rahimi
314
1
0
20 Oct 2025
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback
Zongjian Li
Zheyuan Liu
Qihui Zhang
Bin Lin
Feize Wu
...
Wangbo Yu
Yuwei Niu
Shaodong Wang
Xinhua Cheng
Li Yuan
400
13
0
19 Oct 2025
Iterative Refinement of Flow Policies in Probability Space for Online Reinforcement Learning
Mingyang Sun
Pengxiang Ding
Weinan Zhang
Donglin Wang
186
0
0
17 Oct 2025
DialectGen: Benchmarking and Improving Dialect Robustness in Multimodal Generation
Yu Zhou
Sohyun An
Haikang Deng
Da Yin
Clark Peng
Cho-Jui Hsieh
Kai-Wei Chang
Nanyun Peng
VLM
147
1
0
16 Oct 2025
RL-100: Performant Robotic Manipulation with Real-World Reinforcement Learning
Kun Lei
Huanyu Li
Dongjie Yu
Zhenyu Wei
Lingxiao Guo
Zhennan Jiang
Ziyu Wang
Shiyu Liang
Huazhe Xu
OffRL
VLM
354
5
0
16 Oct 2025
RealDPO: Real or Not Real, that is the Preference
Guo Cheng
Danni Yang
Ziqi Huang
Jianlou Si
Chenyang Si
Ziwei Liu
VGen
317
2
0
16 Oct 2025
ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints
Meiqi Wu
Jiashu Zhu
Xiaokun Feng
C. L. Philip Chen
Chen Zhu
Bingze Song
Fangyuan Mao
Jiahong Wu
Xiangxiang Chu
Kaiqi Huang
VGen
EGVM
VLM
360
1
0
16 Oct 2025
Learning an Image Editing Model without Image Editing Pairs
Nupur Kumari
Sheng-Yu Wang
Nanxuan Zhao
Yotam Nitzan
Yuheng Li
Krishna Kumar Singh
Richard Zhang
Eli Shechtman
Jun-Yan Zhu
Xun Huang
DiffM
309
3
0
16 Oct 2025
A Black-Box Debiasing Framework for Conditional Sampling
Han Cui
Jingbo Liu
82
0
0
13 Oct 2025
Understanding Sampler Stochasticity in Training Diffusion Models for RLHF
Jiayuan Sheng
Hanyang Zhao
Haoxian Chen
David Yao
Wenpin Tang
142
0
0
12 Oct 2025
Calibrating Generative Models to Distributional Constraints
Henry D. Smith
Nathaniel L. Diamant
Brian L. Trippe
155
0
0
11 Oct 2025
GTAlign: Game-Theoretic Alignment of LLM Assistants for Social Welfare
Siqi Zhu
David Zhang
Pedro Cisneros-Velarde
J. You
LRM
210
0
0
10 Oct 2025
Computationally-efficient Graph Modeling with Refined Graph Random Features
K. Choromanski
Avinava Dubey
Arijit Sehanobish
Isaac Reid
116
0
0
09 Oct 2025
Reinforcing Diffusion Models by Direct Group Preference Optimization
Yihong Luo
Tianyang Hu
Jing Tang
145
1
0
09 Oct 2025
Deterministic algorithms for inhomogeneous Bernoulli trials: Shapley value of network devices
Jesse D Wei
Guo Wei
FAtt
226
0
0
08 Oct 2025
No MoCap Needed: Post-Training Motion Diffusion Models with Reinforcement Learning using Only Textual Prompts
Girolamo Macaluso
Lorenzo Mandelli
Mirko Bicchierai
Stefano Berretti
Andrew D. Bagdanov
VGen
130
0
0
08 Oct 2025
Asynchronous Denoising Diffusion Models for Aligning Text-to-Image Generation
Zijing Hu
Yunze Tong
Fengda Zhang
Junkun Yuan
Jun Xiao
Kun Kuang
DiffM
188
1
0
06 Oct 2025
Principled and Tractable RL for Reasoning with Diffusion Language Models
Anthony Zhan
DiffM
AI4CE
111
2
0
05 Oct 2025
Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL
Ruitao Wu
Yifan Zhao
Guangyao Chen
Jia Li
230
0
0
04 Oct 2025
D2 Actor Critic: Diffusion Actor Meets Distributional Critic
Lunjun Zhang
Shuo Han
Hanrui Lyu
Bradly C. Stadie
OffRL
265
1
0
03 Oct 2025