Training Diffusion Models with Reinforcement Learning

International Conference on Learning Representations (ICLR), 2023
22 May 2023
Kevin Black
Michael Janner
Yilun Du
Ilya Kostrikov
Sergey Levine
    EGVM

Papers citing "Training Diffusion Models with Reinforcement Learning"

50 / 268 papers shown
Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling
Meihua Dang
Jiaqi Han
Minkai Xu
Kai Xu
Akash Srivastava
Stefano Ermon
DiffM
111
7
0
11 Jul 2025
Divergence Minimization Preference Optimization for Diffusion Model Alignment
Binxu Li
Minkai Xu
Jiaqi Han
Meihua Dang
Stefano Ermon
EGVM
267
2
0
10 Jul 2025
Discrete Diffusion Trajectory Alignment via Stepwise Decomposition
Jiaqi Han
Austin Wang
Minkai Xu
Wenda Chu
Meihua Dang
Yisong Yue
Stefano Ermon
181
4
0
07 Jul 2025
Interactive Groupwise Comparison for Reinforcement Learning from Human Feedback
Jan Kompatscher
Danqing Shi
Giovanna Varni
Tino Weinkauf
Antti Oulasvirta
VLM
169
1
0
06 Jul 2025
Iterative Distillation for Reward-Guided Fine-Tuning of Diffusion Models in Biomolecular Design
Xingyu Su
Xiner Li
Masatoshi Uehara
Sunwoo Kim
Yulai Zhao
Gabriele Scalia
Ehsan Hajiramezanali
Tommaso Biancalani
D. Zhi
Shuiwang Ji
182
5
0
01 Jul 2025
Nabla-R2D3: Effective and Efficient 3D Diffusion Alignment with 2D Rewards
Qingming Liu
Zhen Liu
Dinghuai Zhang
Kui Jia
255
2
0
18 Jun 2025
Where and How to Perturb: On the Design of Perturbation Guidance in Diffusion and Flow Models
Donghoon Ahn
Jiwon Kang
Sanghyun Lee
Minjae Kim
Jaewon Min
Wooseok Jang
Saungwu Lee
Sayak Paul
S. Hong
Seungryong Kim
DiffM
AAML
470
0
0
12 Jun 2025
ReGuidance: A Simple Diffusion Wrapper for Boosting Sample Quality on Hard Inverse Problems
Aayush Karan
Kulin Shah
Sitan Chen
293
1
0
12 Jun 2025
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
Zhengyao Lv
Tianlin Pan
Chenyang Si
Zhaoxi Chen
W. Zuo
Yu Qiao
Kwan-Yee K. Wong
288
5
0
09 Jun 2025
AssetDropper: Asset Extraction via Diffusion Models with Reward-Driven Optimization
Lanjiong Li
Guanhua Zhao
Lingting Zhu
Zeyu Cai
Ziqiang Li
Jian Zhang
Zeyu Wang
188
0
0
06 Jun 2025
When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration
Quan Shi
Carlos E. Jimenez
Shunyu Yao
Nick Haber
Diyi Yang
Karthik Narasimhan
328
1
0
05 Jun 2025
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
Ziyi Wu
Vidit Goel
Ivan Skorokhodov
Willi Menapace
Ashkan Mirzaei
Igor Gilitschenski
Sergey Tulyakov
Aliaksandr Siarohin
DiffM
VGen
389
11
0
04 Jun 2025
Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences
Yunhong Lu
Qichao Wang
H. Cao
Xiaoyin Xu
Min Zhang
332
5
0
03 Jun 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Hyojin Bahng
Caroline Chan
F. Durand
Phillip Isola
EGVM
412
7
0
02 Jun 2025
Psi-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models
Taehoon Yoon
Yunhong Min
Kyeongmin Yeo
Minhyuk Sung
358
0
0
02 Jun 2025
ADEPT: Adaptive Diffusion Environment for Policy Transfer Sim-to-Real
Youwei Yu
Junhong Xu
Lantao Liu
308
2
0
02 Jun 2025
Inference-Time Alignment of Diffusion Models via Evolutionary Algorithms
Purvish Jajal
Nick Eliopoulos
Benjamin Shiue-Hal Chou
George K. Thiruvathukal
James C. Davis
Yung-Hsiang Lu
187
1
0
30 May 2025
Diffusion Sampling Path Tells More: An Efficient Plug-and-Play Strategy for Sample Filtering
Sixian Wang
Zhiwei Tang
Tsung-Hui Chang
DiffM
172
0
0
29 May 2025
A Survey of Generative Categories and Techniques in Multimodal Generative Models
Longzhen Han
Awes Mubarak
Almas Baimagambetov
Nikolaos Polatidis
Thar Baker
LRM
399
0
0
29 May 2025
Rhetorical Text-to-Image Generation via Two-layer Diffusion Policy Optimization
Yuxi Zhang
Yueting Li
Xinyu Du
Sibo Wang
DiffM
EGVM
239
0
0
28 May 2025
Inference-Time Scaling of Discrete Diffusion Models via Importance Weighting and Optimal Proposal Design
Chinmay Pani
Yingzhen Li
DiffM
378
2
0
28 May 2025
SDPO: Importance-Sampled Direct Preference Optimization for Stable Diffusion Training
Xiaomeng Yang
Zhiyu Tan
Junyan Wang
Zhijian Zhou
Hao Li
289
0
0
28 May 2025
Text2Stereo: Repurposing Stable Diffusion for Stereo Generation with Consistency Rewards
Aakash Garg
Libing Zeng
Andrii Tsarov
N. Kalantari
263
0
0
27 May 2025
Decision Flow Policy Optimization
Jifeng Hu
Sili Huang
Siyuan Guo
Zhaogeng Liu
Li Shen
Lichao Sun
Hechang Chen
Yi-Ju Chang
Dacheng Tao
333
0
0
26 May 2025
LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models
Fengqi Zhu
Rongzhen Wang
Shen Nie
Xiaolu Zhang
Chunwei Wu
...
Jun Zhou
Jianfei Chen
Yankai Lin
Ji-Rong Wen
Chongxuan Li
425
93
0
25 May 2025
Step-level Reward for Free in RL-based T2I Diffusion Model Fine-tuning
Xinyao Liao
Wei Wei
Xiaoye Qu
Yu Cheng
EGVM
218
3
0
25 May 2025
Rethinking Direct Preference Optimization in Diffusion Models
Junyong Kang
Seohyun Lim
Kyungjune Baek
Hyunjung Shim
1.0K
0
0
24 May 2025
Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models
Min Cheng
Fatemeh Doudi
D. Kalathil
Mohammad Ghavamzadeh
P. R. Kumar
287
1
0
24 May 2025
InfLVG: Reinforce Inference-Time Consistent Long Video Generation with GRPO
Xueji Fang
Liyuan Ma
Zhiyang Chen
Mingyuan Zhou
Guo-Jun Qi
VGen
561
6
0
23 May 2025
A Minimalist Method for Fine-tuning Text-to-Image Diffusion Models
Yanting Miao
William Loh
Suraj Kothawade
Pascal Poupart
267
0
0
23 May 2025
Scaling Image and Video Generation via Test-Time Evolutionary Search
Haoran He
Jiajun Liang
X. Wang
Pengfei Wan
Di Zhang
Kun Gai
Ling Pan
DiffM
399
8
0
23 May 2025
RLVR-World: Training World Models with Reinforcement Learning
Jialong Wu
Shaofeng Yin
Ningya Feng
Mingsheng Long
OffRL
VGen
498
16
0
20 May 2025
Minimum-Excess-Work Guidance
Christopher Kolloff
Tobias Höppe
Emmanouil Angelis
Mathias Jacob Schreiner
Stefan Bauer
Andrea Dittadi
Simon Olsson
OT
388
0
0
19 May 2025
Towards Self-Improvement of Diffusion Models via Group Preference Optimization
Renjie Chen
Wenfeng Lin
Yichen Zhang
Jiangchuan Wei
Boyuan Liu
Chao Feng
Jiao Ran
Mingyu Guo
327
3
0
16 May 2025
CompAlign: Improving Compositional Text-to-Image Generation with a Complex Benchmark and Fine-Grained Feedback
Yixin Wan
Kai-Wei Chang
EGVM
CoGe
287
3
0
16 May 2025
Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models
International Conference on Learning Representations (ICLR), 2025
Fu-Yun Wang
Yunhao Shui
Jingtan Piao
Keqiang Sun
Jiaming Song
244
13
0
16 May 2025
DanceGRPO: Unleashing GRPO on Visual Generation
Zeyue Xue
Jie Wu
Yu Gao
Fangyuan Kong
Lingting Zhu
...
Zhiheng Liu
Wei Liu
Qiushan Guo
Weilin Huang
Ping Luo
EGVM
VGen
539
140
0
12 May 2025
You Only Look One Step: Accelerating Backpropagation in Diffusion Sampling with Gradient Shortcuts
Hongkun Dou
Zeyu Li
Xingyu Jiang
Haoyang Li
Lijun Yang
Wen Yao
Yue Deng
DiffM
529
0
0
12 May 2025
Flow-GRPO: Training Flow Matching Models via Online RL
Jie Liu
Gongye Liu
Jiajun Liang
Yongqian Li
Jiaheng Liu
Xinyu Wang
Pengfei Wan
Di Zhang
Wanli Ouyang
AI4CE
820
173
0
08 May 2025
Convergence Of Consistency Model With Multistep Sampling Under General Data Assumptions
Yiding Chen
Yiyi Zhang
Owen Oertell
Wen Sun
DiffM
262
2
0
06 May 2025
DRAGON: Distributional Rewards Optimize Diffusion Generative Models
Yatong Bai
Jonah Casebeer
Somayeh Sojoudi
Nicholas J. Bryan
DiffM
VLM
481
2
0
21 Apr 2025
Design Topological Materials by Reinforcement Fine-Tuned Generative Model
Haosheng Xu
Dongheng Qian
Zhixuan Liu
Yadong Jiang
Jing Wang
173
1
0
17 Apr 2025
Aligning Constraint Generation with Design Intent in Parametric CAD
Evan Casey
Tianyu Zhang
Shu Ishida
John Roger Thompson
Amir Khasahmadi
Joseph George Lambourne
P. Jayaraman
K. Willis
322
2
0
17 Apr 2025
ADT: Tuning Diffusion Models with Adversarial Supervision
Dazhong Shen
Guanglu Song
Yuanxing Zhang
Bingqi Ma
Lujundong Li
Shihong Deng
Zhuofan Zong
Y. Liu
DiffM
347
3
0
15 Apr 2025
Aligning Anime Video Generation with Human Feedback
Bingwen Zhu
Yudong Jiang
Baohan Xu
Siqian Yang
Mingyu Yin
Yidi Wu
Huyang Sun
Zuxuan Wu
EGVM
VGen
387
4
0
14 Apr 2025
F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization
Xiaohui Sun
Ruitong Xiao
Jianye Mo
Bowen Wu
Qun Yu
Baoxun Wang
476
13
0
03 Apr 2025
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
Yunhong Min
Daehyeon Choi
Kyeongmin Yeo
Jihyun Lee
Minhyuk Sung
464
1
0
28 Mar 2025
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
Ruining Li
Chuanxia Zheng
Christian Rupprecht
Andrea Vedaldi
350
10
0
28 Mar 2025
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
Jaihoon Kim
Taehoon Yoon
Jisung Hwang
Minhyuk Sung
DiffM
478
19
0
25 Mar 2025
RL4Med-DDPO: Reinforcement Learning for Controlled Guidance Towards Diverse Medical Image Generation using Vision-Language Foundation Models
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Parham Saremi
Amar Kumar
Mohammed Mohammed
Zahra Tehraninasab
Tal Arbel
LM&MA
MedIm
281
2
0
20 Mar 2025