2403.03206
Cited By
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
HuggingFace (68 upvotes)
Papers citing "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"
50 / 1,247 papers shown
Video Generators are Robot Policies
Junbang Liang
P. Tokmakov
Ruoshi Liu
Sruthi Sudhakar
Paarth Shah
Rares Andrei Ambrus
Carl Vondrick
VGen
284
15
0
01 Aug 2025
ROVI: A VLM-LLM Re-Captioned Dataset for Open-Vocabulary Instance-Grounded Text-to-Image Generation
Cihang Peng
Qiming Hou
Zhong Ren
Kun Zhou
ObjD
159
0
0
01 Aug 2025
PixNerd: Pixel Neural Field Diffusion
Shuai Wang
Ziteng Gao
Chenhui Zhu
Weilin Huang
Limin Wang
DiffM
220
16
0
31 Jul 2025
UniLiP: Adapting CLIP for Unified Multimodal Understanding, Generation and Editing
Hao Tang
Chenwei Xie
Xiaoyi Bao
Tingyu Weng
P. Li
Yun Zheng
Liwei Wang
234
10
0
31 Jul 2025
H-RDT: Human Manipulation Enhanced Bimanual Robotic Manipulation
Hongzhe Bi
Lingxuan Wu
Tianwei Lin
Hengkai Tan
Zhizhong Su
Hang Su
Jun-Jie Zhu
197
13
0
31 Jul 2025
One-Step Flow Policy Mirror Descent
Tianyi Chen
Haitong Ma
Na Li
Kai Wang
Bo Dai
258
1
0
31 Jul 2025
DivControl: Knowledge Diversion for Controllable Image Generation
Yucheng Xie
Fu Feng
Ruixiao Shi
Jing Wang
Yong Rui
Xin Geng
174
1
0
31 Jul 2025
On the Reliability of Vision-Language Models Under Adversarial Frequency-Domain Perturbations
Jordan Vice
Naveed Akhtar
Yansong Gao
Richard Hartley
Ajmal Mian
AAML
210
2
0
30 Jul 2025
Enhancing Generalization in Data-free Quantization via Mixup-class Prompting
Jiwoong Park
Chaeun Lee
Yongseok Choi
Sein Park
Deokki Hong
Jungwook Choi
MQ
190
0
0
29 Jul 2025
X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again
Zigang Geng
Y. Wang
Yeyao Ma
Chen Li
Yongming Rao
...
Han Hu
Xiaosong Zhang
Linus
Di Wang
Jie Jiang
177
30
0
29 Jul 2025
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
Junzhe Li
Yutao Cui
Tao Huang
Yinping Ma
Chun-Kai Fan
Miles Yang
Zhao Zhong
266
47
0
29 Jul 2025
Harnessing Diffusion-Yielded Score Priors for Image Restoration
Xinqi Lin
Fanghua Yu
Jinfan Hu
Zhiyuan You
Wu Shi
Jimmy S. J. Ren
Jinjin Gu
Chao Dong
DiffM
287
7
0
28 Jul 2025
JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
Renhang Liu
Chia-Yu Hung
Navonil Majumder
Taylor Gautreaux
Amir Ali Bagherzadeh
Chuan Li
Dorien Herremans
Soujanya Poria
DiffM
182
4
0
28 Jul 2025
Investigation of Accuracy and Bias in Face Recognition Trained with Synthetic Data
Pavel Korshunov
Ketan Kotwal
Christophe Ecabert
Vidit Vidit
A. Mohammadi
Sébastien Marcel
176
2
0
28 Jul 2025
HDR Environment Map Estimation with Latent Diffusion Models
Jack Hilliard
Adrian Hilton
Jean-Yves Guillemaut
DiffM
143
0
0
28 Jul 2025
Fine-structure Preserved Real-world Image Super-resolution via Transfer VAE Training
Qiaosi Yi
Shuai Li
Rongyuan Wu
Lingchen Sun
Y. Wu
Lei Zhang
SupR
284
7
0
27 Jul 2025
ATCTrack: Aligning Target-Context Cues with Dynamic Target States for Robust Vision-Language Tracking
X. Feng
Shuyan Hu
X. Li
D. Zhang
M. Wu
Jie Zhang
Xiaosha Chen
K. Huang
184
3
0
26 Jul 2025
LLMControl: Grounded Control of Text-to-Image Diffusion-based Synthesis with Multimodal LLMs
Jiaze Wang
Rui Chen
Haowang Cui
181
0
0
26 Jul 2025
Back to the Features: DINO as a Foundation for Video World Models
Federico Baldassarre
Marc Szafraniec
Basile Terver
Vasil Khalidov
Francisco Massa
Yann LeCun
Patrick Labatut
Maximilian Seitzer
Piotr Bojanowski
VGen
197
26
0
25 Jul 2025
RealisVSR: Detail-enhanced Diffusion for Real-World 4K Video Super-Resolution
Weisong Zhao
Jingkai Zhou
Xiangyu Zhu
Weihua Chen
Xiao-Yu Zhang
Zhen Lei
Fan Wang
VGen
147
1
0
25 Jul 2025
Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment
Ying Ba
Tianyu Zhang
Yalong Bai
Wenyi Mo
Tao Liang
Bing Su
Ji-Rong Wen
EGVM
241
6
0
25 Jul 2025
GS-Occ3D: Scaling Vision-only Occupancy Reconstruction with Gaussian Splatting
Baijun Ye
Minghui Qin
Saining Zhang
Moonjun Gong
Shaoting Zhu
Zebang Shen
Luan Zhang
Lu Zhang
Hang Zhao
358
3
0
25 Jul 2025
Identifying Prompted Artist Names from Generated Images
Grace Su
Sheng-Yu Wang
Aaron Hertzmann
Eli Shechtman
Jun-Yan Zhu
Richard Zhang
VLM
177
0
0
24 Jul 2025
TTS-VAR: A Test-Time Scaling Framework for Visual Auto-Regressive Generation
Zhekai Chen
Ruihang Chu
Yukang Chen
Shiwei Zhang
Yujie Wei
Yingya Zhang
Xihui Liu
260
8
0
24 Jul 2025
TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance
Minghao Fu
Guo-Hua Wang
Xiaohao Chen
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
228
0
0
24 Jul 2025
Adversarial Distribution Matching for Diffusion Distillation Towards Efficient Image and Video Synthesis
Yanzuo Lu
Yuxi Ren
Xin Xia
Shanchuan Lin
Xing Wang
Xuefeng Xiao
Andy J. Ma
Xiaohua Xie
Jian-Huang Lai
DiffM
272
11
0
24 Jul 2025
Zero-Shot Dynamic Concept Personalization with Grid-Based LoRA
Rameen Abdal
Or Patashnik
Ekaterina Deyneka
Hao Chen
Aliaksandr Siarohin
Sergey Tulyakov
Daniel Cohen-Or
Kfir Aberman
DiffM
VGen
123
3
0
23 Jul 2025
Detail++: Training-Free Detail Enhancer for Text-to-Image Diffusion Models
L. Chen
Jiner Wang
Zihao Pan
B. Zhu
Xiaofeng Yang
Chi Zhang
DiffM
182
1
0
23 Jul 2025
Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling
Yi Xin
Juncheng Yan
Qi Qin
Ge Wang
Dongyang Liu
...
Jiaming Song
Guangtao Zhai
Xiaohong Liu
Botian Shi
Peng Gao
211
24
0
23 Jul 2025
A Practical Investigation of Spatially-Controlled Image Generation with Transformers
Guoxuan Xia
Harleen Hanspal
Petru-Daniel Tudosiu
Shifeng Zhang
Sarah Parisot
210
0
0
21 Jul 2025
FreeCus: Free Lunch Subject-driven Customization in Diffusion Transformers
Yanbing Zhang
Zhe Wang
Qin Zhou
Mengping Yang
DiffM
140
1
0
21 Jul 2025
Aesthetics is Cheap, Show me the Text: An Empirical Evaluation of State-of-the-Art Generative Models for OCR
Peirong Zhang
Haowei Xu
Jiaxin Zhang
Guitao Xu
Xuhan Zheng
Zhenhua Yang
Junle Liu
Yuyi Zhang
Lianwen Jin
EGVM
296
2
0
20 Jul 2025
SegQuant: A Semantics-Aware and Generalizable Quantization Framework for Diffusion Models
Jiaji Zhang
Ruichao Sun
Hailiang Zhao
Jiaju Wu
Peng Chen
Hao Li
Y. Liu
Xinkui Zhao
Kingsum Chow
Gang Xiong
DiffM
MQ
364
0
0
20 Jul 2025
PositionIC: Unified Position and Identity Consistency for Image Customization
J. Hu
Tianyang Han
K. Ma
Jialin Gao
Hao Dou
Xianhua He
Jianhui Zhang
Junfeng Luo
DiffM
364
2
0
18 Jul 2025
Imbalance in Balance: Online Concept Balancing in Generation Models
Yukai Shi
Jiarong Ou
Rui Chen
Haotian Yang
Jiahao Wang
Xin Tao
Pengfei Wan
Di Zhang
Kun Gai
216
0
0
17 Jul 2025
VITA: Vision-to-Action Flow Matching Policy
D. Gao
Boqi Zhao
Andrew Lee
Ian Chuang
Hanchu Zhou
Hang Wang
Zhe Zhao
Junshan Zhang
Iman Soltani
VGen
219
3
0
17 Jul 2025
Taming Diffusion Transformer for Efficient Mobile Video Generation in Seconds
Yushu Wu
Yanyu Li
Vidit Goel
Ivan Skorokhodov
Willi Menapace
...
Ju Hu
Aliaksandr Siarohin
Dhritiman Sagar
Yanzhi Wang
Sergey Tulyakov
VGen
256
1
0
17 Jul 2025
Cameras as Relative Positional Encoding
Ruilong Li
Brent Yi
Junchen Liu
Hang Gao
Yi Ma
Angjoo Kanazawa
ViT
246
20
0
14 Jul 2025
Flows and Diffusions on the Neural Manifold
Daniel Saragih
Deyu Cao
Tejas Balaji
DiffM
AI4CE
252
2
0
14 Jul 2025
MP1: MeanFlow Tames Policy Learning in 1-step for Robotic Manipulation
Juyi Sheng
Liang Luo
Peiming Li
Yong Liu
265
5
0
14 Jul 2025
Latent Diffusion Models with Masked AutoEncoders
Junho Lee
Jeongwoo Shin
Hyungwook Choi
Joonseok Lee
DiffM
207
5
0
14 Jul 2025
From Wardrobe to Canvas: Wardrobe Polyptych LoRA for Part-level Controllable Human Image Generation
J. Kim
S. Park
Hyoungwoo Park
Sungrack Yun
Jaegul Choo
Seokeon Choi
DiffM
286
0
0
14 Jul 2025
CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design
Prashant Govindarajan
Davide Baldelli
Jay Pathak
Quentin Fournier
Sarath Chandar
129
7
0
13 Jul 2025
Upsample What Matters: Region-Adaptive Latent Sampling for Accelerated Diffusion Transformers
Wongi Jeong
Kyungryeol Lee
H. Seo
Se Young Chun
209
6
0
11 Jul 2025
Divergence Minimization Preference Optimization for Diffusion Model Alignment
Binxu Li
Minkai Xu
Jiaqi Han
Meihua Dang
Stefano Ermon
EGVM
272
2
0
10 Jul 2025
ADIEE: Automatic Dataset Creation and Scorer for Instruction-Guided Image Editing Evaluation
Sherry X. Chen
Yi Wei
Luowei Zhou
Suren Kumar
241
3
0
09 Jul 2025
Scaling can lead to compositional generalization
Florian Redhardt
Yassir Akram
Simon Schug
GNN
CoGe
208
0
0
09 Jul 2025
Bridging the Last Mile of Prediction: Enhancing Time Series Forecasting with Conditional Guided Flow Matching
Huibo Xu
Runlong Yu
Likang Wu
Xianquan Wang
Qi Liu
AI4TS
268
1
0
09 Jul 2025
Concept-TRAK: Understanding how diffusion models learn concepts through concept-level attribution
Yonghyun Park
Chieh-Hsin Lai
Satoshi Hayakawa
Yuhta Takida
Naoki Murata
Wei-Hsiang Liao
Woosung Choi
K. Cheuk
Junghyun Koo
Yuki Mitsufuji
226
1
0
09 Jul 2025
Integrating Diffusion-based Multi-task Learning with Online Reinforcement Learning for Robust Quadruped Robot Control
Xinyao Qin
Xiaoteng Ma
Yang Qi
Qihan Liu
Chuanyi Xue
Ning Gui
Qinyu Dong
Jun Yang
Bin Liang
235
2
0
08 Jul 2025