2403.03206
Cited By
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
HuggingFace (68 upvotes)
Papers citing
"Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"
50 / 1,219 papers shown
Title
A Connection Between Score Matching and Local Intrinsic Dimension
Eric Yeats
Aaron Jacobson
Darryl Hannan
Yiran Jia
T. Doster
Henry Kvinge
Scott Mahan
DiffM
152
1
0
14 Oct 2025
Diffusion Models for Reinforcement Learning: Foundations, Taxonomy, and Development
Changfu Xu
Jianxiong Guo
Yuzhu Liang
Haiyang Huang
Haodong Zou
Xi Zheng
Shui Yu
Xiaowen Chu
Jiannong Cao
Tian-sheng Wang
OffRL
AI4CE
171
0
0
14 Oct 2025
Into the Unknown: Towards using Generative Models for Sampling Priors of Environment Uncertainty for Planning in Configuration Spaces
Subhransu S. Bhattacharjee
Hao Lu
Dylan Campbell
Rahul Shome
3DPC
100
0
0
13 Oct 2025
Massive Activations are the Key to Local Detail Synthesis in Diffusion Transformers
Chaofan Gan
Zicheng Zhao
Yuanpeng Tu
Xi Chen
Ziran Qin
Yun Xu
Mehrtash Harandi
W. Lin
132
0
0
13 Oct 2025
DreamMakeup: Face Makeup Customization using Latent Diffusion Models
Geon Yeong Park
Inhwa Han
Serin Yang
Yeobin Hong
Seongmin Jeong
Heechan Jeon
Myeongjin Goh
Sung Won Yi
Jin Nam
J. C. Ye
DiffM
91
0
0
13 Oct 2025
IUT-Plug: A Plug-in tool for Interleaved Image-Text Generation
Zeteng Lin
X. Li
Wen You
Xiaoyang Li
Zehan Lu
Yujun Cai
Jing Tang
76
0
0
13 Oct 2025
Flow Matching-Based Autonomous Driving Planning with Advanced Interactive Behavior Modeling
Tianyi Tan
Yinan Zheng
Ruiming Liang
Zexu Wang
Kexin Zheng
Jinliang Zheng
Jianxiong Li
Xianyuan Zhan
Jingjing Liu
80
3
0
13 Oct 2025
DAWP: A framework for global observation forecasting via Data Assimilation and Weather Prediction in satellite observation space
Junchao Gong
Jingyi Xu
Ben Fei
Zhangrui Li
W. Zhang
Kun Chen
Wanghan Xu
Weidong Yang
Xiaokang Yang
Lei Bai
92
0
0
13 Oct 2025
VLM-Guided Adaptive Negative Prompting for Creative Generation
Shelly Golan
Yotam Nitzan
Zongze Wu
Or Patashnik
DiffM
120
0
0
12 Oct 2025
UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation
Zhengrong Yue
H. Zhang
Xiangyu Zeng
Boyu Chen
Chenting Wang
...
Lu Dong
Kunpeng Du
Yi Wang
Limin Wang
Yali Wang
160
7
0
12 Oct 2025
EditCast3D: Single-Frame-Guided 3D Editing with Video Propagation and View Selection
Huaizhi Qu
Ruichen Zhang
Shuqing Luo
Luchao Qi
Zhihao Zhang
Xiaoming Liu
Roni Sengupta
Tianlong Chen
DiffM
VGen
100
0
0
11 Oct 2025
X-VLA: Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model
Jinliang Zheng
Jianxiong Li
Zhihao Wang
Dongxiu Liu
Xirui Kang
...
Ya-Qin Zhang
Jiangmiao Pang
Jingjing Liu
Tai Wang
Xianyuan Zhan
LM&Ro
204
6
0
11 Oct 2025
ReMix: Towards a Unified View of Consistent Character Generation and Editing
Benjia Zhou
Bin-Bin Fu
Pei Cheng
Y. Wang
Jiayuan Fan
Tao Chen
DiffM
92
0
0
11 Oct 2025
Semantic Visual Anomaly Detection and Reasoning in AI-Generated Images
Chuangchuang Tan
Xiang Ming
Jinglu Wang
Renshuai Tao
Bin Li
Y. X. Wei
Yao Zhao
Yan Lu
72
0
0
11 Oct 2025
Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy
Xiaoxiao Ma
Feng Zhao
Pengyang Ling
Haibo Qiu
Zhixiang Wei
Hu Yu
Jie Huang
Zhixiong Zeng
Lin Ma
102
2
0
10 Oct 2025
MMAudioSep: Taming Video-to-Audio Generative Model Towards Video/Text-Queried Sound Separation
Akira Takahashi
Shusuke Takahashi
Yuki Mitsufuji
VGen
88
0
0
10 Oct 2025
Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
Wuyang Li
W. Pan
Po-Chien Luan
Yang Gao
Alexandre Alahi
DiffM
VGen
116
6
0
10 Oct 2025
GTAlign: Game-Theoretic Alignment of LLM Assistants for Social Welfare
Siqi Zhu
David Zhang
Pedro Cisneros-Velarde
J. You
LRM
146
0
0
10 Oct 2025
Reinforcing Diffusion Models by Direct Group Preference Optimization
Yihong Luo
Tianyang Hu
Jing Tang
116
0
0
09 Oct 2025
UniMMVSR: A Unified Multi-Modal Framework for Cascaded Video Super-Resolution
Shian Du
Menghan Xia
Chang-rui Liu
Quande Liu
Xintao Wang
Pengfei Wan
Xiangyang Ji
VGen
SupR
208
0
0
09 Oct 2025
UniVideo: Unified Understanding, Generation, and Editing for Videos
Cong Wei
Quande Liu
Zixuan Ye
Qiulin Wang
Xintao Wang
Pengfei Wan
Kun Gai
Wenhu Chen
VGen
233
11
0
09 Oct 2025
Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency
Kaiwen Zheng
Y. Wang
Qianli Ma
Huayu Chen
J. Zhang
Yogesh Balaji
Jianfei Chen
Ming-Yu Liu
Jun Zhu
Qinsheng Zhang
DiffM
245
9
0
09 Oct 2025
FlexTraj: Image-to-Video Generation with Flexible Point Trajectory Control
Zhiyuan Zhang
Can Wang
Dongdong Chen
Jing Liao
VGen
236
2
0
09 Oct 2025
Enhancing Reasoning for Diffusion LLMs via Distribution Matching Policy Optimization
Y. Zhu
Wei Guo
Jaemoo Choi
Petr Molodyk
Bo Yuan
Molei Tao
Yongxin Chen
LRM
157
3
0
09 Oct 2025
CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving
Tianrui Zhang
Yichen Liu
Zilin Guo
Yuxin Guo
Jingcheng Ni
Chenjing Ding
Dan Xu
Lewei Lu
Z. Wu
VGen
166
0
0
09 Oct 2025
Computationally-efficient Graph Modeling with Refined Graph Random Features
K. Choromanski
Avinava Dubey
Arijit Sehanobish
Isaac Reid
104
0
0
09 Oct 2025
TTOM: Test-Time Optimization and Memorization for Compositional Video Generation
Leigang Qu
Ziyang Wang
Na Zheng
Wenjie Wang
Liqiang Nie
Tat-Seng Chua
146
1
0
09 Oct 2025
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
Kang Liao
Size Wu
Zhonghua Wu
Linyi Jin
Chao Wang
Y. Wang
Fei Wang
Wei Li
Chen Change Loy
MLLM
VGen
152
2
0
09 Oct 2025
The False Promise of Zero-Shot Super-Resolution in Machine-Learned Operators
Mansi Sakarvadia
Kareem Hegazy
A. Totounferoush
Kyle Chard
Yaoqing Yang
Ian Foster
Michael W. Mahoney
SupR
244
1
0
08 Oct 2025
MATRIX: Mask Track Alignment for Interaction-aware Video Generation
Siyoon Jin
S. Kim
Dahyun Chung
J. Lee
Hyunwook Choi
Jisu Nam
J. Kim
S. Kim
VGen
90
1
0
08 Oct 2025
Efficient High-Resolution Image Editing with Hallucination-Aware Loss and Adaptive Tiling
Young D. Kwon
Abhinav Mehrotra
Malcolm Chadwick
Alberto Gil C. P. Ramos
S. Bhattacharya
DiffM
136
0
0
07 Oct 2025
Riddled basin geometry sets fundamental limits to predictability and reproducibility in deep learning
Andrew Ly
Pulin Gong
AI4CE
180
0
0
07 Oct 2025
SONA: Learning Conditional, Unconditional, and Mismatching-Aware Discriminator
Yuhta Takida
Satoshi Hayakawa
Takashi Shibuya
Masaaki Imaizumi
Naoki Murata
Bac Nguyen
Toshimitsu Uesaka
Chieh-Hsin Lai
Yuki Mitsufuji
DiffM
120
0
0
06 Oct 2025
Asynchronous Denoising Diffusion Models for Aligning Text-to-Image Generation
Zijing Hu
Yunze Tong
Fengda Zhang
Junkun Yuan
Jun Xiao
Kun Kuang
DiffM
159
1
0
06 Oct 2025
Factuality Matters: When Image Generation and Editing Meet Structured Visuals
Le Zhuo
Songhao Han
Yuandong Pu
Boxiang Qiu
Sayak Paul
...
Yihao Liu
Jie Shao
Xi Chen
Si Liu
Hongsheng Li
EGVM
222
1
0
06 Oct 2025
Glocal Information Bottleneck for Time Series Imputation
Jie Yang
Kexin Zhang
Guibin Zhang
Philip S. Yu
Kaize Ding
AI4TS
133
0
0
06 Oct 2025
VChain: Chain-of-Visual-Thought for Reasoning in Video Generation
Longxiang Zhang
Ning Yu
Gordon Chen
Haonan Qiu
P. Debevec
Ziwei Liu
VGen
LRM
69
6
0
06 Oct 2025
SSDD: Single-Step Diffusion Decoder for Efficient Image Tokenization
Théophane Vallaeys
Jakob Verbeek
Matthieu Cord
DiffM
212
3
0
06 Oct 2025
StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation
Mingyu Liu
Jiuhe Shu
Hui Chen
Zeju Li
Canyu Zhao
J. Yang
Shenyuan Gao
Hao Chen
Chunhua Shen
92
1
0
06 Oct 2025
TBStar-Edit: From Image Editing Pattern Shifting to Consistency Enhancement
Hao Fang
Zechao Zhan
Weixin Feng
Ziwei Huang
Xubin Li
Tiezheng Ge
DiffM
310
0
0
06 Oct 2025
SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder
Ronen Kamenetsky
Sara Dorfman
Daniel Garibi
Roni Paiss
Or Patashnik
Daniel Cohen-Or
DiffM
287
0
0
06 Oct 2025
TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling
Hyunmin Cho
Donghoon Ahn
S. Hong
J. Kim
Seungryong Kim
Kyong Hwan Jin
DiffM
120
0
0
06 Oct 2025
Scaling Sequence-to-Sequence Generative Neural Rendering
Shikun Liu
Kam Woh Ng
Wonbong Jang
Jiadong Guo
Junlin Han
...
Juan C. Pérez
Zijian Zhou
Chi Phung
Tao Xiang
Juan-Manuel Perez-Rua
VGen
105
0
0
05 Oct 2025
ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation
J. Wu
Xuanchi Ren
Tianchang Shen
Tianshi Cao
Kai He
...
Jose M. Alvarez
Jun Gao
Sanja Fidler
Zian Wang
Huan Ling
DiffM
VGen
176
3
0
05 Oct 2025
Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL
Ruitao Wu
Yifan Zhao
Guangyao Chen
Jia Li
195
0
0
04 Oct 2025
SALSA-V: Shortcut-Augmented Long-form Synchronized Audio from Videos
Amir Dellali
Luca A. Lanzendörfer
Florian Grötschla
Roger Wattenhofer
VGen
92
0
0
03 Oct 2025
PocketSR: The Super-Resolution Expert in Your Pocket Mobiles
Haoze Sun
Linfeng Jiang
Fan Li
Renjing Pei
Zhixin Wang
...
Huajun Chen
Jin Han
Fenglong Song
Yujiu Yang
Wenbo Li
DiffM
SupR
OffRL
256
1
0
03 Oct 2025
Consolidating Reinforcement Learning for Multimodal Discrete Diffusion Models
Tianren Ma
Mu Zhang
Yibing Wang
Qixiang Ye
61
1
0
03 Oct 2025
Learning Robust Diffusion Models from Imprecise Supervision
Dong-Dong Wu
Jiacheng Cui
Wei Wang
Zhiqiang She
Masashi Sugiyama
DiffM
308
0
0
03 Oct 2025
Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner
Cai Zhou
Chenxiao Yang
Yi Hu
Chenyu Wang
Chubin Zhang
Muhan Zhang
Lester Mackey
Tommi Jaakkola
Stephen Bates
Dinghuai Zhang
129
4
0
03 Oct 2025