2403.03206
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
Papers citing
"Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"
50 / 1,247 papers shown
I-Max: Maximize the Resolution Potential of Pre-trained Rectified Flow Transformers with Projected Flow
Ruoyi Du
Dongyang Liu
Le Zhuo
Qin Qi
Hongsheng Li
Zhanyu Ma
Peng Gao
295
10
0
10 Oct 2024
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai
Tian-Chun Ye
Wei Chow
Enxin Song
Qing-Guo Chen
Hefei Ling
Zhen Dong
Lei Zhu
521
42
0
10 Oct 2024
MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion
International Conference on Learning Representations (ICLR), 2024
Onkar Susladkar
Jishu Sen Gupta
Chirag Sehgal
Sparsh Mittal
Rekha Singhal
DiffM
VGen
354
1
0
10 Oct 2024
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
International Conference on Learning Representations (ICLR), 2024
Xinchen Zhang
Ling Yang
Ge Li
Yaqi Cai
Jiake Xie
Yong Tang
Yujiu Yang
Mengdi Wang
Bin Cui
EGVM
CoGe
332
19
0
09 Oct 2024
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
International Conference on Learning Representations (ICLR), 2024
Sihyun Yu
Sangkyung Kwak
Huiwon Jang
Jongheon Jeong
Jonathan Huang
Jinwoo Shin
Saining Xie
OCL
712
292
0
09 Oct 2024
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yushen Chen
Zhikang Niu
Ziyang Ma
Keqi Deng
Chunhui Wang
Jian Zhao
Kai Yu
Xie Chen
613
269
0
09 Oct 2024
Pyramidal Flow Matching for Efficient Video Generative Modeling
International Conference on Learning Representations (ICLR), 2024
Yang Jin
Zhicheng Sun
Ningyuan Li
Kun Xu
K. Xu
...
Nan Zhuang
Quzhe Huang
Yang Song
Yadong Mu
Zhouchen Lin
VGen
517
202
0
08 Oct 2024
Active Fine-Tuning of Multi-Task Policies
Marco Bagatella
Jonas Hübotter
Georg Martius
Andreas Krause
543
0
0
07 Oct 2024
Image Watermarks are Removable Using Controllable Regeneration from Clean Noise
International Conference on Learning Representations (ICLR), 2024
Yepeng Liu
Yiren Song
Hai Ci
Yu Zhang
Haofan Wang
Mike Zheng Shou
Yuheng Bu
WIGM
330
27
0
07 Oct 2024
A Reflection on the Impact of Misspecifying Unidentifiable Causal Inference Models in Surrogate Endpoint Evaluation
Gokce Deliorman
Florian Stijven
Wim Van der Elst
Maria del Carmen Pardo
Ariel Alonso
CML
236
5
0
06 Oct 2024
Is What You Ask For What You Get? Investigating Concept Associations in Text-to-Image Models
Salma Abdel Magid
Weiwei Pan
Simon Warchol
Grace Guo
Junsik Kim
Mahia Rahman
Hanspeter Pfister
470
0
0
06 Oct 2024
Elucidating the Design Choice of Probability Paths in Flow Matching for Forecasting
Soon Hoe Lim
Yijin Wang
Annan Yu
Emma Hart
Michael W. Mahoney
Xiaoye S. Li
N. Benjamin Erichson
AI4TS
495
8
0
04 Oct 2024
Stochastic Sampling from Deterministic Flow Models
Saurabh Singh
Ian S. Fischer
251
5
0
03 Oct 2024
Channel-aware Contrastive Conditional Diffusion for Multivariate Probabilistic Time Series Forecasting
Siyang Li
Yize Chen
Hui Xiong
DiffM
AI4TS
246
1
0
03 Oct 2024
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models
International Conference on Learning Representations (ICLR), 2024
Seyedmorteza Sadat
Otmar Hilliges
Romann M. Weber
DiffM
402
48
0
03 Oct 2024
Local Flow Matching Generative Models
Chen Xu
Xiuyuan Cheng
Yao Xie
371
4
0
03 Oct 2024
Selective Attention Improves Transformer
International Conference on Learning Representations (ICLR), 2024
Yaniv Leviathan
Matan Kalman
Yossi Matias
349
20
0
03 Oct 2024
ControlAR: Controllable Image Generation with Autoregressive Models
International Conference on Learning Representations (ICLR), 2024
Zongming Li
Tianheng Cheng
Shoufa Chen
Peize Sun
Haocheng Shen
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
DiffM
675
36
0
03 Oct 2024
Denoising with a Joint-Embedding Predictive Architecture
International Conference on Learning Representations (ICLR), 2024
Dengsheng Chen
Jie Hu
Xiaoming Wei
Enhua Wu
DiffM
481
5
0
02 Oct 2024
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
International Conference on Learning Representations (ICLR), 2024
Yao Teng
Han Shi
Xian Liu
Xuefei Ning
Guohao Dai
Yu Wang
Zhenguo Li
Xihui Liu
380
42
0
02 Oct 2024
KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models
Pouyan Navard
Amin Karimi Monsefi
Mengxi Zhou
Wei-Lun Chao
Alper Yilmaz
R. Ramnath
DiffM
356
4
0
02 Oct 2024
Multimodal Pragmatic Jailbreak on Text-to-image Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Tong Liu
Zhixin Lai
Jiawen Wang
Gengyuan Zhang
Shuo Chen
Juil Sock
Vera Demberg
Volker Tresp
Jindong Gu
313
10
0
27 Sep 2024
FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner
Neural Information Processing Systems (NeurIPS), 2024
Wenliang Zhao
Minglei Shi
Xumin Yu
Jie Zhou
Jiwen Lu
186
4
0
26 Sep 2024
JoyType: A Robust Design for Multilingual Visual Text Creation
Chao Li
Chen Jiang
Xiaolong Liu
Jun Zhao
Guoxin Wang
DiffM
350
7
0
26 Sep 2024
FlowMAC: Conditional Flow Matching for Audio Coding at Low Bit Rates
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
N. Pia
Martin Strauss
M. Multrus
B. Edler
276
3
0
26 Sep 2024
Multi-modal Generative AI: Multi-modal LLMs, Diffusions, and the Unification
X. Wang
Yuwei Zhou
Bin Huang
Hong Chen
Wenwu Zhu
DiffM
491
9
0
23 Sep 2024
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
International Conference on Learning Representations (ICLR), 2024
Weifeng Lin
Xinyu Wei
Renrui Zhang
Le Zhuo
Shitian Zhao
...
Junlin Xie
Yu Qiao
Peng Gao
Hongsheng Li
MLLM
DiffM
566
25
0
23 Sep 2024
Imagine yourself: Tuning-Free Personalized Image Generation
Zecheng He
Bo Sun
Felix Juefei-Xu
Haoyu Ma
Ankit Ramchandani
...
Ning Zhang
Peizhao Zhang
Roshan Sumbaly
Peter Vajda
Animesh Sinha
DiffM
218
25
0
20 Sep 2024
AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Yun Wang
Hangting Chen
Dongchao Yang
Zhiyong Wu
Xixin Wu
DiffM
378
8
0
19 Sep 2024
Understanding Implosion in Text-to-Image Generative Models
Conference on Computer and Communications Security (CCS), 2024
Wenxin Ding
Cathy Y. Li
Shawn Shan
Ben Y. Zhao
Haitao Zheng
343
6
0
18 Sep 2024
Finding the Subjective Truth: Collecting 2 Million Votes for Comprehensive Gen-AI Model Evaluation
Dimitrios Christodoulou
Mads Kuhlmann-Jørgensen
EGVM
181
7
0
18 Sep 2024
Automatic Scene Generation: State-of-the-Art Techniques, Models, Datasets, Challenges, and Future Prospects
IEEE Access, 2024
Awal Ahmed Fime
Saifuddin Mahmud
Arpita Das
Md. Sunzidul Islam
Hong-Hoon Kim
VGen
3DV
273
2
0
14 Sep 2024
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
Ye Bai
Haonan Chen
Jitong Chen
Zhuo Chen
Yi Deng
...
Hang Zhao
Ziyi Zhao
Dejian Zhong
Shicen Zhou
Pei Zou
DiffM
314
18
0
13 Sep 2024
Token Turing Machines are Efficient Vision Models
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Purvish Jajal
Nick Eliopoulos
Benjamin Shiue-Hal Chou
George K. Thiruvathukal
James C. Davis
Yung-Hsiang Lu
374
2
0
11 Sep 2024
Alignment of Diffusion Models: Fundamentals, Challenges, and Future
Buhua Liu
Shitong Shao
Bao Li
Lichen Bai
Zhiqiang Xu
Haoyi Xiong
James Kwok
Sumi Helal
Bo Han
462
22
0
11 Sep 2024
Differentially Private Kernel Density Estimation
Erzhi Liu
Jerry Yao-Chieh Hu
Alex Reneau
Zhao Song
Han Liu
458
3
0
03 Sep 2024
Affordance-based Robot Manipulation with Flow Matching
Fan Zhang
Michael Gienger
692
44
0
02 Sep 2024
Law of Vision Representation in MLLMs
Shijia Yang
Bohan Zhai
Quanzeng You
Jianbo Yuan
Hongxia Yang
Chenfeng Xu
577
15
0
29 Aug 2024
Hand1000: Generating Realistic Hands from Text with Only 1,000 Images
AAAI Conference on Artificial Intelligence (AAAI), 2024
Haozhuo Zhang
B. Zhu
Yu Cao
Y. Hao
VLM
368
7
0
28 Aug 2024
MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement
Xu He
Xiaoyu Li
Di Kang
Jiangnan Ye
Chaopeng Zhang
Liyang Chen
Xiangjun Gao
Han Zhang
Zhiyong Wu
Haolin Zhuang
DiffM
353
16
0
26 Aug 2024
Explainable Concept Generation through Vision-Language Preference Learning for Understanding Neural Networks' Internal Representations
Aditya Taparia
Som Sagar
Ransalu Senanayake
FAtt
419
3
0
24 Aug 2024
Taming Text-to-Image Synthesis for Novices: User-centric Prompt Generation via Multi-turn Guidance
Yilun Liu
Minggui He
Feiyu Yao
Yuhe Ji
Shimin Tao
...
Jian Gao
Li Zhang
Hao Yang
Boxing Chen
Osamu Yoshie
301
7
0
23 Aug 2024
MUSES: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration
AAAI Conference on Artificial Intelligence (AAAI), 2024
Yanbo Ding
Shaobin Zhuang
Kunchang Li
Zhengrong Yue
Yu Qiao
Yali Wang
VGen
315
5
0
20 Aug 2024
An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation
Peiming Guo
Sinuo Liu
Yanzhao Zhang
Dingkun Long
Pengjun Xie
Meishan Zhang
Hao Fei
DiffM
338
1
0
16 Aug 2024
Hybrid SD: Edge-Cloud Collaborative Inference for Stable Diffusion Models
Chenqian Yan
Songwei Liu
Hongjian Liu
Xurui Peng
Xiaojian Wang
Fangming Chen
Xing Mei
Lean Fu
339
12
0
13 Aug 2024
Music2Latent: Consistency Autoencoders for Latent Audio Compression
International Society for Music Information Retrieval Conference (ISMIR), 2024
Marco Pasini
Stefan Lattner
George Fazekas
229
32
0
12 Aug 2024
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
International Conference on Learning Representations (ICLR), 2024
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
...
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
DiffM
VGen
860
1,293
0
12 Aug 2024
IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts
Ciara Rowles
Shimon Vainer
Dante De Nigris
Slava Elizarov
Konstantin Kutsy
Simon Donné
DiffM
283
14
0
06 Aug 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Ping Luo
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
414
111
0
05 Aug 2024
Bailing-TTS: Chinese Dialectal Speech Synthesis Towards Human-like Spontaneous Representation
Xinhan Di
Jiahao Lu
Yunming Liang
Junjie Zheng
Yihua Wang
Chaofan Ding
ALM
275
3
0
01 Aug 2024