ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.03206
  4. Cited By
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
    DiffM
ArXivPDFHTML

Papers citing "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"

50 / 804 papers shown
Title
DiffDoctor: Diagnosing Image Diffusion Models Before Treating
DiffDoctor: Diagnosing Image Diffusion Models Before Treating
Yiyang Wang
Xi Chen
Xiaogang Xu
S. Ji
Y. Liu
Yujun Shen
Hengshuang Zhao
DiffM
49
0
0
21 Jan 2025
Block Flow: Learning Straight Flow on Data Blocks
Block Flow: Learning Straight Flow on Data Blocks
Zibin Wang
Zhiyuan Ouyang
Xiangyun Zhang
31
0
0
20 Jan 2025
Lossy Compression with Pretrained Diffusion Models
Jeremy Vonderfecht
Feng Liu
DiffM
92
1
0
20 Jan 2025
Direct Unlearning Optimization for Robust and Safe Text-to-Image Models
Direct Unlearning Optimization for Robust and Safe Text-to-Image Models
Yong-Hyun Park
Sangdoo Yun
Jin-Hwa Kim
Junho Kim
Geonhui Jang
Yonghyun Jeong
Junghyo Jo
Gayoung Lee
73
13
0
17 Jan 2025
How Do Generative Models Draw a Software Engineer? A Case Study on Stable Diffusion Bias
How Do Generative Models Draw a Software Engineer? A Case Study on Stable Diffusion Bias
Tosin Fadahunsi
Giordano dÁloisio
A. Marco
Federica Sarro
DiffM
61
0
0
15 Jan 2025
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Dongwon Kim
Ju He
Qihang Yu
Chenglin Yang
Xiaohui Shen
Suha Kwak
Liang-Chieh Chen
VLM
46
6
0
13 Jan 2025
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models
Michael Toker
Ido Galil
Hadas Orgad
Rinon Gal
Yoad Tewel
Gal Chechik
Yonatan Belinkov
DiffM
54
2
0
12 Jan 2025
Has an AI model been trained on your images?
Has an AI model been trained on your images?
Matyáš Boháček
Hany Farid
33
0
0
11 Jan 2025
VideoAuteur: Towards Long Narrative Video Generation
VideoAuteur: Towards Long Narrative Video Generation
Junfei Xiao
Feng Cheng
Lu Qi
Liangke Gui
Jiepeng Cen
Zhibei Ma
Alan L. Yuille
Lu Jiang
VGen
56
2
0
10 Jan 2025
Multi-subject Open-set Personalization in Video Generation
Multi-subject Open-set Personalization in Video Generation
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Yuwei Fang
Kwot Sin Lee
Ivan Skorokhodov
Kfir Aberman
Jun-Yan Zhu
Ming Yang
Sergey Tulyakov
DiffM
VGen
69
7
0
10 Jan 2025
Generative AI for Cel-Animation: A Survey
Generative AI for Cel-Animation: A Survey
Yunlong Tang
Junjia Guo
Pinxin Liu
Zhiyuan Wang
Hang Hua
...
Jing Bi
Mingqian Feng
X. Li
Zeliang Zhang
Chenliang Xu
VGen
88
7
0
08 Jan 2025
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
Dongmin Park
Sebin Kim
Taehong Moon
Minkyu Kim
Kangwook Lee
Jaewoong Cho
DiffM
CoGe
62
2
0
08 Jan 2025
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning
Yuzhou Huang
Ziyang Yuan
Quande Liu
Qiulin Wang
Xintao Wang
Ruimao Zhang
Pengfei Wan
Di Zhang
Kun Gai
VGen
DiffM
35
10
0
08 Jan 2025
MC-VTON: Minimal Control Virtual Try-On Diffusion Transformer
MC-VTON: Minimal Control Virtual Try-On Diffusion Transformer
Junsheng Luan
Guangyuan Li
Lei Zhao
Wei Xing
DiffM
35
1
0
07 Jan 2025
ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling
ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling
Chaojie Mao
J. Zhang
Yulin Pan
Zeyinzi Jiang
Zhen Han
Yu Liu
Jingren Zhou
DiffM
46
15
0
05 Jan 2025
Optimizing Noise Schedules of Generative Models in High Dimensionss
Santiago Aranguri
Giulio Biroli
Marc Mézard
Eric Vanden-Eijnden
DiffM
28
1
0
03 Jan 2025
SOEDiff: Efficient Distillation for Small Object Editing
SOEDiff: Efficient Distillation for Small Object Editing
Yiming Wu
Qihe Pan
Zhen Zhao
Zicheng Wang
Sifan Long
Ronghua Liang
DiffM
60
0
0
03 Jan 2025
Sound-VECaps: Improving Audio Generation with Visual Enhanced Captions
Sound-VECaps: Improving Audio Generation with Visual Enhanced Captions
Yi Yuan
Dongya Jia
Xiaobin Zhuang
Yuanzhe Chen
Zhengxi Liu
...
Y. Wang
Xubo Liu
Xiyuan Kang
Mark D. Plumbley
Wenwu Wang
VLM
53
4
0
03 Jan 2025
DiC: Rethinking Conv3x3 Designs in Diffusion Models
Yuchuan Tian
Jing Han
Chengcheng Wang
Yuchen Liang
Chao Xu
Hanting Chen
DiffM
21
1
0
03 Jan 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
122
2
0
03 Jan 2025
Graph Generative Pre-trained Transformer
Xiaohui Chen
Yinkai Wang
Jiaxing He
Yuanqi Du
S. Hassoun
Xiaolin Xu
Li Liu
39
1
0
03 Jan 2025
Local Flow Matching Generative Models
Local Flow Matching Generative Models
Chen Xu
Xiuyuan Cheng
Yao Xie
39
0
0
03 Jan 2025
RORem: Training a Robust Object Remover with Human-in-the-Loop
RORem: Training a Robust Object Remover with Human-in-the-Loop
Ruibin Li
Tao Yang
Song Guo
L. Zhang
40
3
0
01 Jan 2025
Taming Feed-forward Reconstruction Models as Latent Encoders for 3D Generative Models
Suttisak Wizadwongsa
Jinfan Zhou
Edward Li
Jeong Joon Park
3DV
55
0
0
31 Dec 2024
M$^3$oralBench: A MultiModal Moral Benchmark for LVLMs
M3^33oralBench: A MultiModal Moral Benchmark for LVLMs
Bei Yan
Jie M. Zhang
Zhiyuan Chen
Shiguang Shan
Xilin Chen
ELM
41
1
0
31 Dec 2024
ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation
ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation
Lujia Zhong
Shuo Huang
Yonggang Shi
41
0
0
31 Dec 2024
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization
Chia-Yu Hung
Navonil Majumder
Zhifeng Kong
Ambuj Mehrish
Rafael Valle
Bryan Catanzaro
Soujanya Poria
Bryan Catanzaro
Soujanya Poria
52
5
0
30 Dec 2024
CharGen: High Accurate Character-Level Visual Text Generation Model with
  MultiModal Encoder
CharGen: High Accurate Character-Level Visual Text Generation Model with MultiModal Encoder
Lichen Ma
Tiezhu Yue
Pei Fu
Yujie Zhong
Kai Zhou
Xiaoming Wei
Jie Hu
DiffM
67
2
0
23 Dec 2024
DreamOmni: Unified Image Generation and Editing
DreamOmni: Unified Image Generation and Editing
Bin Xia
Yuechen Zhang
Jingyao Li
Chengyao Wang
Yitong Wang
Xinglong Wu
Bei Yu
Jiaya Jia
SyDa
MLLM
84
3
0
22 Dec 2024
Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation
Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation
Quan Dao
Hao Phung
T. Dao
Dimitris Metaxas
Anh Tran
90
1
0
22 Dec 2024
Two-in-One: Unified Multi-Person Interactive Motion Generation by Latent
  Diffusion Transformer
Two-in-One: Unified Multi-Person Interactive Motion Generation by Latent Diffusion Transformer
B. Li
Xihua Wang
Ruihua Song
Wenbing Huang
DiffM
VGen
75
1
0
21 Dec 2024
When Worse is Better: Navigating the compression-generation tradeoff in
  visual tokenization
When Worse is Better: Navigating the compression-generation tradeoff in visual tokenization
Vivek Ramanujan
Kushal Tirumala
Armen Aghajanyan
Luke Zettlemoyer
Ali Farhadi
DiffM
74
2
0
20 Dec 2024
MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Ho Kei Cheng
Masato Ishii
Akio Hayakawa
Takashi Shibuya
A. Schwing
Yuki Mitsufuji
VGen
126
12
0
19 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
120
8
0
19 Dec 2024
E-CAR: Efficient Continuous Autoregressive Image Generation via
  Multistage Modeling
E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling
Zhihang Yuan
Yuzhang Shang
H. Zhang
Tongcheng Fang
Rui Xie
Bingxin Xu
Yan Yan
Shengen Yan
Guohao Dai
Yu Wang
DiffM
94
1
0
18 Dec 2024
Towards Generalist Robot Policies: What Matters in Building
  Vision-Language-Action Models
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models
Xinghang Li
Peiyan Li
Minghuan Liu
Dong Wang
Jirong Liu
Bingyi Kang
Xiao Ma
Tao Kong
Hanbo Zhang
Huaping Liu
LM&Ro
91
16
0
18 Dec 2024
Real-time One-Step Diffusion-based Expressive Portrait Videos Generation
Real-time One-Step Diffusion-based Expressive Portrait Videos Generation
Hanzhong Guo
Hongwei Yi
Daquan Zhou
Alexander William Bergman
Michael Lingelbach
Yizhou Yu
DiffM
78
1
0
18 Dec 2024
Efficient Scaling of Diffusion Transformers for Text-to-Image Generation
Efficient Scaling of Diffusion Transformers for Text-to-Image Generation
Hao Li
Shamit Lal
Zhiheng Li
Yusheng Xie
Ying Wang
...
R. Manmatha
Z. Tu
Stefano Ermon
Stefano Soatto
A. Swaminathan
83
0
0
16 Dec 2024
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View
  Diffusion Models
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models
Felix Taubner
Ruihang Zhang
Mathieu Tuli
David B. Lindell
68
2
0
16 Dec 2024
IDEA-Bench: How Far are Generative Models from Professional Designing?
IDEA-Bench: How Far are Generative Models from Professional Designing?
C. Liang
Lianghua Huang
Jingwu Fang
Huanzhang Dou
Wei Wang
Zhi-Fan Wu
Yupeng Shi
Junge Zhang
Xin Zhao
Yu Liu
3DV
77
1
0
16 Dec 2024
IGR: Improving Diffusion Model for Garment Restoration from Person Image
IGR: Improving Diffusion Model for Garment Restoration from Person Image
Le Shen
Rong Huang
Zhijie Wang
DiffM
96
2
0
16 Dec 2024
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Rick Akkerman
Haiwen Feng
M. Black
Dimitrios Tzionas
Victoria Fernandez-Abrevaya
VGen
AI4CE
100
3
0
16 Dec 2024
ColorFlow: Retrieval-Augmented Image Sequence Colorization
ColorFlow: Retrieval-Augmented Image Sequence Colorization
Junhao Zhuang
Xuan Ju
Z. Zhang
Yong-Jin Liu
Shiyi Zhang
Chun Yuan
Ying Shan
DiffM
95
1
0
16 Dec 2024
AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration
AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration
Wenhao Sun
Rong-Cheng Tu
Jingyi Liao
Zhao Jin
Dacheng Tao
VGen
97
1
0
16 Dec 2024
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Yuning Han
Bingyin Zhao
Rui Chu
Feng Luo
Biplab Sikdar
Yingjie Lao
DiffM
AAML
67
1
0
16 Dec 2024
FlowDock: Geometric Flow Matching for Generative Protein-Ligand Docking and Affinity Prediction
FlowDock: Geometric Flow Matching for Generative Protein-Ligand Docking and Affinity Prediction
Alex Morehead
Jianlin Cheng
OOD
112
1
0
14 Dec 2024
Video Diffusion Transformers are In-Context Learners
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
151
2
0
14 Dec 2024
SnapGen-V: Generating a Five-Second Video within Five Seconds on a
  Mobile Device
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
Yushu Wu
Zhixing Zhang
Yanyu Li
Yanwu Xu
Anil Kag
...
Ju Hu
Dimitris N. Metaxas
Yanzhi Wang
Sergey Tulyakov
Jian Ren
DiffM
VGen
90
3
0
13 Dec 2024
OFTSR: One-Step Flow for Image Super-Resolution with Tunable
  Fidelity-Realism Trade-offs
OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs
Yuanzhi Zhu
R. Wang
Shilin Lu
Junnan Li
Hanshu Yan
K. Zhang
SupR
76
3
0
12 Dec 2024
Learned Compression for Compressed Learning
Learned Compression for Compressed Learning
Dan G. Jacobellis
N. Yadwadkar
74
0
0
12 Dec 2024
Previous
123...8910...151617
Next