Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
    DiffM
ArXiv (abs) | PDF | HTML | HuggingFace (68 upvotes)

Papers citing "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"

50 / 1,247 papers shown
RealisDance-DiT: Simple yet Strong Baseline towards Controllable Character Animation in the Wild
Jingkai Zhou
Yifan Wu
Shikai Li
Min Wei
Chao Fan
Weihua Chen
Wei Jiang
Fan Wang
VGen
263
12
0
21 Apr 2025
Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis
Jingjing Ren
Wenbo Li
Zhongdao Wang
Haoze Sun
Bangzhen Liu
...
Aoxue Li
Shifeng Zhang
Bin Shao
Yong Guo
Lei Zhu
VGen
313
7
0
20 Apr 2025
Towards Explainable Fake Image Detection with Multi-Modal Large Language Models
Yikun Ji
Y. Hong
Jiahui Zhan
H. Chen
Jun Lan
Huijia Zhu
Weiqiang Wang
Guang Dai
Jianfu Zhang
MLLM
LRM
516
4
0
19 Apr 2025
PRISM: A Unified Framework for Photorealistic Reconstruction and Intrinsic Scene Modeling
Alara Dirik
Tuanfeng Y. Wang
Duygu Ceylan
Stefanos Zafeiriou
Anna Frühstück
DiffM
248
5
0
19 Apr 2025
The Path to Reconciling Quality and Safety in Text-to-Image Generation: Dataset, Method, and Evaluation
Shouwei Ruan
Zhenyu Wu
Yao Huang
Ruochen Zhang
Yitong Sun
Caixin Kang
Shiji Zhao
Xingxing Wei
EGVM
406
1
0
19 Apr 2025
Learning Joint ID-Textual Representation for ID-Preserving Image Synthesis
Zichuan Liu
Liming Jiang
Qing Yan
Yumin Jia
Hao Kang
Xin Lu
DiffM
388
1
0
19 Apr 2025
U-Shape Mamba: State Space Model for faster diffusion
Alex Ergasti
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
Mamba
432
5
0
18 Apr 2025
Early Timestep Zero-Shot Candidate Selection for Instruction-Guided Image Editing
Joowon Kim
Ziseok Lee
Donghyeon Cho
Sanghyun Jo
Y. Jung
Kyungsu Kim
Eunho Yang
DiffM
291
1
0
18 Apr 2025
Fashion-RAG: Multimodal Fashion Image Editing via Retrieval-Augmented Generation
Fulvio Sanguigni
Davide Morelli
Marcella Cornia
Rita Cucchiara
DiffM
200
3
0
18 Apr 2025
Probing and Inducing Combinational Creativity in Vision-Language Models
Yongqian Peng
Yuxi Ma
Minghua Yi
Yuxuan Wang
Yizhou Wang
Chuxu Zhang
Yixin Zhu
Zilong Zheng
MLLM
CoGe
464
3
0
17 Apr 2025
MGT: Extending Virtual Try-Off to Multi-Garment Scenarios
Riza Velioglu
Petra Bevandic
Robin Chan
Barbara Hammer
DiffM
271
0
0
17 Apr 2025
SkyReels-V2: Infinite-length Film Generative Model
Guibin Chen
D. Lin
Jiangping Yang
Chunze Lin
J. Zhu
...
Di Qiu
Debang Li
Zhengcong Fei
Yang Li
Yahui Zhou
DiffM
VGen
510
76
0
17 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
488
6
0
16 Apr 2025
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
International Conference on Learning Representations (ICLR), 2025
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
486
1
0
15 Apr 2025
ADT: Tuning Diffusion Models with Adversarial Supervision
Dazhong Shen
Guanglu Song
Yuanxing Zhang
Bingqi Ma
Lujundong Li
Shihong Deng
Zhuofan Zong
Y. Liu
DiffM
347
3
0
15 Apr 2025
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Chunyang Zhang
Zhenhong Sun
Zhicheng Zhang
Junyan Wang
Yu Zhang
Dong Gong
H. Mo
Daoyi Dong
416
1
0
14 Apr 2025
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing
Taihang Hu
Linxuan Li
Kai Wang
Yaxing Wang
Jian Yang
Ming-Ming Cheng
DiffM
VGen
298
4
0
14 Apr 2025
H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
Yushu Wu
Yanyu Li
Ivan Skorokhodov
Vidit Goel
Willi Menapace
Sharath Girish
Aliaksandr Siarohin
Yanzhi Wang
Sergey Tulyakov
DiffM
VGen
377
5
0
14 Apr 2025
REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers
Xingjian Leng
Jaskirat Singh
Yunzhong Hou
Zhenchang Xing
Saining Xie
Liang Zheng
415
68
0
14 Apr 2025
Efficient Generative Model Training via Embedded Representation Warmup
Deyuan Liu
Peng Sun
Xufeng Li
Tao Lin
479
0
0
14 Apr 2025
On Equivariance and Fast Sampling in Video Diffusion Models Trained with Warped Noise
Chao Liu
Arash Vahdat
DiffM
VGen
388
5
0
14 Apr 2025
BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning
Shengao Wang
Arjun Chandra
Aoming Liu
Venkatesh Saligrama
Boqing Gong
MLLM
VLM
311
1
0
13 Apr 2025
Flux Already Knows -- Activating Subject-Driven Image Generation without Training
Hao Kang
Stathi Fotiadis
Liming Jiang
Qing Yan
Yumin Jia
Zichuan Liu
Min Jin Chong
Xin Lu
308
9
0
12 Apr 2025
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
Team Seawead
Ceyuan Yang
Zhijie Lin
Yang Zhao
Shanchuan Lin
...
Zuquan Song
Zhenheng Yang
Jiashi Feng
Jianchao Yang
Lu Jiang
DiffM
579
63
0
11 Apr 2025
DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows
Computer Vision and Pattern Recognition (CVPR), 2025
Mashrur M. Morshed
Vishnu Boddeti
275
6
0
10 Apr 2025
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning
Zhong-Yu Li
Ruoyi Du
Juncheng Yan
Le Zhuo
Zhen Li
Peng Gao
Zhanyu Ma
Ming-Ming Cheng
VLM
364
20
0
10 Apr 2025
PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering
Computer Vision and Pattern Recognition (CVPR), 2025
Y. Gao
Zihang Lin
Chuanbin Liu
Min Zhou
Bo Xiao
Bo Zheng
Hongtao Xie
DiffM
347
21
0
09 Apr 2025
DyDiT++: Diffusion Transformers with Timestep and Spatial Dynamics for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Xiaojiang Peng
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
587
3
0
09 Apr 2025
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis
Tri Ton
Ji Woo Hong
Chang D. Yoo
VGen
278
3
0
08 Apr 2025
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
Jiazi Bu
Pengyang Ling
Yujie Zhou
Pan Zhang
Tong Wu
Xiaoyi Dong
Yuhang Zang
Yuhang Cao
Dahua Lin
Jiaqi Wang
298
6
0
08 Apr 2025
Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision
Yuandong Pu
Le Zhuo
Kaiwen Zhu
Liangbin Xie
Wenlong Zhang
Xiangyu Chen
Peng Gao
Botian Shi
Chao Dong
Yihao Liu
MLLM
320
10
0
07 Apr 2025
Disentangling Instruction Influence in Diffusion Transformers for Parallel Multi-Instruction-Guided Image Editing
Hui Liu
Bin Zou
Suiyun Zhang
Kecheng Chen
Rui Liu
Haoliang Li
DiffM
242
0
0
07 Apr 2025
CREA: A Collaborative Multi-Agent Framework for Creative Image Editing and Generation
Kavana Venkatesh
Connor Dunlop
Pinar Yanardag
DiffM
351
3
0
07 Apr 2025
Gaussian Mixture Flow Matching Models
Hansheng Chen
Kai Zhang
Hao Tan
Zexiang Xu
Fujun Luan
Leonidas Guibas
Gordon Wetzstein
Sai Bi
DiffM
460
8
0
07 Apr 2025
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Mengchao Wang
Qiang Wang
Fan Jiang
Yaqi Fan
Yunpeng Zhang
Yonggang Qi
Kun Zhao
Mu Xu
DiffM
VGen
214
43
0
07 Apr 2025
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov
Di Chang
Minh Tran
Hongkun Gong
Ashutosh Chaubey
Mohammad Soleymani
DiffM
VGen
316
3
0
05 Apr 2025
SDEIT: Semantic-Driven Electrical Impedance Tomography
Dong Liu
Yuanchao Wu
Bowen Tong
Jiansong Deng
DiffM
267
0
0
05 Apr 2025
Detection Limits and Statistical Separability of Tree Ring Watermarks in Rectified Flow-based Text-to-Image Generation Models
Ved Umrajkar
Aakash Kumar Singh
232
0
0
04 Apr 2025
Conditioning Diffusions Using Malliavin Calculus
Jakiw Pidstrigach
Elizabeth Baker
Carles Domingo-Enrich
George Deligiannidis
Nikolas Nüsken
DiffM
350
2
0
04 Apr 2025
Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation
Jiwoo Chung
Sangeek Hyun
Hyunjun Kim
Eunseo Koh
MinKyu Lee
Jae-Pil Heo
321
9
0
03 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Lichao Sun
MLLM
EGVM
524
54
0
03 Apr 2025
Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2025
Zixuan Wang
Duo Peng
Feng Chen
Yue Yang
Yinjie Lei
DiffM
373
5
0
02 Apr 2025
FlowR: Flowing from Sparse to Dense 3D Reconstructions
Tobias Fischer
Samuel Rota Buló
Yung-Hsu Yang
Nikhil Varma Keetha
Lorenzo Porzi
Norman Muller
Katja Schwarz
Jonathon Luiten
Marc Pollefeys
Peter Kontschieder
3DGS
373
7
0
02 Apr 2025
Watermarking for AI Content Detection: A Review on Text, Visual, and Audio Modalities
Lele Cao
256
2
0
02 Apr 2025
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance
Yuxuan Luo
Zhengkun Rong
Lizhen Wang
Longhao Zhang
Tianshu Hu
Yongming Zhu
VGen
1.0K
22
0
02 Apr 2025
Distilling Multi-view Diffusion Models into 3D Generators
Hao Qin
Luyuan Chen
Ming Kong
Mengxu Lu
Qiang Zhu
3DGS
541
1
0
01 Apr 2025
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
Nikai Du
Zhennan Chen
Zheyu Chen
Shan Gao
Xi Chen
Zhengkai Jiang
Jian Yang
Ying Tai
DiffM
655
16
0
30 Mar 2025
DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution
Zheng-Peng Duan
Jiawei Zhang
Xin Jin
Zhe Zhang
Zheng Xiong
Dongqing Zou
Jimmy S. Ren
Chun-Le Guo
Chongyi Li
413
15
0
30 Mar 2025
A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
Leander Girrbach
Stephan Alaniz
Genevieve Smith
Zeynep Akata
396
7
0
30 Mar 2025
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning
Hang Guo
Yawei Li
Taolin Zhang
Jiadong Wang
Tao Dai
Shu-Tao Xia
Luca Benini
439
16
0
30 Mar 2025