ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.03206
  4. Cited By
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
    DiffM
ArXivPDFHTML

Papers citing "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"

50 / 797 papers shown
Title
FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing
FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing
Tianyi Wei
Yifan Zhou
Dongdong Chen
Xingang Pan
72
0
0
20 Mar 2025
VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling
VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling
Hyojun Go
Byeongjun Park
Hyelin Nam
Byung-Hoon Kim
Hyungjin Chung
Changick Kim
3DGS
VGen
92
1
0
20 Mar 2025
Repurposing 2D Diffusion Models with Gaussian Atlas for 3D Generation
Repurposing 2D Diffusion Models with Gaussian Atlas for 3D Generation
Tiange Xiang
Kai Li
Chengjiang Long
Christian Hane
Peihong Guo
Scott Delp
Ehsan Adeli
L. Fei-Fei
DiffM
3DGS
47
0
0
20 Mar 2025
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang
Haoran Chen
Haoyu Zhao
Guansong Lu
Yanwei Fu
Hang Xu
Zuxuan Wu
VGen
DiffM
62
0
0
20 Mar 2025
Scale-wise Distillation of Diffusion Models
Scale-wise Distillation of Diffusion Models
Nikita Starodubcev
Denis Kuznedelev
Artem Babenko
Dmitry Baranchuk
DiffM
48
0
0
20 Mar 2025
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
Liming Jiang
Qing Yan
Yumin Jia
Zichuan Liu
Hao Kang
Xin Lu
41
1
0
20 Mar 2025
FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers
FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers
Ruichen Chen
Keith G. Mills
Di Niu
MQ
50
0
0
19 Mar 2025
POSTA: A Go-to Framework for Customized Artistic Poster Generation
POSTA: A Go-to Framework for Customized Artistic Poster Generation
Haoyu Chen
Xiaojie Xu
Wenbo Li
Jingjing Ren
Tian Ye
Songhua Liu
Ying Chen
Lei Zhu
Xinchao Wang
DiffM
57
1
0
19 Mar 2025
Cube: A Roblox View of 3D Intelligence
Cube: A Roblox View of 3D Intelligence
Foundation AI Team Roblox
Kiran Bhat
Nishchaie Khanna
Karun Channa
Tinghui Zhou
...
Kyle Price
Steve Han
Yiqing Wang
A. Singh
David Baszucki
55
0
0
19 Mar 2025
Temporal Regularization Makes Your Video Generator Stronger
Temporal Regularization Makes Your Video Generator Stronger
Harold Haodong Chen
Haojian Huang
Xianfeng Wu
Yexin Liu
Yajing Bai
Wen-Jie Shu
Harry Yang
Ser-Nam Lim
VGen
79
2
0
19 Mar 2025
TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning in Text-to-Image Models
TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning in Text-to-Image Models
Teng-Fang Hsiao
Bo-Kai Ruan
Yi-Lun Wu
Tzu-Ling Lin
Hong-Han Shuai
VLM
43
0
0
19 Mar 2025
1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities
1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities
Kevin Wang
Ishaan Javali
Michał Bortkiewicz
Tomasz Trzciñski
Benjamin Eysenbach
SSL
OffRL
67
0
0
19 Mar 2025
Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization
Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization
Feifei Li
Mi Zhang
Yiming Sun
Min Yang
DiffM
51
1
0
19 Mar 2025
Efficient Personalization of Quantized Diffusion Model without Backpropagation
Efficient Personalization of Quantized Diffusion Model without Backpropagation
H. Seo
Wongi Jeong
Kyungryeol Lee
Se Young Chun
DiffM
MQ
76
0
0
19 Mar 2025
Unlocking the Capabilities of Vision-Language Models for Generalizable and Explainable Deepfake Detection
Unlocking the Capabilities of Vision-Language Models for Generalizable and Explainable Deepfake Detection
Peipeng Yu
Jianwei Fei
Hui Gao
Xuan Feng
Zhihua Xia
Chip-Hong Chang
MLLM
VLM
70
1
0
19 Mar 2025
The Power of Context: How Multimodality Improves Image Super-Resolution
The Power of Context: How Multimodality Improves Image Super-Resolution
Kangfu Mei
Hossein Talebi
Mojtaba Ardakani
Vishal M. Patel
P. Milanfar
M. Delbracio
DiffM
77
1
0
18 Mar 2025
Deeply Supervised Flow-Based Generative Models
Deeply Supervised Flow-Based Generative Models
Inkyu Shin
Chenglin Yang
Liang-Chieh Chen
58
0
0
18 Mar 2025
ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing
ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing
Yulin Pan
Xiangteng He
Chaojie Mao
Zhen Han
Zeyinzi Jiang
J. Zhang
Yu Liu
EGVM
VLM
71
1
0
18 Mar 2025
ShapeShift: Towards Text-to-Shape Arrangement Synthesis with Content-Aware Geometric Constraints
ShapeShift: Towards Text-to-Shape Arrangement Synthesis with Content-Aware Geometric Constraints
Vihaan Misra
Peter Schaldenbrand
Jean Oh
DiffM
40
1
0
18 Mar 2025
SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model
SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model
Yucheng Mao
Boyang Wang
Nilesh Kulkarni
Jeong Joon Park
DiffM
58
0
0
18 Mar 2025
Adams Bashforth Moulton Solver for Inversion and Editing in Rectified Flow
Adams Bashforth Moulton Solver for Inversion and Editing in Rectified Flow
Yongjia Ma
Donglin Di
Xuan Liu
Xiaokai Chen
Lei Fan
Wei Chen
Tonghua Su
40
0
0
17 Mar 2025
Edit Transfer: Learning Image Editing via Vision In-Context Relations
Edit Transfer: Learning Image Editing via Vision In-Context Relations
Lan Chen
Qi Mao
Yuchao Gu
Mike Zheng Shou
48
1
0
17 Mar 2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
Dewei Zhou
Mingwei Li
Zongxin Yang
Yi Yang
87
0
0
17 Mar 2025
BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing
BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing
Yaowei Li
Lingen Li
Zhaoyang Zhang
Xiaoyu Li
Guangzhi Wang
Hongxiang Li
Xiaodong Cun
Ying Shan
Yuexian Zou
DiffM
67
1
0
17 Mar 2025
FiVE: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models
FiVE: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models
Minghan Li
C. Xie
Y. Wu
Lei Zhang
M. Wang
DiffM
VGen
50
0
0
17 Mar 2025
TextInVision: Text and Prompt Complexity Driven Visual Text Generation Benchmark
TextInVision: Text and Prompt Complexity Driven Visual Text Generation Benchmark
Forouzan Fallah
Maitreya Patel
Agneet Chatterjee
Vlad I. Morariu
Chitta Baral
Yezhou Yang
CoGe
59
0
0
17 Mar 2025
Rewards Are Enough for Fast Photo-Realistic Text-to-image Generation
Rewards Are Enough for Fast Photo-Realistic Text-to-image Generation
Yihong Luo
Tianyang Hu
Weijian Luo
Kenji Kawaguchi
Jing Tang
EGVM
73
0
0
17 Mar 2025
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
Tsu-jui Fu
Yusu Qian
Chen Chen
Wenze Hu
Zhe Gan
Y. Yang
85
1
0
16 Mar 2025
EditID: Training-Free Editable ID Customization for Text-to-Image Generation
EditID: Training-Free Editable ID Customization for Text-to-Image Generation
Guandong Li
Zhaobin Chu
DiffM
55
0
0
16 Mar 2025
DecompDreamer: Advancing Structured 3D Asset Generation with Multi-Object Decomposition and Gaussian Splatting
DecompDreamer: Advancing Structured 3D Asset Generation with Multi-Object Decomposition and Gaussian Splatting
Utkarsh Nath
Rajeev Goel
Rahul Khurana
Kyle Min
Mark Ollila
P. Turaga
Varun Jampani
Tejaswi Gowda
3DGS
42
0
0
15 Mar 2025
Tailor: An Integrated Text-Driven CG-Ready Human and Garment Generation System
Tailor: An Integrated Text-Driven CG-Ready Human and Garment Generation System
Zhiyao Sun
Yu-Hui Wen
M. Lin
Ho-Jui Fang
Sheng Ye
Tian Lv
Y. Liu
80
0
0
15 Mar 2025
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Arsh Koneru
Yusuke Kato
Kazuki Kozuka
Aditya Grover
VLM
56
1
0
15 Mar 2025
Pathology Image Compression with Pre-trained Autoencoders
Srikar Yellapragada
Alexandros Graikos
Kostas Triaridis
Zilinghan Li
Tarak Nandi
Ravi K. Madduri
Prateek Prasanna
Joel H. Saltz
Dimitris Samaras
MedIm
33
0
0
14 Mar 2025
Spatio-temporal Fourier Transformer (StFT) for Long-term Dynamics Prediction
Spatio-temporal Fourier Transformer (StFT) for Long-term Dynamics Prediction
Da Long
Shandian Zhe
Samuel Williams
L. Oliker
Zhe Bai
AI4TS
AI4CE
39
0
0
14 Mar 2025
Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization
Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization
Kyle Sargent
Kyle Hsu
Justin Johnson
L. Fei-Fei
Jiajun Wu
DiffM
MU
53
2
0
14 Mar 2025
Upcycling Text-to-Image Diffusion Models for Multi-Task Capabilities
Upcycling Text-to-Image Diffusion Models for Multi-Task Capabilities
Ruchika Chavhan
Abhinav Mehrotra
Malcolm Chadwick
Alberto Gil C. P. Ramos
Luca Morreale
Mehdi Noroozi
Sourav Bhattacharya
44
0
0
14 Mar 2025
Direction-Aware Diagonal Autoregressive Image Generation
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
47
0
0
14 Mar 2025
Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models
Hao-Ran Cheng
Erjia Xiao
Yichi Wang
Kaidi Xu
Mengshu Sun
Jindong Gu
Renjing Xu
36
0
0
14 Mar 2025
MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation
Sungwoo Cho
J. Choi
Sungnyun Kim
Se-Young Yun
54
0
0
14 Mar 2025
AugGen: Synthetic Augmentation Can Improve Discriminative Models
Parsa Rahimi
Damien Teney
S´ebastien Marcel
64
0
0
14 Mar 2025
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
Jianhong Bai
Menghan Xia
Xiao Fu
Xintao Wang
Lianrui Mu
...
Zuozhu Liu
Haoji Hu
Xiang Bai
Pengfei Wan
Di Zhang
DiffM
VGen
43
3
0
14 Mar 2025
Quantifying Interpretability in CLIP Models with Concept Consistency
Avinash Madasu
Vasudev Lal
Phillip Howard
VLM
62
0
0
14 Mar 2025
Long Context Tuning for Video Generation
Yuwei Guo
Ceyuan Yang
Ziyan Yang
Zhibei Ma
Zhijie Lin
Zhenheng Yang
Dahua Lin
Lu Jiang
DiffM
VGen
72
2
0
13 Mar 2025
Investigating and Improving Counter-Stereotypical Action Relation in Text-to-Image Diffusion Models
Sina Malakouti
Adriana Kovashka
EGVM
67
0
0
13 Mar 2025
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Rongyao Fang
Chengqi Duan
Kun Wang
Linjiang Huang
Hao Li
...
Xingyu Zeng
R. Zhao
Jifeng Dai
Xihui Liu
Hongsheng Li
MLLM
ReLM
LRM
104
5
0
13 Mar 2025
R^RRFLAV: Rolling Flow matching for infinite Audio Video generation
Alex Ergasti
Giuseppe Tarollo
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
VGen
40
0
0
13 Mar 2025
Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation
Yi Wu
Lingting Zhu
Lei Liu
Wandi Qiao
Ziqiang Li
Lequan Yu
Bin Li
DiffM
47
0
0
13 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Yu Qiao
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
76
7
0
13 Mar 2025
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models
Runze He
Bo Cheng
Yuhang Ma
Qingxiang Jia
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Liebucha Wu
Dawei Leng
Yuhui Yin
DiffM
VLM
47
0
0
13 Mar 2025
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
Ziyan Guo
Zeyu Hu
Na Zhao
De Wen Soh
VGen
80
2
0
13 Mar 2025
Previous
123456...141516
Next