ResearchTrend.AI
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
    DiffM

Papers citing "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"

50 / 796 papers shown
LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping
Pascal Chang
Sergio Sancho
Jingwei Tang
Markus Gross
Vinicius Azevedo
28
0
0
11 Apr 2025
MixDiT: Accelerating Image Diffusion Transformer Inference with Mixed-Precision MX Quantization
Daeun Kim
Jinwoo Hwang
Changhun Oh
Jongse Park
MQ
35
0
0
11 Apr 2025
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
Team Seawead
Ceyuan Yang
Zhijie Lin
Yang Zhao
Shanchuan Lin
...
Zuquan Song
Zhenheng Yang
Jiashi Feng
Jianchao Yang
Lu Jiang
DiffM
77
1
0
11 Apr 2025
DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows
Mashrur M. Morshed
Vishnu Boddeti
33
0
0
10 Apr 2025
PixelFlow: Pixel-Space Generative Models with Flow
Shoufa Chen
Chongjian Ge
Shilong Zhang
Peize Sun
Ping Luo
VLM
DRL
33
0
0
10 Apr 2025
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning
Zhong-Yu Li
Ruoyi Du
Juncheng Yan
Le Zhuo
Zhen Li
Peng Gao
Zhanyu Ma
Ming-Ming Cheng
VLM
68
2
0
10 Apr 2025
EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation
Diljeet Jagpal
Xi Chen
Vinay P. Namboodiri
DiffM
VGen
43
0
0
09 Apr 2025
SIGMAN: Scaling 3D Human Gaussian Generation with Millions of Assets
Yuhang Yang
Fengqi Liu
Yixing Lu
Qin Zhao
Pingyu Wu
...
Ran Yi
Yang Cao
Lizhuang Ma
Zheng-jun Zha
Junting Dong
3DGS
40
0
0
09 Apr 2025
PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering
Y. Gao
Zihang Lin
Chuanbin Liu
Min Zhou
T. Ge
Bo Zheng
Hongtao Xie
DiffM
35
0
0
09 Apr 2025
Latent Diffusion U-Net Representations Contain Positional Embeddings and Anomalies
Jonas Loos
Lorenz Linhardt
26
0
0
09 Apr 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
63
0
0
09 Apr 2025
Flash Sculptor: Modular 3D Worlds from Objects
Yujia Hu
Songhua Liu
Xingyi Yang
Xinchao Wang
34
0
0
08 Apr 2025
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis
Tri Ton
Ji Woo Hong
Chang D. Yoo
VGen
24
0
0
08 Apr 2025
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
Jiazi Bu
Pengyang Ling
Yujie Zhou
Pan Zhang
Tong Wu
Xiaoyi Dong
Yuhang Zang
Y. Cao
D. Lin
Jiaqi Wang
19
0
0
08 Apr 2025
Transfer between Modalities with MetaQueries
Xichen Pan
Satya Narayan Shukla
Aashu Singh
Zhuokai Zhao
Shlok Kumar Mishra
...
Jiuhai Chen
Kunpeng Li
F. Xu
Ji Hou
Saining Xie
DiffM
41
6
0
08 Apr 2025
Disentangling Instruction Influence in Diffusion Transformers for Parallel Multi-Instruction-Guided Image Editing
Hui Liu
Bin Zou
Suiyun Zhang
Kecheng Chen
Rui Liu
Haoliang Li
DiffM
64
0
0
07 Apr 2025
Studying Image Diffusion Features for Zero-Shot Video Object Segmentation
Thanos Delatolas
Vicky S. Kalogeiton
Dim P. Papadopoulos
DiffM
VOS
43
1
0
07 Apr 2025
Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision
Yuandong Pu
Le Zhuo
Kaiwen Zhu
Liangbin Xie
Wenlong Zhang
Xiangyu Chen
Peng Gao
Yu Qiao
Chao Dong
Yihao Liu
MLLM
59
1
0
07 Apr 2025
Gaussian Mixture Flow Matching Models
Hansheng Chen
Kai Zhang
Hao Tan
Zexiang Xu
Fujun Luan
Leonidas J. Guibas
Gordon Wetzstein
Sai Bi
DiffM
61
0
0
07 Apr 2025
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Mengchao Wang
Qiang Wang
Fan Jiang
Yaqi Fan
Yunpeng Zhang
Yonggang Qi
Kun Zhao
Mu Xu
DiffM
VGen
29
0
0
07 Apr 2025
CREA: A Collaborative Multi-Agent Framework for Creative Content Generation with Diffusion Models
Kavana Venkatesh
Connor Dunlop
Pinar Yanardag
DiffM
27
0
0
07 Apr 2025
UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding
Yang Jiao
Haibo Qiu
Zequn Jie
S. Chen
Jingjing Chen
Lin Ma
Yu Jiang
26
2
0
06 Apr 2025
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov
Di Chang
Minh Tran
Hongkun Gong
Ashutosh Chaubey
Mohammad Soleymani
DiffM
VGen
23
0
0
05 Apr 2025
SDEIT: Semantic-Driven Electrical Impedance Tomography
Dong Liu
Yuanchao Wu
Bowen Tong
Jiansong Deng
DiffM
28
0
0
05 Apr 2025
Generating ensembles of spatially-coherent in-situ forecasts using flow matching
David Landry
C. Monteleoni
A. Charantonis
60
0
0
04 Apr 2025
Conditioning Diffusions Using Malliavin Calculus
Jakiw Pidstrigach
Elizabeth Baker
Carles Domingo-Enrich
George Deligiannidis
Nikolas Nüsken
DiffM
30
0
0
04 Apr 2025
Detection Limits and Statistical Separability of Tree Ring Watermarks in Rectified Flow-based Text-to-Image Generation Models
Ved Umrajkar
Aakash Kumar Singh
21
0
0
04 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Li Yuan
MLLM
EGVM
88
8
0
03 Apr 2025
OmniTalker: Real-Time Text-Driven Talking Head Generation with In-Context Audio-Visual Style Replication
Zhongjian Wang
Peng Zhang
Jinwei Qi
Guangyuan Wang
Sheng Xu
Bang Zhang
Liefeng Bo
DiffM
VGen
36
0
0
03 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
D. Li
Di Qiu
J. Wang
Yikun Dou
...
J. Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
63
2
0
03 Apr 2025
FlowR: Flowing from Sparse to Dense 3D Reconstructions
Tobias Fischer
Samuel Rota Buló
Yung-Hsu Yang
Nikhil Varma Keetha
Lorenzo Porzi
Norman Muller
Katja Schwarz
Jonathon Luiten
Marc Pollefeys
Peter Kontschieder
3DGS
48
0
0
02 Apr 2025
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
Shaojin Wu
Mengqi Huang
Wenxu Wu
Yufeng Cheng
Fei Ding
Qian He
DiffM
50
4
0
02 Apr 2025
Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis
Zixuan Wang
Duo Peng
Feng Chen
Y. Yang
Yinjie Lei
DiffM
74
0
0
02 Apr 2025
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance
Yuxuan Luo
Zhengkun Rong
Lizhen Wang
Longhao Zhang
Tianshu Hu
Yongming Zhu
VGen
68
0
0
02 Apr 2025
Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model
Jincheng Zhong
Xiangcheng Zhang
J. Z. Wang
Mingsheng Long
35
1
0
02 Apr 2025
WorldPrompter: Traversable Text-to-Scene Generation
Zhaoyang Zhang
Yannick Hold-Geoffroy
Miloš Hašan
Chen Ziwen
Fujun Luan
Julie Dorsey
Yiwei Hu
VGen
45
0
0
02 Apr 2025
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Huayang Huang
Xiangye Jin
Jiaxu Miao
Yu Wu
29
0
0
02 Apr 2025
Watermarking for AI Content Detection: A Review on Text, Visual, and Audio Modalities
Lele Cao
36
0
0
02 Apr 2025
Distilling Multi-view Diffusion Models into 3D Generators
Hao Qin
Luyuan Chen
Ming Kong
Mengxu Lu
Qiang Zhu
3DGS
64
0
0
01 Apr 2025
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning
Hang Guo
Yawei Li
Taolin Zhang
J. Wang
Tao Dai
Shu-Tao Xia
Luca Benini
64
1
0
30 Mar 2025
A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
Leander Girrbach
Stephan Alaniz
Genevieve Smith
Zeynep Akata
40
0
0
30 Mar 2025
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
Nikai Du
Zhennan Chen
Z. Chen
Shan Gao
Xi Chen
Zhengkai Jiang
Jian Yang
Ying Tai
DiffM
38
0
0
30 Mar 2025
DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution
Zheng-Peng Duan
Jiawei Zhang
Xin Jin
Z. Zhang
Zheng Xiong
Dongqing Zou
Jimmy S. Ren
Chun-Le Guo
Chongyi Li
37
0
0
30 Mar 2025
On Geometrical Properties of Text Token Embeddings for Strong Semantic Binding in Text-to-Image Generation
H. Seo
Junseo Bang
Haechang Lee
Joohoon Lee
Byung Hyun Lee
Se Young Chun
46
0
0
29 Mar 2025
Synthetic Art Generation and DeepFake Detection A Study on Jamini Roy Inspired Dataset
Kushal Agrawal
Romi Banerjee
41
0
0
29 Mar 2025
MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs
Xianglong He
Junyi Chen
Di Huang
Zexiang Liu
Xiaoshui Huang
Wanli Ouyang
C. Yuan
Yangguang Li
DiffM
52
0
0
29 Mar 2025
EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation
Hadrien Reynaud
Alberto Gomez
Paul Leeson
Qingjie Meng
B. Kainz
MedIm
54
0
0
28 Mar 2025
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation
Minho Park
S. Park
Jungsoo Lee
Hyojin Park
Kyuwoong Hwang
Fatih Porikli
Jaegul Choo
Sungha Choi
29
0
0
28 Mar 2025
DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers
H. Zhang
R. Su
Zhihang Yuan
Pengtao Chen
Mingzhu Shen
Yibo Fan
Shengen Yan
Guohao Dai
Yu Wang
39
0
0
28 Mar 2025
Masked Self-Supervised Pre-Training for Text Recognition Transformers on Large-Scale Datasets
Martin Kiss
Michal Hradiš
34
0
0
28 Mar 2025