ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2212.09748
  4. Cited By
Scalable Diffusion Models with Transformers

Scalable Diffusion Models with Transformers

19 December 2022
William S. Peebles
Saining Xie
    GNN
ArXivPDFHTML

Papers citing "Scalable Diffusion Models with Transformers"

50 / 463 papers shown
Title
Step1X-Edit: A Practical Framework for General Image Editing
Step1X-Edit: A Practical Framework for General Image Editing
S. Liu
Yucheng Han
Peng Xing
Fukun Yin
Rui Wang
...
Yibo Zhu
Binxing Jiao
X. Zhang
Gang Yu
Daxin Jiang
DiffM
105
3
0
24 Apr 2025
DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks
DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks
Yinqi Li
Hong Chang
Ruibing Hou
Shiguang Shan
Xilin Chen
DiffM
55
0
0
24 Apr 2025
Subject-driven Video Generation via Disentangled Identity and Motion
Subject-driven Video Generation via Disentangled Identity and Motion
Daneul Kim
Jingxu Zhang
W. Jin
Sunghyun Cho
Qi Dai
Jaesik Park
Chong Luo
DiffM
VGen
115
0
0
23 Apr 2025
DreamO: A Unified Framework for Image Customization
DreamO: A Unified Framework for Image Customization
Chong Mou
Yanze Wu
Wenxu Wu
Zinan Guo
Pengze Zhang
...
Shaojin Wu
S. Zhao
Jian Andrew Zhang
Qian He
Xinglong Wu
49
0
0
23 Apr 2025
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
X. Li
Chenming Wu
Zhao Yang
Zhihao Xu
Dingkang Liang
Y. Zhang
Ji Wan
J. Wang
VGen
67
1
0
22 Apr 2025
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
Le Zhuo
Liangbing Zhao
Sayak Paul
Yue Liao
Renrui Zhang
Yi Xin
Peng Gao
Mohamed Elhoseiny
H. Li
VLM
75
0
0
22 Apr 2025
DiTPainter: Efficient Video Inpainting with Diffusion Transformers
DiTPainter: Efficient Video Inpainting with Diffusion Transformers
Xian Wu
Chang Liu
DiffM
26
0
0
22 Apr 2025
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Theodoros Kouzelis
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
DiffM
38
0
0
22 Apr 2025
Structure-guided Diffusion Transformer for Low-Light Image Enhancement
Structure-guided Diffusion Transformer for Low-Light Image Enhancement
Xiangchen Yin
Zhenda Yu
Longtao Jiang
Xin Gao
Xiao Sun
Zhi Liu
Xun Yang
DiffM
47
0
0
21 Apr 2025
Text-to-Decision Agent: Learning Generalist Policies from Natural Language Supervision
Text-to-Decision Agent: Learning Generalist Policies from Natural Language Supervision
Shilin Zhang
Zican Hu
Wenhao Wu
Xinyi Xie
Jianxiang Tang
Chunlin Chen
Daoyi Dong
Yu Cheng
Zhenhong Sun
Zhi Wang
OffRL
134
0
0
21 Apr 2025
SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization
SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization
Liang Peng
Boxi Wu
Haoran Cheng
Yibo Zhao
Xiaofei He
36
0
0
20 Apr 2025
Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis
Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis
Jingjing Ren
Wenbo Li
Zhongdao Wang
Haoze Sun
Bangzhen Liu
...
Aoxue Li
Shifeng Zhang
Bin Shao
Yong Guo
Lei Zhu
VGen
43
0
0
20 Apr 2025
U-Shape Mamba: State Space Model for faster diffusion
U-Shape Mamba: State Space Model for faster diffusion
Alex Ergasti
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
Mamba
92
0
0
18 Apr 2025
SkyReels-V2: Infinite-length Film Generative Model
SkyReels-V2: Infinite-length Film Generative Model
Guibin Chen
D. Lin
Jiangping Yang
Chunze Lin
J. Zhu
...
Di Qiu
Debang Li
Zhengcong Fei
Yang Li
Yahui Zhou
DiffM
VGen
56
1
0
17 Apr 2025
A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
Rongtao Xu
J. Zhang
Minghao Guo
Youpeng Wen
H. Yang
...
Liqiong Wang
Yuxuan Kuang
Meng Cao
Feng Zheng
Xiaodan Liang
47
3
0
17 Apr 2025
NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results
NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results
Xin Li
Kun Yuan
B. Li
Fengbin Guan
Yizhen Shao
...
Guohua Zhang
Z. Huang
Y. Deng
Qingmiao Jiang
Lu Chen
55
7
0
17 Apr 2025
Understanding Attention Mechanism in Video Diffusion Models
Understanding Attention Mechanism in Video Diffusion Models
Bingyan Liu
Chengyu Wang
Tongtong Su
Huan Ten
Jun Huang
K. Guo
Kui Jia
VGen
64
0
0
16 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
76
0
0
16 Apr 2025
Decoupled Diffusion Sparks Adaptive Scene Generation
Decoupled Diffusion Sparks Adaptive Scene Generation
Yunsong Zhou
Naisheng Ye
William Ljungbergh
Tianyu Li
Jiazhi Yang
Zetong Yang
Hongzi Zhu
Christoffer Petersson
Hongyang Li
44
1
0
14 Apr 2025
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Chunyang Zhang
Zhenhong Sun
Zhicheng Zhang
Junyan Wang
Yu Zhang
Dong Gong
H. Mo
Daoyi Dong
45
0
0
14 Apr 2025
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
Team Seawead
Ceyuan Yang
Zhijie Lin
Yang Zhao
Shanchuan Lin
...
Zuquan Song
Zhenheng Yang
Jiashi Feng
Jianchao Yang
Lu Jiang
DiffM
83
1
0
11 Apr 2025
Diffusion Transformers for Tabular Data Time Series Generation
Diffusion Transformers for Tabular Data Time Series Generation
Fabrizio Garuti
E. Sangineto
Simone Luetto
L. Forni
Rita Cucchiara
59
0
0
10 Apr 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
69
0
0
09 Apr 2025
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
Jiazi Bu
Pengyang Ling
Yujie Zhou
Pan Zhang
Tong Wu
Xiaoyi Dong
Yuhang Zang
Y. Cao
D. Lin
Jiaqi Wang
21
0
0
08 Apr 2025
Gaussian Mixture Flow Matching Models
Gaussian Mixture Flow Matching Models
Hansheng Chen
Kai Zhang
Hao Tan
Zexiang Xu
Fujun Luan
Leonidas J. Guibas
Gordon Wetzstein
Sai Bi
DiffM
63
0
0
07 Apr 2025
Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision
Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision
Yuandong Pu
Le Zhuo
Kaiwen Zhu
Liangbin Xie
Wenlong Zhang
Xiangyu Chen
Peng Gao
Yu Qiao
Chao Dong
Yihao Liu
MLLM
66
1
0
07 Apr 2025
Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets
Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets
Chuning Zhu
Raymond Yu
S. Feng
Benjamin Burchfiel
Paarth Shah
Abhishek Gupta
VGen
57
0
0
03 Apr 2025
Spline-based Transformers
Spline-based Transformers
Prashanth Chandran
Agon Serifi
Markus Gross
Moritz Bächer
41
0
0
03 Apr 2025
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance
Yuxuan Luo
Zhengkun Rong
Lizhen Wang
Longhao Zhang
Tianshu Hu
Yongming Zhu
VGen
163
2
0
02 Apr 2025
Biologically Inspired Spiking Diffusion Model with Adaptive Lateral Selection Mechanism
Biologically Inspired Spiking Diffusion Model with Adaptive Lateral Selection Mechanism
Linghao Feng
Dongcheng Zhao
Sicheng Shen
Yi Zeng
67
0
0
31 Mar 2025
ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion
ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion
Rana Muhammad Shahroz Khan
Dongwen Tang
Pingzhi Li
Kai Wang
Tianlong Chen
AI4CE
142
0
0
31 Mar 2025
DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution
DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution
Zheng-Peng Duan
Jiawei Zhang
Xin Jin
Z. Zhang
Zheng Xiong
Dongqing Zou
Jimmy S. Ren
Chun-Le Guo
Chongyi Li
42
0
0
30 Mar 2025
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
Nikai Du
Zhennan Chen
Z. Chen
Shan Gao
Xi Chen
Zhengkai Jiang
Jian Yang
Ying Tai
DiffM
43
0
0
30 Mar 2025
CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving
CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving
Yishen Ji
Ziyue Zhu
Zhenxin Zhu
Kaixin Xiong
Ming Lu
Zhiqi Li
Lijun Zhou
Haiyang Sun
Bing Wang
Tong Lu
VGen
53
1
0
28 Mar 2025
SyncSDE: A Probabilistic Framework for Diffusion Synchronization
SyncSDE: A Probabilistic Framework for Diffusion Synchronization
Hyunjun Lee
Hyunsoo Lee
Sookwan Han
DiffM
48
0
0
27 Mar 2025
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Size Wu
W. Zhang
Lumin Xu
Sheng Jin
Zhonghua Wu
Qingyi Tao
Wentao Liu
Wei Li
Chen Change Loy
VGen
153
2
0
27 Mar 2025
MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation
MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation
Jinnan Chen
Lingting Zhu
Zeyu Hu
Shengju Qian
Y. Chen
Xin Wang
G. Lee
105
1
0
26 Mar 2025
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models
Prin Phunyaphibarn
Phillip Y. Lee
Jaihoon Kim
Minhyuk Sung
DiffM
89
0
0
26 Mar 2025
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
Jun Zhou
J. Li
Zunnan Xu
Hanhui Li
Yiji Cheng
Fa-Ting Hong
Qin Lin
Qinglin Lu
Xiaodan Liang
DiffM
73
1
0
25 Mar 2025
IPGO: Indirect Prompt Gradient Optimization for Parameter-Efficient Prompt-level Fine-Tuning on Text-to-Image Models
IPGO: Indirect Prompt Gradient Optimization for Parameter-Efficient Prompt-level Fine-Tuning on Text-to-Image Models
Jianping Ye
Michel Wedel
Kunpeng Zhang
37
0
0
25 Mar 2025
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Yuchao Gu
Weijia Mao
Mike Zheng Shou
VGen
84
2
0
25 Mar 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
62
3
0
24 Mar 2025
Training-free Diffusion Acceleration with Bottleneck Sampling
Training-free Diffusion Acceleration with Bottleneck Sampling
Ye Tian
Xin Xia
Yuxi Ren
Shanchuan Lin
Xing Wang
Xuefeng Xiao
Yunhai Tong
L. Yang
Bin Cui
60
0
0
24 Mar 2025
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
54
2
0
24 Mar 2025
SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction
SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction
Zhengyuan Li
Kai Cheng
Anindita Ghosh
Uttaran Bhattacharya
Liangyan Gui
Aniket Bera
DiffM
VGen
44
0
0
23 Mar 2025
UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models
UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models
Fanghua Yu
Jinjin Gu
Jinfan Hu
Zheyuan Li
Chao Dong
DiffM
52
0
0
21 Mar 2025
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang
Haoran Chen
Haoyu Zhao
Guansong Lu
Yanwei Fu
Hang Xu
Zuxuan Wu
VGen
DiffM
69
0
0
20 Mar 2025
Generating Multimodal Driving Scenes via Next-Scene Prediction
Generating Multimodal Driving Scenes via Next-Scene Prediction
Yanhao Wu
Haoyang Zhang
Tianwei Lin
Lichao Huang
Shujie Luo
Rui Wu
Congpei Qiu
Wei Ke
Tong Zhang
VGen
51
0
0
19 Mar 2025
MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space
MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space
Lixing Xiao
Shunlin Lu
Huaijin Pi
Ke Fan
Liang Pan
Yueer Zhou
Ziyong Feng
Xiaowei Zhou
Sida Peng
Jingbo Wang
DiffM
VGen
50
4
0
19 Mar 2025
PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation
PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation
Baiqin Wang
Xiangyu Zhu
Fan Shen
Hao-Xuan Xu
Zhen Lei
63
0
0
18 Mar 2025
Previous
12345...8910
Next