ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2209.12152
  4. Cited By
All are Worth Words: A ViT Backbone for Diffusion Models

All are Worth Words: A ViT Backbone for Diffusion Models

25 September 2022
Fan Bao
Shen Nie
Kaiwen Xue
Yue Cao
Chongxuan Li
Hang Su
Jun Zhu
    VLM
ArXivPDFHTML

Papers citing "All are Worth Words: A ViT Backbone for Diffusion Models"

50 / 65 papers shown
Title
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
D. Jiang
Mengmeng Wang
Liuzhuozheng Li
Lei Zhang
Haoyu Wang
Wei Wei
Guang Dai
Yanning Zhang
Jingdong Wang
DiffM
44
0
0
05 May 2025
AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
Yunhao Li
Sijing Wu
Wei Sun
Zhichao Zhang
Yucheng Zhu
Zicheng Zhang
Huiyu Duan
Xiongkuo Min
Guangtao Zhai
EGVM
81
0
0
30 Apr 2025
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
Qihao Liu
Ju He
Qihang Yu
Liang-Chieh Chen
Alan Yuille
DiffM
VGen
75
0
0
30 Apr 2025
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Xu Ma
Peize Sun
Haoyu Ma
Hao Tang
Chih-Yao Ma
...
Matt Feiszli
Peizhao Zhang
Peter Vajda
Sam S. Tsai
Y. Fu
68
1
0
24 Apr 2025
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Theodoros Kouzelis
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
DiffM
26
0
0
22 Apr 2025
U-Shape Mamba: State Space Model for faster diffusion
U-Shape Mamba: State Space Model for faster diffusion
Alex Ergasti
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
Mamba
87
0
0
18 Apr 2025
COP-GEN-Beta: Unified Generative Modelling of COPernicus Imagery Thumbnails
COP-GEN-Beta: Unified Generative Modelling of COPernicus Imagery Thumbnails
Miguel Espinosa
V. Marsocci
Yuru Jia
Elliot J. Crowley
Mikolaj Czerkawski
DiffM
47
0
0
11 Apr 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
66
0
0
09 Apr 2025
UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models
UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models
Fanghua Yu
Jinjin Gu
Jinfan Hu
Zheyuan Li
Chao Dong
DiffM
50
0
0
21 Mar 2025
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
Tsu-jui Fu
Yusu Qian
Chen Chen
Wenze Hu
Zhe Gan
Y. Yang
85
1
0
16 Mar 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Ju He
Qihang Yu
Qihao Liu
Liang-Chieh Chen
66
0
0
13 Mar 2025
Rethinking Diffusion Model in High Dimension
Rethinking Diffusion Model in High Dimension
Zhenxin Zheng
Zhenjie Zheng
DiffM
41
0
0
11 Mar 2025
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
Xing Xie
Jiawei Liu
Ziyue Lin
Huijie Fan
Zhi-Long Han
Yandong Tang
Liangqiong Qu
40
0
0
10 Mar 2025
Effective and Efficient Masked Image Generation Models
Effective and Efficient Masked Image Generation Models
Zebin You
Jingyang Ou
Xiaolu Zhang
Jun Hu
Jun Zhou
Chongxuan Li
DiffM
VLM
54
1
0
10 Mar 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
76
6
0
27 Feb 2025
DiffGuard: Text-Based Safety Checker for Diffusion Models
DiffGuard: Text-Based Safety Checker for Diffusion Models
Massine El Khader
Elias Al Bouzidi
Abdellah Oumida
Mohammed Sbaihi
Eliott Binard
Jean-Philippe Poli
Wassila Ouerdane
Boussad Addad
Katarzyna Kapusta
DiffM
114
0
0
20 Feb 2025
Large Language Diffusion Models
Large Language Diffusion Models
Shen Nie
Fengqi Zhu
Zebin You
Xiaolu Zhang
Jingyang Ou
Jun Hu
Jun Zhou
Yankai Lin
Ji-Rong Wen
Chongxuan Li
100
12
0
14 Feb 2025
History-Guided Video Diffusion
Kiwhan Song
Boyuan Chen
Max Simchowitz
Yilun Du
Russ Tedrake
Vincent Sitzmann
VGen
109
7
0
10 Feb 2025
Visual Generation Without Guidance
Huayu Chen
Kai Jiang
Kaiwen Zheng
Jianfei Chen
Hang Su
J. Zhu
55
0
0
28 Jan 2025
Robust Representation Consistency Model via Contrastive Denoising
Robust Representation Consistency Model via Contrastive Denoising
Jiachen Lei
Julius Berner
Jiongxiao Wang
Zhongzhu Chen
Zhongjia Ba
Kui Ren
Jun Zhu
Anima Anandkumar
DiffM
77
0
0
22 Jan 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Jinwei Gu
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
90
0
0
21 Jan 2025
SOEDiff: Efficient Distillation for Small Object Editing
SOEDiff: Efficient Distillation for Small Object Editing
Yiming Wu
Qihe Pan
Zhen Zhao
Zicheng Wang
Sifan Long
Ronghua Liang
DiffM
60
0
0
03 Jan 2025
Video Diffusion Transformers are In-Context Learners
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
142
2
0
14 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
H. Chen
Z. Wang
X. Li
X. Sun
Fangyi Chen
Jiang Liu
J. Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
106
6
0
14 Dec 2024
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
Ximing Xing
Juncheng Hu
Jing Zhang
Dong Xu
Qian Yu
83
1
0
11 Dec 2024
LaVin-DiT: Large Vision Diffusion Transformer
Zhaoqing Wang
Xiaobo Xia
Runnan Chen
Dongdong Yu
Changhu Wang
M. Gong
Tongliang Liu
92
6
0
18 Nov 2024
More Expressive Attention with Negative Weights
More Expressive Attention with Negative Weights
Ang Lv
Ruobing Xie
Shuaipeng Li
Jiayi Liao
X. Sun
Zhanhui Kang
Di Wang
Rui Yan
30
0
0
11 Nov 2024
DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation
DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation
Hao Phung
Quan Dao
T. Dao
Hoang Phan
Dimitris Metaxas
Anh Tran
Mamba
62
3
0
06 Nov 2024
Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion
Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion
Emiel Hoogeboom
Thomas Mensink
Jonathan Heek
Kay Lamerigts
Ruiqi Gao
Tim Salimans
64
6
0
25 Oct 2024
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
Xuyuan Li
Zengqiang Shang
Hua Hua
Peiyang Shi
Chen Yang
Li Wang
Pengyuan Zhang
40
2
0
16 Oct 2024
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Luping Liu
Chao Du
Tianyu Pang
Zehan Wang
Chongxuan Li
Dong Xu
VLM
51
5
0
15 Oct 2024
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai
Tian-Chun Ye
Wei Chow
Enxin Song
Qing-Guo Chen
Xiangtai Li
Zhen Dong
Lei Zhu
50
13
0
10 Oct 2024
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Songming Liu
Lingxuan Wu
Bangguo Li
Hengkai Tan
Huayu Chen
Zhengyi Wang
Ke Xu
Hang Su
Jun Zhu
29
72
0
10 Oct 2024
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu
Sangkyung Kwak
Huiwon Jang
Jongheon Jeong
Jonathan Huang
Jinwoo Shin
Saining Xie
OCL
68
62
0
09 Oct 2024
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
Jintao Zhang
Jia wei
Pengle Zhang
Jun-Jie Zhu
Jun Zhu
Jianfei Chen
VLM
MQ
82
18
0
03 Oct 2024
An Image is Worth 32 Tokens for Reconstruction and Generation
An Image is Worth 32 Tokens for Reconstruction and Generation
Qihang Yu
Mark Weber
XueQing Deng
Xiaohui Shen
Daniel Cremers
Liang-Chieh Chen
VLM
ViT
44
79
0
11 Jun 2024
MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
X. Wang
Siming Fu
Qihan Huang
Wanggui He
Hao Jiang
DiffM
36
41
0
11 Jun 2024
PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction
PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction
Eduard Poesina
Adriana Valentina Costache
Adrian-Gabriel Chifu
Josiane Mothe
Radu Tudor Ionescu
VLM
37
1
0
07 Jun 2024
ReDistill: Residual Encoded Distillation for Peak Memory Reduction of CNNs
ReDistill: Residual Encoded Distillation for Peak Memory Reduction of CNNs
Fang Chen
Gourav Datta
Mujahid Al Rafi
Hyeran Jeon
Meng Tang
91
1
0
06 Jun 2024
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
Jingyang Ou
Shen Nie
Kaiwen Xue
Fengqi Zhu
Jiacheng Sun
Zhenguo Li
Chongxuan Li
DiffM
41
27
0
06 Jun 2024
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Tianchen Zhao
Tongcheng Fang
Haofeng Huang
Enshu Liu
Widyadewi Soedarmadji
...
Shengen Yan
Huazhong Yang
Xuefei Ning
Xuefei Ning
Yu Wang
MQ
VGen
97
22
0
04 Jun 2024
EasyAnimate: A High-Performance Long Video Generation Method based on
  Transformer Architecture
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Jiaqi Xu
Xinyi Zou
Kunzhe Huang
Yunkuo Chen
Bo Liu
Mengli Cheng
Xing Shi
Jun Huang
VGen
32
34
0
29 May 2024
Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
Zizhao Hu
Mohammad Rostami
27
0
0
25 May 2024
Training-free Subject-Enhanced Attention Guidance for Compositional
  Text-to-image Generation
Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
Shengyuan Liu
Bo Wang
Ye Ma
Te Yang
Xipeng Cao
Quan Chen
Han Li
Di Dong
Peng Jiang
EGVM
31
2
0
11 May 2024
SparseDM: Toward Sparse Efficient Diffusion Models
SparseDM: Toward Sparse Efficient Diffusion Models
Kafeng Wang
Jianfei Chen
He Li
Zhenpeng Mi
Jun-Jie Zhu
DiffM
49
8
0
16 Apr 2024
YaART: Yet Another ART Rendering Technology
YaART: Yet Another ART Rendering Technology
Sergey Kastryulin
Artem Konev
Alexander Shishenya
Eugene Lyapustin
Artem Khurshudov
...
Dmitrii Kornilov
Mikhail Romanov
Artem Babenko
Sergei Ovcharenko
Valentin Khrulkov
EGVM
28
1
0
08 Apr 2024
IPT-V2: Efficient Image Processing Transformer using Hierarchical
  Attentions
IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions
Zhijun Tu
Kunpeng Du
Hanting Chen
Hai-lin Wang
Wei Li
Jie Hu
Yunhe Wang
ViT
31
4
0
31 Mar 2024
Diffusion Models are Geometry Critics: Single Image 3D Editing Using
  Pre-Trained Diffusion Priors
Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors
Ruicheng Wang
Jianfeng Xiang
Jiaolong Yang
Xin Tong
DiffM
22
4
0
18 Mar 2024
A Comprehensive Survey of Convolutions in Deep Learning: Applications,
  Challenges, and Future Trends
A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends
Abolfazl Younesi
Mohsen Ansari
Mohammadamin Fazli
A. Ejlali
Muhammad Shafique
Joerg Henkel
3DV
38
44
0
23 Feb 2024
Latte: Latent Diffusion Transformer for Video Generation
Latte: Latent Diffusion Transformer for Video Generation
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Z. Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
DiffM
VGen
123
233
0
05 Jan 2024
12
Next