Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.14389
Cited By
MDTv2: Masked Diffusion Transformer is a Strong Image Synthesizer
25 March 2023
Shanghua Gao
Pan Zhou
Mingg-Ming Cheng
Shuicheng Yan
DiffM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MDTv2: Masked Diffusion Transformer is a Strong Image Synthesizer"
21 / 21 papers shown
Title
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
W. Xu
Shibiao Xu
ViT
31
0
0
06 May 2025
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Theodoros Kouzelis
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
DiffM
19
0
0
22 Apr 2025
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
38
0
0
14 Mar 2025
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation
Hyeonho Jeong
Suhyeon Lee
Jong Chul Ye
VGen
52
0
0
12 Mar 2025
USP: Unified Self-Supervised Pretraining for Image Generation and Understanding
Xiangxiang Chu
Renda Li
Yong Wang
53
0
0
08 Mar 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
71
5
0
27 Feb 2025
Visual Generation Without Guidance
Huayu Chen
Kai Jiang
Kaiwen Zheng
Jianfei Chen
Hang Su
J. Zhu
48
0
0
28 Jan 2025
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
H. Chen
Z. Wang
X. Li
X. Sun
Fangyi Chen
Jiang Liu
J. Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
92
6
0
14 Dec 2024
LaVin-DiT: Large Vision Diffusion Transformer
Zhaoqing Wang
Xiaobo Xia
Runnan Chen
Dongdong Yu
Changhu Wang
M. Gong
Tongliang Liu
92
6
0
18 Nov 2024
DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation
Hao Phung
Quan Dao
T. Dao
Hoang Phan
Dimitris Metaxas
Anh Tran
Mamba
28
3
0
06 Nov 2024
On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models
Tariq Berrada Ifriqi
Pietro Astolfi
Melissa Hall
Reyhane Askari Hemmat
Yohann Benchetrit
...
Matthew Muckley
Karteek Alahari
Adriana Romero Soriano
Jakob Verbeek
M. Drozdzal
AI4CE
VLM
30
3
0
05 Nov 2024
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu
Sangkyung Kwak
Huiwon Jang
Jongheon Jeong
Jonathan Huang
Jinwoo Shin
Saining Xie
OCL
60
59
0
09 Oct 2024
MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation
T. Pham
Tri Ton
Chang D. Yoo
28
3
0
03 Oct 2024
Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors
Haiyu Wu
Jaskirat Singh
Sicong Tian
Liang Zheng
Kevin W. Bowyer
CVBM
29
3
0
04 Sep 2024
Differentially Private Kernel Density Estimation
Erzhi Liu
Jerry Yao-Chieh Hu
Alex Reneau
Zhao Song
Han Liu
32
3
0
03 Sep 2024
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
Kengo Uchida
Takashi Shibuya
Yuhta Takida
Naoki Murata
Shusuke Takahashi
Shusuke Takahashi
Yuki Mitsufuji
VGen
26
4
0
04 Jun 2024
A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
Kai Wang
Yukun Zhou
Mingjia Shi
Zhihang Yuan
Yuzhang Shang
Yuzhang Shang
Hanwang Zhang
Hanwang Zhang
Yang You
60
9
0
27 May 2024
LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models
Seyedmorteza Sadat
Jakob Buhmann
Derek Bradley
Otmar Hilliges
Romann M. Weber
20
8
0
23 May 2024
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
188
515
0
02 Jan 2023
StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
Axel Sauer
Katja Schwarz
Andreas Geiger
174
354
0
01 Feb 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
255
5,353
0
11 Nov 2021
1