Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2111.14822
Cited By
Vector Quantized Diffusion Model for Text-to-Image Synthesis
29 November 2021
Shuyang Gu
Dong Chen
Jianmin Bao
Fang Wen
Bo Zhang
Dongdong Chen
Lu Yuan
B. Guo
DiffM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Vector Quantized Diffusion Model for Text-to-Image Synthesis"
50 / 563 papers shown
Title
SceneComposer: Any-Level Semantic Image Synthesis
Yu Zeng
Zhe-nan Lin
Jianming Zhang
Qing Liu
John Collomosse
Jason Kuen
Vishal M. Patel
DiffM
17
47
0
21 Nov 2022
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng-Wei Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffM
VLM
23
16
0
21 Nov 2022
Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training
Ling Yang
Zhilin Huang
Yang Song
Shenda Hong
G. Li
Wentao Zhang
Bin Cui
Bernard Ghanem
Ming-Hsuan Yang
17
52
0
21 Nov 2022
DiffusionDet: Diffusion Model for Object Detection
Shoufa Chen
Pei Sun
Yibing Song
Ping Luo
13
437
0
17 Nov 2022
Super-resolution Reconstruction of Single Image for Latent features
Xin Wang
Jingkai Yan
Jingyong Cai
Jiankang Deng
Qin Qin
Yao Cheng
DiffM
21
8
0
16 Nov 2022
HumanDiffusion: a Coarse-to-Fine Alignment Diffusion Framework for Controllable Text-Driven Person Image Generation
Kai Zhang
Muyi Sun
Jianxin Sun
Binghao Zhao
Kunbo Zhang
Zhenan Sun
T. Tan
DiffM
47
12
0
11 Nov 2022
Self-conditioned Embedding Diffusion for Text Generation
Robin Strudel
Corentin Tallec
Florent Altché
Yilun Du
Yaroslav Ganin
...
Will Grathwohl
Nikolay Savinov
Sander Dieleman
Laurent Sifre
Rémi Leblond
DiffM
13
83
0
08 Nov 2022
CLIP-Sculptor: Zero-Shot Generation of High-Fidelity and Diverse Shapes from Natural Language
Aditya Sanghi
Rao Fu
Vivian Liu
Karl Willis
Hooman Shayani
Amir Hosein Khasahmadi
Srinath Sridhar
Daniel E. Ritchie
15
51
0
02 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Ming-Yu Liu
VLM
MoE
15
801
0
02 Nov 2022
DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models
Cheng Lu
Yuhao Zhou
Fan Bao
Jianfei Chen
Chongxuan Li
Jun Zhu
DiffM
28
550
0
02 Nov 2022
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM
MLLM
CLIP
8
3,233
0
16 Oct 2022
One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulations
Yi-Chun Zhu
Hongyu Liu
Yibing Song
Ziyang Yuan
Xintong Han
Chun Yuan
Qifeng Chen
Jue Wang
VLM
DiffM
13
30
0
14 Oct 2022
Markup-to-Image Diffusion Models with Scheduled Sampling
Yuntian Deng
Noriyuki Kojima
Alexander M. Rush
DiffM
29
4
0
11 Oct 2022
HORIZON: High-Resolution Semantically Controlled Panorama Synthesis
Kun Yan
Lei Ji
Chenfei Wu
Jian Liang
Ming Zhou
Nan Duan
Shuai Ma
20
0
0
10 Oct 2022
clip2latent: Text driven sampling of a pre-trained StyleGAN using denoising diffusion and CLIP
Justin N. M. Pinkney
Chuan Li
CLIP
VLM
40
19
0
05 Oct 2022
Progressive Text-to-Image Generation
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
67
4
0
05 Oct 2022
Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu
Yuehua Wu
N. Sebe
Yan Yan
28
16
0
05 Oct 2022
Improving Sample Quality of Diffusion Models Using Self-Attention Guidance
Susung Hong
Gyuseong Lee
Wooseok Jang
Seung Wook Kim
DiffM
19
62
0
03 Oct 2022
Make-A-Video: Text-to-Video Generation without Text-Video Data
Uriel Singer
Adam Polyak
Thomas Hayes
Xiaoyue Yin
Jie An
...
Oron Ashual
Oran Gafni
Devi Parikh
Sonal Gupta
Yaniv Taigman
DiffM
VGen
22
1,339
0
29 Sep 2022
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Wenhu Chen
Hexiang Hu
Chitwan Saharia
William W. Cohen
VLM
114
159
0
29 Sep 2022
Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion
Nisha Huang
Fan Tang
Weiming Dong
Changsheng Xu
DiffM
58
40
0
27 Sep 2022
All are Worth Words: A ViT Backbone for Diffusion Models
Fan Bao
Shen Nie
Kaiwen Xue
Yue Cao
Chongxuan Li
Hang Su
Jun Zhu
VLM
11
312
0
25 Sep 2022
Diffusion Models in Vision: A Survey
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
M. Shah
DiffM
VLM
MedIm
188
1,098
0
10 Sep 2022
Improved Masked Image Generation with Token-Critic
José Lezama
Huiwen Chang
Lu Jiang
Irfan Essa
DiffM
180
43
0
09 Sep 2022
A Survey on Generative Diffusion Model
Hanqun Cao
Cheng Tan
Zhangyang Gao
Yilun Xu
Guangyong Chen
Pheng-Ann Heng
Stan Z. Li
MedIm
37
195
0
06 Sep 2022
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Bin Cui
Ming-Hsuan Yang
DiffM
MedIm
221
1,277
0
02 Sep 2022
Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis
Wanshu Fan
Yen-Chun Chen
Dongdong Chen
Yu Cheng
Lu Yuan
Yu-Chiang Frank Wang
DiffM
15
90
0
29 Aug 2022
G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model
Pan Xie
Qipeng Zhang
Zexian Li
Hao Tang
Yao Du
Xiaohui Hu
DiffM
36
12
0
19 Aug 2022
Layout-Bridging Text-to-Image Synthesis
Jiadong Liang
Wenjie Pei
Feng Lu
EGVM
14
15
0
12 Aug 2022
Diffsound: Discrete Diffusion Model for Text-to-sound Generation
Dongchao Yang
Jianwei Yu
Helin Wang
Wen Wang
Chao Weng
Yuexian Zou
Dong Yu
DiffM
25
288
0
20 Jul 2022
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
Chenfei Wu
Jian Liang
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Lijuan Wang
Zicheng Liu
Yuejian Fang
Nan Duan
VGen
10
72
0
20 Jul 2022
WaveGAN: Frequency-aware GAN for High-Fidelity Few-shot Image Generation
Mengping Yang
Zhe Wang
Ziqiu Chi
Wenyi Feng
15
46
0
15 Jul 2022
Vector Quantisation for Robust Segmentation
Ainkaran Santhirasekaram
Avinash Kori
Mathias Winkler
A. Rockall
Ben Glocker
OOD
14
9
0
05 Jul 2022
Semantic Image Synthesis via Diffusion Models
Weilun Wang
Weilun Wang
Wen-gang Zhou
Dongdong Chen
Dong Chen
Lu Yuan
Houqiang Li
DiffM
211
175
0
30 Jun 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
76
1,057
0
22 Jun 2022
StudioGAN: A Taxonomy and Benchmark of GANs for Image Synthesis
Minguk Kang
Joonghyuk Shin
Jaesik Park
EGVM
6
66
0
19 Jun 2022
Lossy Compression with Gaussian Diffusion
Lucas Theis
Tim Salimans
Matthew D. Hoffman
Fabian Mentzer
DiffM
22
76
0
17 Jun 2022
Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation
Ye Zhu
Yuehua Wu
Kyle Olszewski
Jian Ren
Sergey Tulyakov
Yan Yan
DiffM
20
47
0
15 Jun 2022
Draft-and-Revise: Effective Image Generation with Contextual RQ-Transformer
Doyup Lee
Chiheon Kim
Saehoon Kim
Minsu Cho
Wook-Shin Han
21
28
0
09 Jun 2022
Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models
W. H. Pinaya
M. Graham
Robert J. Gray
P. F. D. Costa
Petru-Daniel Tudosiu
...
D. Werring
Geraint Rees
P. Nachev
Sebastien Ourselin
M. Jorge Cardoso
DiffM
MedIm
19
101
0
07 Jun 2022
Blended Latent Diffusion
Omri Avrahami
Ohad Fried
Dani Lischinski
DiffM
47
368
0
06 Jun 2022
Compositional Visual Generation with Composable Diffusion Models
Nan Liu
Shuang Li
Yilun Du
Antonio Torralba
J. Tenenbaum
DiffM
CoGe
18
494
0
03 Jun 2022
DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder
Jie Shi
Chenfei Wu
Jian Liang
Xiang Liu
Nan Duan
DiffM
4
25
0
01 Jun 2022
Improved Vector Quantized Diffusion Models
Zhicong Tang
Shuyang Gu
Jianmin Bao
Dong Chen
Fang Wen
DiffM
173
63
0
31 May 2022
Text2Human: Text-Driven Controllable Human Image Generation
Yuming Jiang
Shuai Yang
Haonan Qiu
Wayne Wu
Chen Change Loy
Ziwei Liu
DiffM
107
45
0
31 May 2022
A Continuous Time Framework for Discrete Denoising Models
Andrew Campbell
Joe Benton
Valentin De Bortoli
Tom Rainforth
George Deligiannidis
Arnaud Doucet
DiffM
183
132
0
30 May 2022
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis
Yichong Leng
Zehua Chen
Junliang Guo
Haohe Liu
Jiawei Chen
...
Lei He
Xiang-Yang Li
Tao Qin
Sheng Zhao
Tie-Yan Liu
DiffM
51
58
0
30 May 2022
Pretraining is All You Need for Image-to-Image Translation
Tengfei Wang
Ting Zhang
Bo Zhang
Hao Ouyang
Dong Chen
Qifeng Chen
Fang Wen
DiffM
184
177
0
25 May 2022
Mutual Information Divergence: A Unified Metric for Multimodal Generative Models
Jin-Hwa Kim
Yunji Kim
Jiyoung Lee
Kang Min Yoo
Sang-Woo Lee
EGVM
19
32
0
25 May 2022
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
Ming Ding
Wendi Zheng
Wenyi Hong
Jie Tang
VLM
18
321
0
28 Apr 2022
Previous
1
2
3
...
10
11
12
Next