2012.09841
Cited By
v1
v2
v3 (latest)
Taming Transformers for High-Resolution Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2020
17 December 2020
Patrick Esser
Robin Rombach
Björn Ommer
ViT
Github (6185★)
Papers citing
"Taming Transformers for High-Resolution Image Synthesis"
50 / 2,402 papers shown
Pretraining is All You Need for Image-to-Image Translation
Tengfei Wang
Ting Zhang
Bo Zhang
Hao Ouyang
Dong Chen
Qifeng Chen
Fang Wen
DiffM
433
198
0
25 May 2022
ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions
ACM Transactions on Graphics (TOG), 2022
Difan Liu
Sandesh Shetty
Tobias Hinz
Matthew Fisher
Richard Y. Zhang
Taesung Park
E. Kalogerakis
ViT
193
42
0
24 May 2022
M6-Fashion: High-Fidelity Multi-modal Image Generation and Editing
Zhikang Li
Huiling Zhou
Shuai Bai
Peike Li
Chang Zhou
Hongxia Yang
184
5
0
24 May 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Neural Information Processing Systems (NeurIPS), 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
1.2K
7,502
0
23 May 2022
Transformer-based out-of-distribution detection for clinically safe segmentation
International Conference on Medical Imaging with Deep Learning (MIDL), 2022
M. Graham
Petru-Daniel Tudosiu
P. Wright
W. H. Pinaya
J. U-King-im
...
H. Jäger
D. Werring
P. Nachev
Sebastien Ourselin
M. Jorge Cardoso
MedIm
160
26
0
21 May 2022
UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes
Neural Information Processing Systems (NeurIPS), 2022
Alexander Kolesnikov
André Susano Pinto
Lucas Beyer
Xiaohua Zhai
Jeremiah Harmsen
N. Houlsby
366
80
0
20 May 2022
Towards Unified Keyframe Propagation Models
Patrick Esser
Peter Michael
Soumyadip Sengupta
VGen
127
0
0
19 May 2022
Masked Image Modeling with Denoising Contrast
International Conference on Learning Representations (ICLR), 2022
Kun Yi
Yixiao Ge
Xiaotong Li
Shusheng Yang
Dian Li
Jianping Wu
Ying Shan
Xiaohu Qie
VLM
205
65
0
19 May 2022
BBDM: Image-to-image Translation with Brownian Bridge Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Bo Li
Kaitao Xue
Yinan Han
Yunyu Lai
DiffM
408
237
0
16 May 2022
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization
International Conference on Machine Learning (ICML), 2022
Yuhta Takida
Takashi Shibuya
Wei-Hsiang Liao
Chieh-Hsin Lai
Junki Ohmura
Toshimitsu Uesaka
Naoki Murata
Shusuke Takahashi
Toshiyuki Kumakura
Yuki Mitsufuji
BDL
221
90
0
16 May 2022
VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
European Conference on Computer Vision (ECCV), 2022
Yuchao Gu
Xintao Wang
Liangbin Xie
Chao Dong
Gengyan Li
Ying Shan
Ming-Ming Cheng
199
166
0
13 May 2022
StyLandGAN: A StyleGAN based Landscape Image Synthesis using Depth-map
Gun-Hee Lee
Jonghwa Yim
Chanran Kim
Min-Jung Kim
GAN
MDE
269
2
0
13 May 2022
Deep Learning and Synthetic Media
Raphaël Millière
186
26
0
11 May 2022
Reduce Information Loss in Transformers for Pluralistic Image Inpainting
Computer Vision and Pattern Recognition (CVPR), 2022
Qiankun Liu
Zhentao Tan
Dongdong Chen
Qi Chu
Xiyang Dai
Yinpeng Chen
Lu Yuan
Nenghai Yu
ViT
165
86
0
10 May 2022
Spatio-Temporal Transformer for Dynamic Facial Expression Recognition in the Wild
Fuyan Ma
Bin Sun
Shutao Li
ViT
117
50
0
10 May 2022
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Xu Tan
Jiawei Chen
Haohe Liu
Jian Cong
Chen Zhang
...
Lei He
Frank Soong
Tao Qin
Sheng Zhao
Tie-Yan Liu
322
290
0
09 May 2022
Seeding Diversity into AI Art
International Conference on Computational Creativity (ICCC), 2022
Marvin Zammit
Antonios Liapis
Georgios N. Yannakakis
138
4
0
02 May 2022
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
Neural Information Processing Systems (NeurIPS), 2022
Ming Ding
Wendi Zheng
Wenyi Hong
Jie Tang
VLM
415
395
0
28 Apr 2022
Can deep learning match the efficiency of human visual long-term memory in storing object details?
Emin Orhan
VLM
OCL
227
0
0
27 Apr 2022
Conformer and Blind Noisy Students for Improved Image Quality Assessment
Marcos V. Conde
Maxime Burchi
Radu Timofte
DiffM
168
15
0
27 Apr 2022
An Overview of Recent Work in Media Forensics: Methods and Threats
Kratika Bhagtani
A. Yadav
Emily R. Bartusiak
Ziyue Xiang
Ruiting Shao
Sriram Baireddy
Edward J. Delp
AAML
286
28
0
26 Apr 2022
Semi-Parametric Neural Image Synthesis
A. Blattmann
Robin Rombach
Kaan Oktay
Jonas Muller
Björn Ommer
DiffM
300
32
0
25 Apr 2022
Opal: Multimodal Image Generation for News Illustration
ACM Symposium on User Interface Software and Technology (UIST), 2022
Vivian Liu
Han Qiao
Lydia B. Chilton
291
120
0
19 Apr 2022
CTCNet: A CNN-Transformer Cooperation Network for Face Image Super-Resolution
IEEE Transactions on Image Processing (IEEE TIP), 2022
Guangwei Gao
Zixiang Xu
Juncheng Li
Jian Yang
T. Zeng
Guo-Jun Qi
CVBM
ViT
SupR
294
119
0
19 Apr 2022
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
European Conference on Computer Vision (ECCV), 2022
Katherine Crowson
Stella Biderman
Daniel Kornis
Dashiell Stander
Eric Hallahan
Louis Castricato
Edward Raff
CLIP
482
440
0
18 Apr 2022
Imagination-Augmented Natural Language Understanding
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Yujie Lu
Wanrong Zhu
Xinze Wang
Miguel P. Eckstein
William Yang Wang
218
25
0
18 Apr 2022
Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion
Computer Vision and Pattern Recognition (CVPR), 2022
Evonne Ng
Hanbyul Joo
Liwen Hu
Hao Li
Trevor Darrell
Angjoo Kanazawa
Shiry Ginosar
VGen
205
128
0
18 Apr 2022
Saliency in Augmented Reality
ACM Multimedia (ACM MM), 2022
Huiyu Duan
Wei Shen
Xiongkuo Min
Danyang Tu
Jing Li
Guangtao Zhai
128
40
0
18 Apr 2022
Simultaneous Multiple-Prompt Guided Generation Using Differentiable Optimal Transport
International Conference on Computational Creativity (ICCC), 2022
Yingtao Tian
Marco Cuturi
David R Ha
DiffM
OT
128
1
0
18 Apr 2022
Unconditional Image-Text Pair Generation with Multimodal Cross Quantizer
British Machine Vision Conference (BMVC), 2022
Hyungyu Lee
Sungjin Park
Joonseok Lee
Edward Choi
211
4
0
15 Apr 2022
Guided Co-Modulated GAN for 360° Field of View Extrapolation
International Conference on 3D Vision (3DV), 2022
Mohammad Reza Karimi Dastjerdi
Yannick Hold-Geoffroy
Jonathan Eisenmann
Siavash Khodadadeh
Jean-François Lalonde
147
36
0
15 Apr 2022
Any-resolution Training for High-resolution Image Synthesis
European Conference on Computer Vision (ECCV), 2022
Lucy Chai
Michael Gharbi
Eli Shechtman
Phillip Isola
Richard Y. Zhang
235
83
0
14 Apr 2022
An Identity-Preserved Framework for Human Motion Transfer
IEEE Transactions on Information Forensics and Security (IEEE TIFS), 2022
Jingzhe Ma
Xiaoqing Zhang
Shiqi Yu
254
4
0
14 Apr 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
1.1K
8,304
0
13 Apr 2022
No Token Left Behind: Explainability-Aided Image Classification and Generation
European Conference on Computer Vision (ECCV), 2022
Roni Paiss
Hila Chefer
Lior Wolf
VLM
209
35
0
11 Apr 2022
ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation
Computer Vision and Pattern Recognition (CVPR), 2022
Jianan Wang
Guansong Lu
Hang Xu
Zhenguo Li
Chunjing Xu
Yanwei Fu
246
18
0
09 Apr 2022
Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
European Conference on Computer Vision (ECCV), 2022
Songwei Ge
Thomas Hayes
Harry Yang
Xiaoyue Yin
Guan Pang
David Jacobs
Jia-Bin Huang
Devi Parikh
ViT
531
271
0
07 Apr 2022
KNN-Diffusion: Image Generation via Large-Scale Retrieval
International Conference on Learning Representations (ICLR), 2022
Shelly Sheynin
Oron Ashual
Adam Polyak
Uriel Singer
Oran Gafni
Eliya Nachmani
Yaniv Taigman
VLM
SyDa
DiffM
238
147
0
06 Apr 2022
Text2LIVE: Text-Driven Layered Image and Video Editing
European Conference on Computer Vision (ECCV), 2022
Omer Bar-Tal
Dolev Ofri-Amar
Rafail Fridman
Yoni Kasten
Tali Dekel
VGen
DiffM
497
370
0
05 Apr 2022
DT2I: Dense Text-to-Image Generation from Region Descriptions
International Conference on Artificial Neural Networks (ICANN), 2022
Stanislav Frolov
Prateek Bansal
Jörn Hees
Andreas Dengel
VLM
165
5
0
05 Apr 2022
Autoregressive 3D Shape Generation via Canonical Mapping
European Conference on Computer Vision (ECCV), 2022
A. Cheng
Xueting Li
Sifei Liu
Min Sun
Ming-Hsuan Yang
3DPC
213
47
0
05 Apr 2022
High-Quality Pluralistic Image Completion via Code Shared VQGAN
Chuanxia Zheng
Guoxian Song
Tat-Jen Cham
Jianfei Cai
Dinh Q. Phung
Linjie Luo
VLM
192
12
0
05 Apr 2022
Quantized GAN for Complex Music Generation from Dance Videos
European Conference on Computer Vision (ECCV), 2022
Ye Zhu
Kyle Olszewski
Yuehua Wu
Panos Achlioptas
Menglei Chai
Yan Yan
Sergey Tulyakov
MGen
228
56
0
01 Apr 2022
Perception Prioritized Training of Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Jooyoung Choi
Jungbeom Lee
Chaehun Shin
Sungwon Kim
Hyunwoo J. Kim
Sungroh Yoon
DiffM
296
327
0
01 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Computer Vision and Pattern Recognition (CVPR), 2022
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
168
45
0
31 Mar 2022
VPTR: Efficient Transformers for Video Prediction
International Conference on Pattern Recognition (ICPR), 2022
Xi Ye
Guillaume-Alexandre Bilodeau
ViT
233
28
0
29 Mar 2022
mc-BEiT: Multi-choice Discretization for Image BERT Pre-training
European Conference on Computer Vision (ECCV), 2022
Xiaotong Li
Yixiao Ge
Kun Yi
Zixuan Hu
Ying Shan
Ling-yu Duan
330
44
0
29 Mar 2022
Diverse Plausible 360-Degree Image Outpainting for Efficient 3DCG Background Creation
Computer Vision and Pattern Recognition (CVPR), 2022
Naofumi Akimoto
Yuhi Matsuo
Y. Aoki
213
49
0
28 Mar 2022
Fusing Global and Local Features for Generalized AI-Synthesized Image Detection
International Conference on Image Processing (ICIP), 2022
Yan Ju
Shan Jia
Lipeng Ke
Hongfei Xue
Koki Nagano
Siwei Lyu
310
105
0
26 Mar 2022
Give Me Your Attention: Dot-Product Attention Considered Harmful for Adversarial Patch Robustness
Computer Vision and Pattern Recognition (CVPR), 2022
Giulio Lovisotto
Nicole Finnie
Mauricio Muñoz
Chaithanya Kumar Mummadi
J. H. Metzen
AAML
ViT
138
48
0
25 Mar 2022