Taming Transformers for High-Resolution Image Synthesis

Computer Vision and Pattern Recognition (CVPR), 2020
17 December 2020
Patrick Esser
Robin Rombach
Bjorn Ommer
    ViT

Papers citing "Taming Transformers for High-Resolution Image Synthesis"

50 / 2,382 papers shown
Title
Contextformer: A Transformer with Spatio-Channel Attention for Context Modeling in Learned Image Compression
European Conference on Computer Vision (ECCV), 2022
A. B. Koyuncu
Han Gao
Atanas Boev
Georgii Gaikov
Elena Alshina
Eckehard Steinbach
ViT
195
78
0
04 Mar 2022
Detecting GAN-generated Images by Orthogonal Training of Multiple CNNs
IEEE International Conference on Image Processing (ICIP), 2022
S. Mandelli
Nicolo Bonettini
Paolo Bestagini
Stefano Tubaro
161
73
0
04 Mar 2022
Autoregressive Image Generation using Residual Quantization
Computer Vision and Pattern Recognition (CVPR), 2022
Doyup Lee
Chiheon Kim
Saehoon Kim
Minsu Cho
Wook-Shin Han
VGen
953
562
0
03 Mar 2022
Multi-Tailed Vision Transformer for Efficient Inference
Neural Networks (NN), 2022
Yunke Wang
Bo Du
Wenyuan Wang
Chang Xu
ViT
518
10
0
03 Mar 2022
Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding
Computer Vision and Pattern Recognition (CVPR), 2022
Qiaole Dong
Chenjie Cao
Yanwei Fu
CLL
310
179
0
02 Mar 2022
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP
Zihao Wang
Wei Liu
Qian He
Xin-ru Wu
Zili Yi
CLIP
VLM
505
90
0
01 Mar 2022
One-shot Ultra-high-Resolution Generative Adversarial Network That Synthesizes 16K Images On A Single GPU
Image and Vision Computing (IVC), 2022
Junseok Oh
Donghwee Yoon
Injung Kim
296
2
0
28 Feb 2022
Real-World Blind Super-Resolution via Feature Matching with Implicit High-Resolution Priors
ACM Multimedia (ACM MM), 2022
Chaofeng Chen
Xinyu Shi
Yipeng Qin
Xiaoming Li
Xiaoguang Han
Taojiannan Yang
Shihui Guo
273
155
0
26 Feb 2022
Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph
International Conference on Learning Representations (ICLR), 2022
Dacheng Yin
Xuanchi Ren
Chong Luo
Yuwang Wang
Zhiwei Xiong
Wenjun Zeng
247
13
0
24 Feb 2022
An Introduction to Neural Data Compression
Foundations and Trends in Computer Graphics and Vision (Found. Trends Comput. Graph. Vis.), 2022
Yibo Yang
Stephan Mandt
Lucas Theis
375
139
0
14 Feb 2022
NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN
Computer Vision and Pattern Recognition (CVPR), 2022
Minheng Ni
Chenfei Wu
Haoyang Huang
Daxin Jiang
W. Zuo
Nan Duan
136
27
0
10 Feb 2022
Diffusion bridges vector quantized Variational AutoEncoders
International Conference on Machine Learning (ICML), 2022
Max H. Cohen
Guillaume Quispe
Sylvain Le Corff
Charles Ollion
Eric Moulines
DiffM
238
16
0
10 Feb 2022
MaskGIT: Masked Generative Image Transformer
Computer Vision and Pattern Recognition (CVPR), 2022
Huiwen Chang
Han Zhang
Lu Jiang
Ce Liu
William T. Freeman
ViT
494
934
0
08 Feb 2022
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models
IEEE International Conference on Computer Vision (ICCV), 2022
Jaemin Cho
Abhaysinh Zala
Mohit Bansal
ViT
443
257
0
08 Feb 2022
Corrupted Image Modeling for Self-Supervised Visual Pre-Training
International Conference on Learning Representations (ICLR), 2022
Yuxin Fang
Li Dong
Hangbo Bao
Xinggang Wang
Furu Wei
279
92
0
07 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
International Conference on Machine Learning (ICML), 2022
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
446
993
0
07 Feb 2022
Context Autoencoder for Self-Supervised Representation Learning
International Journal of Computer Vision (IJCV), 2022
Xiaokang Chen
Mingyu Ding
Xiaodi Wang
Ying Xin
Shentong Mo
Yunhao Wang
Shumin Han
Ping Luo
Gang Zeng
Jingdong Wang
SSL
390
445
0
07 Feb 2022
ShapeFormer: Transformer-based Shape Completion via Sparse Representation
Computer Vision and Pattern Recognition (CVPR), 2022
Xingguang Yan
Liqiang Lin
Niloy J. Mitra
Dani Lischinski
Daniel Cohen-Or
Hui Huang
ViT
362
145
0
25 Jan 2022
CM3: A Causal Masked Multimodal Model of the Internet
Armen Aghajanyan
Po-Yao (Bernie) Huang
Candace Ross
Vladimir Karpukhin
Hu Xu
...
Dmytro Okhonko
Mandar Joshi
Gargi Ghosh
M. Lewis
Luke Zettlemoyer
331
169
0
19 Jan 2022
RestoreFormer: High-Quality Blind Face Restoration from Undegraded Key-Value Pairs
Computer Vision and Pattern Recognition (CVPR), 2022
Zhouxia Wang
Jiawei Zhang
Runjian Chen
Wenping Wang
Ping Luo
CVBM
268
138
0
17 Jan 2022
Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks?
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Hwanil Choi
Wonjoon Chang
Jaesik Choi
GAN
201
4
0
17 Jan 2022
BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations
Computer Vision and Pattern Recognition (CVPR), 2022
Daiqing Li
Huan Ling
Seung Wook Kim
Karsten Kreis
Adela Barriuso
Sanja Fidler
Antonio Torralba
277
128
0
12 Jan 2022
Music2Video: Automatic Generation of Music Video with fusion of audio and text
Yoonjeon Kim
Joel Jang
Sumin Shin
DiffM
VGen
176
7
0
11 Jan 2022
Lawin Transformer: Improving Semantic Segmentation Transformer with Multi-Scale Representations via Large Window Attention
Haotian Yan
Chuang Zhang
Ming Wu
ViT
304
75
0
05 Jan 2022
DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents
Kushagra Pandey
Avideep Mukherjee
Piyush Rai
Abhishek Kumar
DiffM
475
138
0
02 Jan 2022
ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation
Han Zhang
Weichong Yin
Yewei Fang
Lanxin Li
Boqiang Duan
Zhihua Wu
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
173
67
0
31 Dec 2021
Learning Spatially-Adaptive Squeeze-Excitation Networks for Image Synthesis and Image Recognition
Jianghao Shen
Tianfu Wu
ViT
203
0
0
29 Dec 2021
Multimodal Image Synthesis and Editing: The Generative AI Era
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
546
77
0
27 Dec 2021
StyleSwin: Transformer-based GAN for High-resolution Image Generation
Computer Vision and Pattern Recognition (CVPR), 2021
Bowen Zhang
Shuyang Gu
Bo Zhang
Jianmin Bao
Dong Chen
Fang Wen
Yong Wang
B. Guo
ViT
394
288
0
20 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2021
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
DiffM
1.4K
20,520
0
20 Dec 2021
Solving Inverse Problems with NerfGANs
Giannis Daras
Wenqing Chu
Abhishek Kumar
Dmitry Lagun
A. Dimakis
3DV
158
6
0
16 Dec 2021
Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
Zhisheng Xiao
Karsten Kreis
Arash Vahdat
DiffM
343
660
0
15 Dec 2021
MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning
C. Eichenberg
Sid Black
Samuel Weinbach
Letitia Parcalabescu
Anette Frank
MLLM
VLM
227
108
0
09 Dec 2021
Multimodal Conditional Image Synthesis with Product-of-Experts GANs
Xun Huang
Arun Mallya
Ting-Chun Wang
Ming-Yu Liu
DiffM
228
102
0
09 Dec 2021
Text2Mesh: Text-Driven Neural Stylization for Meshes
Computer Vision and Pattern Recognition (CVPR), 2021
O. Michel
Roi Bar-On
Richard Liu
Sagie Benaim
Rana Hanocka
CLIP
AI4CE
536
413
0
06 Dec 2021
Global Context with Discrete Diffusion in Vector Quantised Modelling for Image Generation
Minghui Hu
Yujie Wang
Tat-Jen Cham
Jianfei Yang
P. N. Suganthan
DiffM
131
52
0
03 Dec 2021
Zero-Shot Text-Guided Object Generation with Dream Fields
Ajay Jain
B. Mildenhall
Jonathan T. Barron
Pieter Abbeel
Ben Poole
309
627
0
02 Dec 2021
Exploration into Translation-Equivariant Image Quantization
W. Shin
Gyubok Lee
Jiyoung Lee
Eun-Young Lyou
Joonseok Lee
Edward Choi
193
8
0
01 Dec 2021
CLIPstyler: Image Style Transfer with a Single Text Condition
Gihyun Kwon
Jong Chul Ye
VLM
CLIP
410
314
0
01 Dec 2021
Diffusion Autoencoders: Toward a Meaningful and Decodable Representation
Konpat Preechakul
Nattanat Chatthee
Suttisak Wizadwongsa
Supasorn Suwajanakorn
SyDa
DiffM
328
523
0
30 Nov 2021
EdiBERT, a generative model for image editing
Thibaut Issenhuth
Ugo Tanielian
Jérémie Mary
David Picard
DiffM
275
13
0
30 Nov 2021
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Shuyang Gu
Dong Chen
Jianmin Bao
Fang Wen
Bo Zhang
Dongdong Chen
Lu Yuan
B. Guo
DiffM
366
930
0
29 Nov 2021
Blended Diffusion for Text-driven Editing of Natural Images
Omri Avrahami
Dani Lischinski
Ohad Fried
DiffM
462
1,119
0
29 Nov 2021
SWAT: Spatial Structure Within and Among Tokens
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Kumara Kahatapitiya
Michael S. Ryoo
216
7
0
26 Nov 2021
Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations
Computer Vision and Pattern Recognition (CVPR), 2021
Mehdi S. M. Sajjadi
H. Meyer
Etienne Pot
Urs M. Bergmann
Klaus Greff
...
Daniel Duckworth
Alexey Dosovitskiy
Jakob Uszkoreit
Thomas Funkhouser
Andrea Tagliasacchi
ViT
349
227
0
25 Nov 2021
Layered Controllable Video Generation
Jiahui Huang
Yuhe Jin
K. M. Yi
Leonid Sigal
VGen
310
12
0
24 Nov 2021
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
Baining Guo
ViT
316
271
0
24 Nov 2021
Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes
Sam Bond-Taylor
P. Hessey
Hiroshi Sasaki
T. Breckon
Chris G. Willcocks
DiffM
216
85
0
24 Nov 2021
Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
Moritz Ibing
Gregor Kobsik
Leif Kobbelt
191
42
0
24 Nov 2021
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Chenfei Wu
Jian Liang
Lei Ji
Fan Yang
Yuejian Fang
Daxin Jiang
Nan Duan
ViT
VGen
251
340
0
24 Nov 2021