Taming Transformers for High-Resolution Image Synthesis

17 December 2020
Patrick Esser
Robin Rombach
Björn Ommer
    ViT

Papers citing "Taming Transformers for High-Resolution Image Synthesis"

50 / 476 papers shown
Title
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation
Ethan Chern
Jiadi Su
Yan Ma
Pengfei Liu
MLLM
29
26
0
08 Jul 2024
PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models
Jinhua Zhang
Hualian Sheng
Sijia Cai
Bing Deng
Qiao Liang
Wen Li
Ying Fu
Jieping Ye
Shuhang Gu
DiffM
32
2
0
08 Jul 2024
Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation
Xiang Gao
Zhengbo Xu
Junhan Zhao
Jiaying Liu
DiffM
27
8
0
03 Jul 2024
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis
Dewei Zhou
Y. Li
Fan Ma
Zongxin Yang
Y. Yang
91
11
0
02 Jul 2024
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
Yicheng Chen
Xiangtai Li
Yining Li
Yanhong Zeng
Jianzong Wu
Xiangyu Zhao
Kai Chen
VLM
DiffM
56
3
0
28 Jun 2024
Autoregressive Image Generation without Vector Quantization
Tianhong Li
Yonglong Tian
He Li
Mingyang Deng
Kaiming He
DiffM
43
171
0
17 Jun 2024
An Image is Worth 32 Tokens for Reconstruction and Generation
Qihang Yu
Mark Weber
XueQing Deng
Xiaohui Shen
Daniel Cremers
Liang-Chieh Chen
VLM
ViT
44
79
0
11 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
60
220
0
10 Jun 2024
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Shengqiong Wu
Hao Fei
Xiangtai Li
Jiayi Ji
Hanwang Zhang
Tat-Seng Chua
Shuicheng Yan
MLLM
59
31
0
07 Jun 2024
Bayesian Power Steering: An Effective Approach for Domain Adaptation of Diffusion Models
Ding Huang
Ting Li
Jian Huang
DiffM
31
1
0
06 Jun 2024
Learning Image Priors through Patch-based Diffusion Models for Solving Inverse Problems
Jason Hu
Bowen Song
Xiaojian Xu
Liyue Shen
Jeffrey A. Fessler
MedIm
DiffM
31
8
0
04 Jun 2024
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
Kengo Uchida
Takashi Shibuya
Yuhta Takida
Naoki Murata
Shusuke Takahashi
Yuki Mitsufuji
VGen
44
4
0
04 Jun 2024
AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark
Li Lin
Santosh
Xin Eric Wang
Shu Hu
EGVM
81
10
0
02 Jun 2024
Learning Discrete Concepts in Latent Hierarchical Models
Lingjing Kong
Guan-Hong Chen
Biwei Huang
Eric P. Xing
Yuejie Chi
Kun Zhang
44
4
0
01 Jun 2024
Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion
Jiangkai Wu
Liming Liu
Yunpeng Tan
Junlin Hao
Xinggong Zhang
30
2
0
30 May 2024
Wavelet-Based Image Tokenizer for Vision Transformers
Zhenhai Zhu
Radu Soricut
ViT
35
3
0
28 May 2024
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao
Jiazhi Yang
Li Chen
Kashyap Chitta
Yihang Qiu
Andreas Geiger
Jun Zhang
Hongyang Li
60
75
0
27 May 2024
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
Harshit Varma
Dheeraj M. Nagaraj
Karthikeyan Shanmugam
VLM
62
2
0
27 May 2024
Diffusion Bridge AutoEncoders for Unsupervised Representation Learning
Yeongmin Kim
Kwanghyeon Lee
Minsang Park
Byeonghu Na
Il-Chul Moon
DiffM
42
2
0
27 May 2024
iVideoGPT: Interactive VideoGPTs are Scalable World Models
Jialong Wu
Shaofeng Yin
Ningya Feng
Xu He
Dong Li
Jianye Hao
Mingsheng Long
VGen
35
23
0
24 May 2024
LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models
Seyedmorteza Sadat
Jakob Buhmann
Derek Bradley
Otmar Hilliges
Romann M. Weber
39
9
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
67
41
0
23 May 2024
A Survey of Deep Learning-based Radiology Report Generation Using Multimodal Data
Xinyi Wang
Grazziela Figueredo
Ruizhe Li
W. Zhang
Weitong Chen
Xin Chen
MedIm
ViT
41
2
0
21 May 2024
Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
Shengyuan Liu
Bo Wang
Ye Ma
Te Yang
Xipeng Cao
Quan Chen
Han Li
Di Dong
Peng Jiang
EGVM
36
2
0
11 May 2024
Preble: Efficient Distributed Prompt Scheduling for LLM Serving
Vikranth Srivatsa
Zijian He
Reyna Abhyankar
Dongming Li
Yiying Zhang
40
17
0
08 May 2024
SATO: Stable Text-to-Motion Framework
Wenshuo Chen
Hongru Xiao
Erhang Zhang
Lijie Hu
Lei Wang
Mengyuan Liu
C. L. P. Chen
32
4
0
02 May 2024
Towards Real-world Video Face Restoration: A New Benchmark
Ziyan Chen
Jingwen He
Xinqi Lin
Yu Qiao
Chao Dong
40
4
0
30 Apr 2024
TextGaze: Gaze-Controllable Face Generation with Natural Language
Hengfei Wang
Zhongqun Zhang
Yihua Cheng
Hyung Jin Chang
DiffM
33
2
0
26 Apr 2024
CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models
Qinghe Wang
Baolu Li
Xiaomin Li
Bing Cao
Liqian Ma
Huchuan Lu
Xu Jia
DiffM
37
6
0
24 Apr 2024
X-Ray: A Sequential 3D Representation For Generation
Tao Hu
Wenhang Ge
Yuyang Zhao
Gim Hee Lee
MedIm
24
4
0
22 Apr 2024
LASER: Tuning-Free LLM-Driven Attention Control for Efficient Text-conditioned Image-to-Animation
Haoyu Zheng
Wenqiao Zhang
Yaoke Wang
Hao Zhou
Jiang Liu
Juncheng Li
Zheqi Lv
Siliang Tang
Yueting Zhuang
32
1
0
21 Apr 2024
MaSkel: A Model for Human Whole-body X-rays Generation from Human Masking Images
Yingjie Xi
Boyuan Cheng
Jingyao Cai
Jian Jun Zhang
Xiaosong Yang
MedIm
33
0
0
13 Apr 2024
E3: Ensemble of Expert Embedders for Adapting Synthetic Image Detectors to New Generators Using Limited Data
Aref Azizpour
Tai D. Nguyen
Manil Shrestha
Kaidi Xu
Edward Kim
Matthew C. Stamm
29
4
0
12 Apr 2024
Taming Stable Diffusion for Text to 360° Panorama Image Generation
Cheng Zhang
Qianyi Wu
Camilo Cruz Gambardella
Xiaoshui Huang
Dinh Q. Phung
Wanli Ouyang
Jianfei Cai
MDE
19
8
0
11 Apr 2024
GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models
Zewei Zhang
Huan Liu
Jun Chen
Xiangyu Xu
DiffM
24
8
0
10 Apr 2024
Contextual Chart Generation for Cyber Deception
David D. Nguyen
David Liebowitz
Surya Nepal
S. Kanhere
Sharif Abuadbba
41
0
0
07 Apr 2024
Monocular Identity-Conditioned Facial Reflectance Reconstruction
Xingyu Ren
Jiankang Deng
Yuhao Cheng
Jia Guo
Chao Ma
Yichao Yan
Wenhan Zhu
Xiaokang Yang
3DH
33
3
0
30 Mar 2024
FOOL: Addressing the Downlink Bottleneck in Satellite Computing with Neural Feature Compression
Alireza Furutanpey
Qiyang Zhang
Philipp Raith
Tobias Pfandzelter
Shangguang Wang
Schahram Dustdar
88
4
0
25 Mar 2024
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Roberto Henschel
Levon Khachatryan
Daniil Hayrapetyan
Hayk Poghosyan
Vahram Tadevosyan
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
DiffM
VGen
91
77
0
21 Mar 2024
Generative Enhancement for 3D Medical Images
Lingting Zhu
Noel Codella
Dongdong Chen
Zhenchao Jin
Lu Yuan
Lequan Yu
DiffM
MedIm
32
10
0
19 Mar 2024
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
Zixin Zhu
Xuelu Feng
Dongdong Chen
Junsong Yuan
Chunming Qiao
Gang Hua
DiffM
29
7
0
18 Mar 2024
HyperVQ: MLR-based Vector Quantization in Hyperbolic Space
Nabarun Goswami
Yusuke Mukuta
Tatsuya Harada
35
3
0
18 Mar 2024
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers
Haoyang Liu
Aditya Singh
Yijiang Li
Haohan Wang
AAML
ViT
31
1
0
15 Mar 2024
UniCode: Learning a Unified Codebook for Multimodal Large Language Models
Sipeng Zheng
Bohan Zhou
Yicheng Feng
Ye Wang
Zongqing Lu
VLM
MLLM
31
7
0
14 Mar 2024
PFStorer: Personalized Face Restoration and Super-Resolution
Tuomas Varanka
Tapani Toivonen
Soumya Tripathy
Guoying Zhao
Erman Acar
DiffM
34
2
0
13 Mar 2024
Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings
Sahand Sharifzadeh
Christos Kaplanis
Shreya Pathak
D. Kumaran
Anastasija Ilić
Jovana Mitrović
Charles Blundell
Andrea Banino
VLM
32
9
0
12 Mar 2024
Scene Depth Estimation from Traditional Oriental Landscape Paintings
Sungho Kang
Yeonghyeon Park
H. Park
Juneho Yi
30
0
0
06 Mar 2024
Context-aware Talking Face Video Generation
Meidai Xuanyuan
Yuwang Wang
Honglei Guo
Qionghai Dai
DiffM
27
0
0
28 Feb 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
56
17
0
28 Feb 2024
Diffusion Model-Based Image Editing: A Survey
Yi Huang
Jiancheng Huang
Yifan Liu
Mingfu Yan
Jiaxi Lv
Jianzhuang Liu
Wei Xiong
He Zhang
Liangliang Cao
EGVM
66
84
0
27 Feb 2024