Taming Transformers for High-Resolution Image Synthesis

Computer Vision and Pattern Recognition (CVPR), 2021
17 December 2020
Patrick Esser
Robin Rombach
Bjorn Ommer
    ViT
ArXiv (abs) | PDF | HTML | GitHub (6185★)

Papers citing "Taming Transformers for High-Resolution Image Synthesis"

50 / 2,402 papers shown
Visual Self-Refinement for Autoregressive Models
Jiamian Wang
Ziqi Zhou
Chaithanya Kumar Mummadi
S. Dianat
Majid Rabbani
Raghuveer Rao
Chen Qiu
Zhiqiang Tao
105
0
0
01 Oct 2025
BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration
Zhaoyang Li
Dongjun Qian
Kai Su
Qishuai Diao
Xiangyang Xia
Chang Liu
Wenfei Yang
Tianzhu Zhang
Zehuan Yuan
DiffM, VGen
136
2
0
01 Oct 2025
Purrception: Variational Flow Matching for Vector-Quantized Image Generation
Răzvan-Andrei Matişan
Vincent Tao Hu
Grigory Bartosh
Bjorn Ommer
Cees G. M. Snoek
Max Welling
Jan-Willem van de Meent
Mohammad Mahdi Derakhshani
Floor Eijkelboom
140
1
0
01 Oct 2025
Ultra-Efficient Decoding for End-to-End Neural Compression and Reconstruction
Ethan G Rogers
Cheng Wang
131
0
0
01 Oct 2025
PUREVQ-GAN: Defending Data Poisoning Attacks through Vector-Quantized Bottlenecks
Alexander Branch
Omead Brandon Pooladzandi
Radin Khosraviani
Sunay Bhat
Jeffrey Q. Jiang
Gregory Pottie
73
0
0
30 Sep 2025
Flow Autoencoders are Effective Protein Tokenizers
Rohit Dilip
Evan Zhang
Ayush Varshney
David Van Valen
DiffM
124
0
0
30 Sep 2025
EchoGen: Generating Visual Echoes in Any Scene via Feed-Forward Subject-Driven Auto-Regressive Model
Ruixiao Dong
Z. Wang
Keli Liu
Li Li
Ying Chen
Kai Li
Daowen Li
Houqiang Li
DiffM, VGen
142
0
0
30 Sep 2025
DiVeQ: Differentiable Vector Quantization Using the Reparameterization Trick
Mohammad Hassan Vali
Tom Bäckström
Arno Solin
MQ
141
0
0
30 Sep 2025
Go with Your Gut: Scaling Confidence for Autoregressive Image Generation
Harold Haodong Chen
Xianfeng Wu
Wen-Jie Shu
Rongjin Guo
Disen Lan
Harry Yang
Ying-Cong Chen
136
1
0
30 Sep 2025
Real-Aware Residual Model Merging for Deepfake Detection
Jinhee Park
Guisik Kim
Choongsang Cho
Junseok Kwon
MoMe
151
0
0
29 Sep 2025
ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation
Jiuhong Xiao
Roshan Nayak
Ning Zhang
Daniel Tortei
Giuseppe Loianno
DiffM
207
0
0
29 Sep 2025
Tumor Synthesis conditioned on Radiomics
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025
Jonghun Kim
Inye Na
Eun Sook Ko
Hyunjin Park
MedIm
214
2
0
29 Sep 2025
Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models
Jitai Hao
Hao Liu
Xinyan Xiao
Qiang Huang
Jun Yu
223
0
0
29 Sep 2025
Understanding Generative Recommendation with Semantic IDs from a Model-scaling View
Jingzhe Liu
Liam Collins
Shucheng Zhou
Tong Zhao
Neil Shah
Clark Mingxuan Ju
VLM, LRM
193
1
0
29 Sep 2025
Score-based Membership Inference on Diffusion Models
Mingxing Rao
Bowen Qu
Daniel Moyer
DiffM
130
1
0
29 Sep 2025
STAGE: Stable and Generalizable GRPO for Autoregressive Image Generation
Xiaoxiao Ma
Haibo Qiu
Guohui Zhang
Zhixiong Zeng
Siqi Yang
Lin Ma
Feng Zhao
122
4
0
29 Sep 2025
Scalable GANs with Transformers
Sangeek Hyun
MinKyu Lee
Jae-Pil Heo
111
1
0
29 Sep 2025
Hyperspherical Latents Improve Continuous-Token Autoregressive Generation
Guolin Ke
Hui Xue
137
3
0
29 Sep 2025
Environment-Aware Satellite Image Generation with Diffusion Models
Nikos Kostagiolas
Pantelis Georgiades
Yannis Panagakis
M. Nicolaou
105
0
0
29 Sep 2025
Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention
Kai Li
Kejun Gao
Xiaolin Hu
81
0
0
28 Sep 2025
HieraTok: Multi-Scale Visual Tokenizer Improves Image Reconstruction and Generation
Cong Chen
Ziyuan Huang
Cheng Zou
Huanyi Zheng
Kaixiang Ji
Jiajia Liu
Jingdong Chen
Hao Chen
Chunhua Shen
154
3
0
28 Sep 2025
Texture Vector-Quantization and Reconstruction Aware Prediction for Generative Super-Resolution
Qifan Li
Jiale Zou
Jinhua Zhang
Wei Long
Xingyu Zhou
Shuhang Gu
SupR, MQ
293
0
0
28 Sep 2025
Not All Tokens are Guided Equal: Improving Guidance in Visual Autoregressive Models
Ky Dan Nguyen
Hoang Lam Tran
Anh-Dung Dinh
Daochang Liu
Weidong Cai
Xiuying Wang
Chang Xu
218
0
0
28 Sep 2025
Entering the Era of Discrete Diffusion Models: A Benchmark for Schrödinger Bridges and Entropic Optimal Transport
Xavier Aramayo Carrasco
Grigoriy Ksenofontov
Aleksei Leonov
Iaroslav Koshelev
Alexander Korotin
OT
217
0
0
27 Sep 2025
Stochastic Interpolants via Conditional Dependent Coupling
Chenrui Ma
Xi Xiao
Tianyang Wang
Xiao Wang
Yanning Shen
DiffM
152
3
0
27 Sep 2025
ARSS: Taming Decoder-only Autoregressive Visual Generation for View Synthesis From Single View
Wenbin Teng
Gonglin Chen
Haiwei Chen
Yajie Zhao
DiffM, VGen
162
0
0
27 Sep 2025
Object-AVEdit: An Object-level Audio-Visual Editing Model
Y. Fu
Ruiyang Si
Hongfa Wang
Dongzhan Zhou
J. Sun
Ping Luo
Di Hu
Hongyuan Zhang
Xuelong Li
DiffM, VGen, KELM
191
6
0
27 Sep 2025
Group Critical-token Policy Optimization for Autoregressive Image Generation
Guohui Zhang
Hu Yu
Xiaoxiao Ma
Jinghao Zhang
Yaning Pan
Mingde Yao
Jie Xiao
Linjiang Huang
Feng Zhao
153
2
0
26 Sep 2025
AUV: Teaching Audio Universal Vector Quantization with Single Nested Codebook
Yihao Chen
Kai Hu
Long Zhou
Shulin Feng
Xusheng Yang
Hangting Chen
Xie Chen
162
2
0
26 Sep 2025
PANICL: Mitigating Over-Reliance on Single Prompt in Visual In-Context Learning
Jiahao Zhang
Bowen Wang
Hong Liu
Yuta Nakashima
Hajime Nagahara
MLLM, VLM
194
1
0
26 Sep 2025
Pushing Toward the Simplex Vertices: A Simple Remedy for Code Collapse in Smoothed Vector Quantization
Takashi Morita
MQ
178
0
0
26 Sep 2025
Soft-Di[M]O: Improving One-Step Discrete Image Generation with Soft Embeddings
Yuanzhi Zhu
Xi Wang
Stéphane Lathuilière
Vicky Kalogeiton
136
2
0
26 Sep 2025
The Unanticipated Asymmetry Between Perceptual Optimization and Assessment
Jiabei Zhang
Qi Wang
Siyu Wu
Du Chen
Tianhe Wu
AAML
146
0
0
25 Sep 2025
FORGE: Forming Semantic Identifiers for Generative Retrieval in Industrial Datasets
Kairui Fu
Tao Zhang
Shuwen Xiao
Ziyang Wang
X. Zhang
...
Xiangheng Kong
Shengyu Zhang
Kun Kuang
Yuning Jiang
Bo Zheng
201
1
0
25 Sep 2025
OmniBridge: Unified Multimodal Understanding, Generation, and Retrieval via Latent Space Alignment
Teng Xiao
Zuchao Li
Lefei Zhang
182
1
0
23 Sep 2025
One-shot Embroidery Customization via Contrastive LoRA Modulation
Jun Ma
Qian He
Gaofeng He
Huang Chen
Chen Liu
Xiaogang Jin
Huamin Wang
DiffM
194
0
0
23 Sep 2025
Adversarially-Refined VQ-GAN with Dense Motion Tokenization for Spatio-Temporal Heatmaps
Gabriel Maldonado
Narges Rashvand
Armin Danesh Pazho
Ghazal Alinezhad Noghre
Vinit Katariya
Hamed Tabkhi
132
0
0
23 Sep 2025
Codebook-Based Adaptive Feature Compression With Semantic Enhancement for Edge-Cloud Systems
Xinyu Wang
Zikun Zhou
Y. Li
Xin An
Hongpeng Wang
134
0
0
23 Sep 2025
DiSSECT: Structuring Transfer-Ready Medical Image Representations through Discrete Self-Supervision
Azad Singh
Deepak Mishra
132
0
0
23 Sep 2025
Learning Dexterous Manipulation with Quantized Hand State
Ying Feng
Hongjie Fang
Yinong He
Jingjing Chen
Chenxi Wang
Zihao He
Ruonan Liu
Cewu Lu
139
0
0
22 Sep 2025
VCE: Safe Autoregressive Image Generation via Visual Contrast Exploitation
Feng Han
Chao Gong
Zhipeng Wei
Yue Yu
Yu Jiang
DiffM
183
0
0
21 Sep 2025
Efficient Rectified Flow for Image Fusion
Zirui Wang
Jiayi Zhang
Tianwei Guan
Yuhan Zhou
Xingyuan Li
Minjing Dong
Jinyuan Liu
292
2
0
20 Sep 2025
AcT2I: Evaluating and Improving Action Depiction in Text-to-Image Models
Vatsal Malaviya
Agneet Chatterjee
Maitreya Patel
Yezhou Yang
Chitta Baral
102
0
0
19 Sep 2025
SAMPO: Scale-wise Autoregression with Motion PrOmpt for generative world models
Sen Wang
Jingyi Tian
Le Wang
Zhimin Liao
Jiayi Li
Huaiyi Dong
Kun Xia
Sanping Zhou
Wei Tang
Hua Gang
VGen, LRM
175
0
0
19 Sep 2025
Deep Learning Empowered Super-Resolution: A Comprehensive Survey and Future Prospects
Proceedings of the IEEE (Proc. IEEE), 2025
Le Zhang
Ao Li
Qibin Hou
Ce Zhu
Yonina C. Eldar
SupR
285
1
0
19 Sep 2025
Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation
Xiaoyu Yue
Zidong Wang
Yuqing Wang
Wenlong Zhang
Xihui Liu
Wanli Ouyang
Wenlong Zhang
Luping Zhou
GAN
255
2
0
18 Sep 2025
PRISM: Phase-enhanced Radial-based Image Signature Mapping framework for fingerprinting AI-generated images
Emanuele Ricco
Elia Onofri
Lorenzo Cima
S. Cresci
Roberto Di Pietro
108
0
0
18 Sep 2025
OpenViGA: Video Generation for Automotive Driving Scenes by Streamlining and Fine-Tuning Open Source Models with Public Data
Björn Möller
Zhengyang Li
Malte Stelzer
Thomas Graave
Fabian Bettels
Muaaz Ataya
Tim Fingscheidt
VGen
160
0
0
18 Sep 2025
AToken: A Unified Tokenizer for Vision
Jiasen Lu
Liangchen Song
Mingze Xu
Byeongjoo Ahn
Yanjun Wang
Chen Chen
Afshin Dehghan
Yinfei Yang
ViT
243
7
0
17 Sep 2025
Towards a Physics Foundation Model
Florian Wiesner
Matthias Wessling
Stephen Baek
AI4CE, PINN
220
3
0
17 Sep 2025