Taming Transformers for High-Resolution Image Synthesis

Computer Vision and Pattern Recognition (CVPR), 2021

17 December 2020

Papers citing "Taming Transformers for High-Resolution Image Synthesis"

50 / 2,402 papers shown

Visual Self-Refinement for Autoregressive Models

Jiamian Wang

Ziqi Zhou

Chaithanya Kumar Mummadi

105

01 Oct 2025

BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration

136

01 Oct 2025

Purrception: Variational Flow Matching for Vector-Quantized Image Generation

Răzvan-Andrei Matişan

Jan-Willem van de Meent

Mohammad Mahdi Derakhshani

Floor Eijkelboom

140

01 Oct 2025

Ultra-Efficient Decoding for End-to-End Neural Compression and Reconstruction

Ethan G Rogers

Cheng Wang

131

01 Oct 2025

PUREVQ-GAN: Defending Data Poisoning Attacks through Vector-Quantized Bottlenecks

Alexander Branch

Omead Brandon Pooladzandi

30 Sep 2025

Flow Autoencoders are Effective Protein Tokenizers

124

30 Sep 2025

EchoGen: Generating Visual Echoes in Any Scene via Feed-Forward Subject-Driven Auto-Regressive Model

142

30 Sep 2025

DiVeQ: Differentiable Vector Quantization Using the Reparameterization Trick

Mohammad Hassan Vali

Tom Bäckström

Arno Solin

141

30 Sep 2025

Go with Your Gut: Scaling Confidence for Autoregressive Image Generation

136

30 Sep 2025

Real-Aware Residual Model Merging for Deepfake Detection

151

29 Sep 2025

ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation

207

29 Sep 2025

Tumor Synthesis conditioned on Radiomics

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025

214

29 Sep 2025

Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models

223

29 Sep 2025

Understanding Generative Recommendation with Semantic IDs from a Model-scaling View

193

29 Sep 2025

Score-based Membership Inference on Diffusion Models

130

29 Sep 2025

STAGE: Stable and Generalizable GRPO for Autoregressive Image Generation

122

29 Sep 2025

Scalable GANs with Transformers

Sangeek Hyun

MinKyu Lee

Jae-Pil Heo

111

29 Sep 2025

Hyperspherical Latents Improve Continuous-Token Autoregressive Generation

Guolin Ke

Hui Xue

137

29 Sep 2025

Environment-Aware Satellite Image Generation with Diffusion Models

105

29 Sep 2025

Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention

Kai Li

Kejun Gao

Xiaolin Hu

28 Sep 2025

HieraTok: Multi-Scale Visual Tokenizer Improves Image Reconstruction and Generation

154

28 Sep 2025

Texture Vector-Quantization and Reconstruction Aware Prediction for Generative Super-Resolution

293

28 Sep 2025

Not All Tokens are Guided Equal: Improving Guidance in Visual Autoregressive Models

218

28 Sep 2025

Entering the Era of Discrete Diffusion Models: A Benchmark for Schrödinger Bridges and Entropic Optimal Transport

Xavier Aramayo Carrasco

217

27 Sep 2025

Stochastic Interpolants via Conditional Dependent Coupling

152

27 Sep 2025

ARSS: Taming Decoder-only Autoregressive Visual Generation for View Synthesis From Single View

162

27 Sep 2025

Object-AVEdit: An Object-level Audio-Visual Editing Model

191

27 Sep 2025

Group Critical-token Policy Optimization for Autoregressive Image Generation

153

26 Sep 2025

AUV: Teaching Audio Universal Vector Quantization with Single Nested Codebook

162

26 Sep 2025

PANICL: Mitigating Over-Reliance on Single Prompt in Visual In-Context Learning

194

26 Sep 2025

Pushing Toward the Simplex Vertices: A Simple Remedy for Code Collapse in Smoothed Vector Quantization

Takashi Morita

178

26 Sep 2025

Soft-Di[M]O: Improving One-Step Discrete Image Generation with Soft Embeddings

136

26 Sep 2025

The Unanticipated Asymmetry Between Perceptual Optimization and Assessment

146

25 Sep 2025

FORGE: Forming Semantic Identifiers for Generative Retrieval in Industrial Datasets

...

201

25 Sep 2025

OmniBridge: Unified Multimodal Understanding, Generation, and Retrieval via Latent Space Alignment

Teng Xiao

Zuchao Li

Lefei Zhang

182

23 Sep 2025

One-shot Embroidery Customization via Contrastive LoRA Modulation

194

23 Sep 2025

Adversarially-Refined VQ-GAN with Dense Motion Tokenization for Spatio-Temporal Heatmaps

Gabriel Maldonado

Narges Rashvand

Armin Danesh Pazho

Ghazal Alinezhad Noghre

Vinit Katariya

Hamed Tabkhi

132

23 Sep 2025

Codebook-Based Adaptive Feature Compression With Semantic Enhancement for Edge-Cloud Systems

134

23 Sep 2025

DiSSECT: Structuring Transfer-Ready Medical Image Representations through Discrete Self-Supervision

Azad Singh

Deepak Mishra

132

23 Sep 2025

Learning Dexterous Manipulation with Quantized Hand State

139

22 Sep 2025

VCE: Safe Autoregressive Image Generation via Visual Contrast Exploitation

183

21 Sep 2025

Efficient Rectified Flow for Image Fusion

292

20 Sep 2025

AcT2I: Evaluating and Improving Action Depiction in Text-to-Image Models

102

19 Sep 2025

SAMPO: Scale-wise Autoregression with Motion PrOmpt for generative world models

175

19 Sep 2025

Deep Learning Empowered Super-Resolution: A Comprehensive Survey and Future Prospects

Proceedings of the IEEE (Proc. IEEE), 2025

285

19 Sep 2025

Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation

255

18 Sep 2025

PRISM: Phase-enhanced Radial-based Image Signature Mapping framework for fingerprinting AI-generated images

108

18 Sep 2025

OpenViGA: Video Generation for Automotive Driving Scenes by Streamlining and Fine-Tuning Open Source Models with Public Data

160

18 Sep 2025

AToken: A Unified Tokenizer for Vision

243

17 Sep 2025

Towards a Physics Foundation Model

220

17 Sep 2025