2012.09841
Cited By
Taming Transformers for High-Resolution Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2020
17 December 2020
Patrick Esser
Robin Rombach
Bjorn Ommer
ViT
Github (6185★)
Papers citing "Taming Transformers for High-Resolution Image Synthesis"
50 / 2,395 papers shown
Point Cloud Quantization through Multimodal Prompting for 3D Understanding
Hongxuan Li
Wencheng Zhu
Huiying Xu
Xinzhong Zhu
Q. Hu
MQ
3DPC
405
0
0
15 Nov 2025
MixAR: Mixture Autoregressive Image Generation
Jinyuan Hu
Jiayou Zhang
Shaobo Cui
Kun Zhang
Guangyi Chen
DiffM
132
0
0
15 Nov 2025
FlowCast: Advancing Precipitation Nowcasting with Conditional Flow Matching
Bernardo Perrone Ribeiro
Jana Faganeli Pucer
190
0
0
12 Nov 2025
Retrospective motion correction in MRI using disentangled embeddings
Qi Wang
Veronika Ecker
Marcel Früh
S. Gatidis
Thomas Kustner
84
0
0
11 Nov 2025
PADM: A Physics-aware Diffusion Model for Attenuation Correction
T. Pham
Hoang Minh Vu
Anh Duc Chu
D. Nguyen
Trung Thanh Nguyen
Thao Nguyen Truong
Mai Hong Son
T. Nguyen
Phi Le Nguyen
MedIm
130
0
0
10 Nov 2025
MRT: Learning Compact Representations with Mixed RWKV-Transformer for Extreme Image Compression
Han Liu
Hengyu Man
Xingtao Wang
Wenrui Li
Debin Zhao
ViT
113
0
0
10 Nov 2025
VAEVQ: Enhancing Discrete Visual Tokenization through Variational Modeling
Sicheng Yang
Xing Hu
Qiang Wu
Dawei Yang
169
0
0
10 Nov 2025
MALeR: Improving Compositional Fidelity in Layout-Guided Generation
Shivank Saxena
D. Srivastava
Makarand Tapaswi
DiffM
126
0
0
08 Nov 2025
PhysCorr: Dual-Reward DPO for Physics-Constrained Text-to-Video Generation with Automated Preference Selection
Peiyao Wang
Weining Wang
Qi Li
EGVM
VGen
375
1
0
06 Nov 2025
CPO: Condition Preference Optimization for Controllable Image Generation
Zonglin Lyu
Ming Li
Xinxin Liu
Chen Chen
196
0
0
06 Nov 2025
Effective Test-Time Scaling of Discrete Diffusion through Iterative Refinement
Sanghyun Lee
Sunwoo Kim
Seungryong Kim
Jongho Park
D. Park
72
1
0
04 Nov 2025
DiffSwap++: 3D Latent-Controlled Diffusion for Identity-Preserving Face Swapping
Weston Bondurant
Arkaprava Sinha
Hieu M. Le
Srijan Das
Stephanie Schuckers
DiffM
145
0
0
04 Nov 2025
MoSa: Motion Generation with Scalable Autoregressive Modeling
Mengyuan Liu
Sheng Yan
Y. Wang
Yingjie Li
Gui-Bin Bian
Hong Liu
166
1
0
03 Nov 2025
NSYNC: Negative Synthetic Image Generation for Contrastive Training to Improve Stylized Text-To-Image Translation
Serkan Ozturk
Samet Hicsonmez
Pinar Duygulu
DiffM
333
0
0
03 Nov 2025
EVLP: Learning Unified Embodied Vision-Language Planner with Reinforced Supervised Fine-Tuning
Xinyan Cai
Shiguang Wu
Dafeng Chi
Yuzheng Zhuang
Xingyue Quan
Jianye Hao
Qiang Guan
81
0
0
03 Nov 2025
Wave-Particle (Continuous-Discrete) Dualistic Visual Tokenization for Unified Understanding and Generation
Yizhu Chen
Chen Ju
Z. Wang
Shuai Xiao
X. Chen
Jinsong Lan
Xiaoyong Zhu
Ying Chen
115
0
0
03 Nov 2025
Continuous Autoregressive Language Models
Chenze Shao
Darren Li
Fandong Meng
Jie Zhou
KELM
286
0
0
31 Oct 2025
MoRE: 3D Visual Geometry Reconstruction Meets Mixture-of-Experts
Jingnan Gao
Zhe Wang
X. Fang
X. Ren
Z. Chen
Shengqi Liu
Y. Cheng
Jiangjing Lyu
Xiaokang Yang
Y. Yan
212
1
0
31 Oct 2025
InertialAR: Autoregressive 3D Molecule Generation with Inertial Frames
Haorui Li
Weitao Du
Yuqiang Li
Hongyu Guo
Shengchao Liu
140
1
0
31 Oct 2025
Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail
Nvidia
Yan Wang
W. Luo
Junjie Bai
Yulong Cao
...
Yurong You
Xiaohui Zeng
Wenyuan Zhang
Boris Ivanovic
Marco Pavone
LRM
132
9
0
30 Oct 2025
BI-DCGAN: A Theoretically Grounded Bayesian Framework for Efficient and Diverse GANs
Mahsa Valizadeh
Rui Tuo
James Caverlee
96
0
0
30 Oct 2025
Emu3.5: Native Multimodal Models are World Learners
Yufeng Cui
Honghao Chen
Haoge Deng
X. Y. Huang
Xinghang Li
...
Zhuo Chen
Yulong Ao
Tiejun Huang
Zhongyuan Wang
Xinlong Wang
MLLM
VGen
420
13
0
30 Oct 2025
Decoupled MeanFlow: Turning Flow Models into Flow Maps for Accelerated Sampling
Kyungmin Lee
Sihyun Yu
Jinwoo Shin
AI4CE
230
3
0
28 Oct 2025
Uniform Discrete Diffusion with Metric Path for Video Generation
Haoge Deng
Ting Pan
Fan Zhang
Y. Liu
Zhuoyan Luo
...
Wenxuan Wang
Chunhua Shen
Shiguang Shan
Zhaoxiang Zhang
Xinlong Wang
VGen
146
2
0
28 Oct 2025
Nested AutoRegressive Models
Hongyu Wu
Xuhui Fan
Zhangkai Wu
Longbing Cao
DiffM
103
0
0
27 Oct 2025
Switchable Token-Specific Codebook Quantization For Face Image Compression
Y. Wang
H. Wang
Guodong Mu
Ruixin Zhang
Jiaqi Chen
Jingyun Zhang
Jun Wang
Yuan Xie
Zhizhong Zhang
Shouhong Ding
CVBM
342
0
0
27 Oct 2025
Autoregressive Styled Text Image Generation, but Make it Reliable
Carmine Zaccagnino
Fabio Quattrini
Vittorio Pippi
S. Cascianelli
Alessio Tonioni
Rita Cucchiara
126
0
0
27 Oct 2025
Quantizing Space and Time: Fusing Time Series and Images for Earth Observation
Gianfranco Basile
Johannes Jakubik
Benedikt Blumenstiel
Thomas Brunschwiler
Juan Bernabé-Moreno
199
0
0
27 Oct 2025
Learning Linearity in Audio Consistency Autoencoders via Implicit Regularization
Bernardo Torres
Manuel Moussallam
Gabriel Meseguer-Brocal
193
0
0
27 Oct 2025
Moving Beyond Diffusion: Hierarchy-to-Hierarchy Autoregression for fMRI-to-Image Reconstruction
Xu Zhang
Ruijie Quan
Wenguan Wang
Yi Yang
DiffM
92
0
0
25 Oct 2025
Improved Training Technique for Shortcut Models
Anh-Tien Nguyen
Viet-Anh Nguyen
D. Vu
T. Dao
Chi Tran
Toan M. Tran
Anh Tran
BDL
215
1
0
24 Oct 2025
Morphologically Intelligent Perturbation Prediction with FORM
Reed Naidoo
Matt De Vries
Olga Fourkioti
Vicky Bousgouni
Mar Arias-Garcia
Maria Portillo-Malumbres
Chris Bakal
64
0
0
24 Oct 2025
Pctx: Tokenizing Personalized Context for Generative Recommendation
Qiyong Zhong
Jiajie Su
Yunshan Ma
Julian McAuley
Yupeng Hou
104
0
0
24 Oct 2025
Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation
Enshu Liu
Qian Chen
Xuefei Ning
Shengen Yan
Guohao Dai
Zinan Lin
Yu Wang
DiffM
VLM
138
1
0
23 Oct 2025
GenColorBench: A Color Evaluation Benchmark for Text-to-Image Generation Models
Muhammad Atif Butt
Alexandra Gomez-Villa
Tao Wu
Javier Vázquez-Corral
Joost van de Weijer
Kai Wang
EGVM
VLM
172
0
0
23 Oct 2025
Better Tokens for Better 3D: Advancing Vision-Language Modeling in 3D Medical Imaging
Ibrahim Ethem Hamamci
Sezgin Er
Antonio Terpin
Hadrien Reynaud
Dong Yang
Pengfei Guo
Marc Edgar
Daguang Xu
Bernhard Kainz
Bjoern Menze
MedIm
118
0
0
23 Oct 2025
ARGenSeg: Image Segmentation with Autoregressive Image Generation Model
Xiaolong Wang
Lixiang Ru
Ziyuan Huang
Kaixiang Ji
Dandan Zheng
Jingdong Chen
Jun Zhou
VLM
85
0
0
23 Oct 2025
From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction
Zhida Zhao
Talas Fu
Yifan Wang
Lijun Wang
Huchuan Lu
VGen
186
1
0
22 Oct 2025
GPTFace: Generative Pre-training of Facial-Linguistic Transformer by Span Masking and Weakly Correlated Text-image Data
Yudong Li
Hao Li
Xianxu Hou
Linlin Shen
112
0
0
21 Oct 2025
SSD: Spatial-Semantic Head Decoupling for Efficient Autoregressive Image Generation
Siyong Jian
Huan Wang
DiffM
131
0
0
21 Oct 2025
Conveying Meaning through Gestures: An Investigation into Semantic Co-Speech Gesture Generation
Hendric Voss
Lisa Michelle Bohnenkamp
Stefan Kopp
SLR
168
0
0
20 Oct 2025
Generation then Reconstruction: Accelerating Masked Autoregressive Models via Two-Stage Sampling
Feihong Yan
P. Wang
Yao Zhu
Kaiyu Pang
Qingyan Wei
Huiqi Li
Linfeng Zhang
DiffM
122
0
0
20 Oct 2025
Accelerating Vision Transformers with Adaptive Patch Sizes
Rohan Choudhury
JungEun Kim
Jeongseok Lee
Eunho Yang
László A. Jeni
Kishore Venkateshan
ViT
112
1
0
20 Oct 2025
SAC: Neural Speech Codec with Semantic-Acoustic Dual-Stream Quantization
Wenxi Chen
X. Wang
Ruiqi Yan
Yihao Chen
Zhikang Niu
...
Yuzhe Liang
Hanlin Wen
Shunshun Yin
Ming Tao
Xie Chen
124
1
0
19 Oct 2025
Zero- and One-Shot Data Augmentation for Sentence-Level Dysarthric Speech Recognition in Constrained Scenarios
Shiyao Wang
Shiwan Zhao
Jiaming Zhou
Yong Qin
116
0
0
19 Oct 2025
Visual Autoregressive Models Beat Diffusion Models on Inference Time Scaling
Erik Riise
Mehmet Onurcan Kaya
Dim P. Papadopoulos
271
0
0
19 Oct 2025
ReCon: Region-Controllable Data Augmentation with Rectification and Alignment for Object Detection
Haowei Zhu
Tianxiang Pan
Rui Qin
Jun-Hai Yong
Bin Wang
DiffM
152
0
0
17 Oct 2025
Exploring Conditions for Diffusion models in Robotic Control
Heeseong Shin
Byeongho Heo
Dongyoon Han
Seungryong Kim
Taekyung Kim
192
0
0
17 Oct 2025
Adapting Self-Supervised Representations as a Latent Space for Efficient Generation
Ming Gui
Johannes Schusterbauer
Timy Phan
Felix Krause
J. Susskind
Miguel Angel Bautista
Bjorn Ommer
189
1
0
16 Oct 2025
LightQANet: Quantized and Adaptive Feature Learning for Low-Light Image Enhancement
X. Wu
Zhihui Lai
Xianxu Hou
Jie Zhou
Ya-Nan Zhang
LinLin Shen
96
1
0
16 Oct 2025