2012.09841
Taming Transformers for High-Resolution Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2021
17 December 2020
Patrick Esser
Robin Rombach
Björn Ommer
ViT
Github (6185★)
Papers citing "Taming Transformers for High-Resolution Image Synthesis"
50 / 2,402 papers shown
RealisMotion: Decomposed Human Motion Control and Video Generation in the World Space
Jingyun Liang
Jingkai Zhou
Shikai Li
Chenjie Cao
Lei Sun
Yichen Qian
Weihua Chen
Fan Wang
DiffM
VGen
114
3
0
12 Aug 2025
Enhanced Generative Structure Prior for Chinese Text Image Super-resolution
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Xiaoming Li
Wangmeng Zuo
Chen Change Loy
175
0
0
11 Aug 2025
AR-GRPO: Training Autoregressive Image Generation Models via Reinforcement Learning
Shihao Yuan
Yahui Liu
Yang Yue
Jingyuan Zhang
Wangmeng Zuo
Qi Wang
Fuzheng Zhang
Guorui Zhou
EGVM
VLM
145
11
0
09 Aug 2025
NEP: Autoregressive Image Editing via Next Editing Token Prediction
Huimin Wu
Xiaojian Ma
Haozhe Zhao
Yanpeng Zhao
Qing Li
DiffM
144
2
0
08 Aug 2025
WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction
Shaobin Zhuang
Yiwei Guo
Canmiao Fu
Z. Huang
Zeyue Tian
Ying Zhang
Chen Li
Yali Wang
ViT
219
2
0
07 Aug 2025
Deeper Inside Deep ViT
Sungrae Hong
151
0
0
06 Aug 2025
UniEdit-I: Training-free Image Editing for Unified VLM via Iterative Understanding, Editing and Verifying
Chengyu Bai
Jintao Chen
Xiang Bai
Yilong Chen
Qi She
Ming Lu
Shanghang Zhang
180
0
0
05 Aug 2025
HPSv3: Towards Wide-Spectrum Human Preference Score
Yuhang Ma
Xiaoshi Wu
Keqiang Sun
K. Sun
Jiaming Song
149
51
0
05 Aug 2025
CIVQLLIE: Causal Intervention with Vector Quantization for Low-Light Image Enhancement
Tongshun Zhang
Pingping Liu
Zhe Zhang
Qiuzhan Zhou
116
0
0
05 Aug 2025
Cross-Domain Image Synthesis: Generating H&E from Multiplex Biomarker Imaging
Jillur Rahman Saurav
M. Nasr
Jacob M. Luber
MedIm
104
0
0
05 Aug 2025
GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Yifei Sun
Zhanghao Chen
Hao Zheng
Yuqing Lu
Lixin Duan
Fenglei Fan
Ahmed Elazab
Xiang Wan
Changmiao Wang
Ruiquan Ge
MedIm
88
1
0
05 Aug 2025
Skywork UniPic: Unified Autoregressive Modeling for Visual Understanding and Generation
P. Wang
Yi Peng
Yimeng Gan
Liang Hu
Tianyidan Xie
...
Hongyang Wei
Eric Li
Xuchen Song
Yang Liu
Yahui Zhou
SyDa
132
9
0
05 Aug 2025
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
Qianli Ma
Yaowei Zheng
Zhelun Shi
Zhongkai Zhao
Bin Jia
...
Y. Li
Jiacheng Yang
Yanghua Peng
Zhi-Li Zhang
Xin Liu
MoE
VLM
350
3
0
04 Aug 2025
PESTO: Real-Time Pitch Estimation with Self-supervised Transposition-equivariant Objective
Transactions of the International Society for Music Information Retrieval (TISMIR), 2025
Alain Riou
Bernardo Torres
Ben Hayes
Stefan Lattner
Gaëtan Hadjeres
Gaël Richard
Geoffroy Peeters
264
3
0
02 Aug 2025
StorySync: Training-Free Subject Consistency in Text-to-Image Generation via Region Harmonization
Gopalji Gaur
Mohammadreza Zolfaghari
Thomas Brox
DiffM
159
0
0
31 Jul 2025
X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention
International Conference on Learning Representations (ICLR), 2025
Xiaochen Zhao
Hongyi Xu
Guoxian Song
You Xie
Chenxu Zhang
Xiu Li
Linjie Luo
J. Suo
Yebin Liu
VGen
174
17
0
30 Jul 2025
Subtyping Breast Lesions via Generative Augmentation based Long-tailed Recognition in Ultrasound
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Shijing Chen
Xinrui Zhou
Yuhao Wang
Yuhao Huang
Ao Chang
Dong Ni
Ruobing Huang
MedIm
125
0
0
30 Jul 2025
Bi-Level Optimization for Self-Supervised AI-Generated Face Detection
Mian Zou
Nan Zhong
Baosheng Yu
Yibing Zhan
Kede Ma
CVBM
148
1
0
30 Jul 2025
Generative Recommendation with Semantic IDs: A Practitioner's Handbook
Clark Mingxuan Ju
Liam Collins
Leonardo Neves
Bhuvesh Kumar
Louis Yufeng Wang
Tong Zhao
Neil Shah
VLM
120
3
0
29 Jul 2025
HDR Environment Map Estimation with Latent Diffusion Models
Jack Hilliard
Adrian Hilton
Jean-Yves Guillemaut
DiffM
142
0
0
28 Jul 2025
Kernel Learning for Sample Constrained Black-Box Optimization
AAAI Conference on Artificial Intelligence (AAAI), 2025
Rajalaxmi Rajagopalan
Yu-Lin Wei
Romit Roy Choudhury
GP
136
0
0
28 Jul 2025
Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis
Zhuokun Chen
Jugang Fan
Zhuowei Yu
Bohan Zhuang
Zhuliang Yu
DiffM
147
4
0
28 Jul 2025
MagicAnime: A Hierarchically Annotated, Multimodal and Multitasking Dataset with Benchmarks for Cartoon Animation Generation
Shuolin Xu
Bingyuan Wang
Zeyu Cai
Fangteng Fu
Yue Ma
Tongyi Lee
Hongchuan Yu
Zeyu Wang
VGen
171
1
0
27 Jul 2025
Local Prompt Adaptation for Style-Consistent Multi-Object Generation in Diffusion Models
Ankit Sanjyal
DiffM
259
0
0
27 Jul 2025
RARE: Refine Any Registration of Pairwise Point Clouds via Zero-Shot Learning
Chengyu Zheng
Jin Huang
Honghua Chen
Mingqiang Wei
DiffM
169
1
0
26 Jul 2025
A Survey on Generative Model Unlearning: Fundamentals, Taxonomy, Evaluation, and Future Direction
Xiaohua Feng
Jiaming Zhang
Fengyuan Yu
C. Wang
Li Zhang
Kaixiang Li
Yuyuan Li
Chaochao Chen
Jianwei Yin
MU
262
2
0
26 Jul 2025
SeeDiff: Off-the-Shelf Seeded Mask Generation from Diffusion Models
AAAI Conference on Artificial Intelligence (AAAI), 2025
J. Park
Kumju Jo
Sungyong Baik
DiffM
201
0
0
26 Jul 2025
SCALAR: Scale-wise Controllable Visual Autoregressive Learning
Ryan Xu
Dongyang Jin
Y. Bai
Rui Lan
Xu Duan
Lei Sun
Xiangxiang Chu
298
8
0
26 Jul 2025
KB-DMGen: Knowledge-Based Global Guidance and Dynamic Pose Masking for Human Image Generation
Shibang Liu
Xuemei Xie
G. Shi
DiffM
233
0
0
26 Jul 2025
Reconstruct or Generate: Exploring the Spectrum of Generative Modeling for Cardiac MRI
Niklas Bubeck
Yundi Zhang
Antonio Terpin
Daniel Rueckert
Jiazhen Pan
DiffM
MedIm
167
1
0
25 Jul 2025
A Survey of Multimodal Hallucination Evaluation and Detection
Zhiyuan Chen
Yuecong Min
Jie M. Zhang
Bei Yan
Jiahao Wang
X. Wang
Shiguang Shan
HILM
352
5
0
25 Jul 2025
Even Faster Simulations with Flow Matching: A Study of Zero Degree Calorimeter Responses
Maksymilian Wojnar
AI4CE
118
1
0
24 Jul 2025
Improving Large Vision-Language Models' Understanding for Field Data
Xiaomei Zhang
Hanyu Zheng
Xiangyu Zhu
Jinghuan Wei
Junhong Zou
Zhen Lei
Zhaoxiang Zhang
VLM
152
0
0
24 Jul 2025
Vec2Face+ for Face Dataset Generation
Haiyu Wu
Jaskirat Singh
Sicong Tian
Liang Zheng
Kevin W. Bowyer
212
0
0
23 Jul 2025
Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling
Yi Xin
Juncheng Yan
Qi Qin
Ge Wang
Dongyang Liu
...
Jiaming Song
Guangtao Zhai
Xiaohong Liu
Botian Shi
Peng Gao
208
24
0
23 Jul 2025
HarmonPaint: Harmonized Training-Free Diffusion Inpainting
Ying Li
Xinzhe Li
Yong Du
Yangyang Xu
Junyu Dong
Shengfeng He
DiffM
172
0
0
22 Jul 2025
Scale Your Instructions: Enhance the Instruction-Following Fidelity of Unified Image Generation Model by Self-Adaptive Attention Scaling
Chao Zhou
Tianyi Wei
Nenghai Yu
DiffM
153
2
0
22 Jul 2025
Latent Denoising Makes Good Visual Tokenizers
Jiawei Yang
Tianhong Li
Lijie Fan
Yonglong Tian
Yue Wang
192
13
0
21 Jul 2025
A Practical Investigation of Spatially-Controlled Image Generation with Transformers
Guoxuan Xia
Harleen Hanspal
Petru-Daniel Tudosiu
Shifeng Zhang
Sarah Parisot
210
0
0
21 Jul 2025
ReDi: Rectified Discrete Flow
Jaehoon Yoo
Wonjung Kim
Seunghoon Hong
199
3
0
21 Jul 2025
Discrete Tokenization for Multimodal LLMs: A Comprehensive Survey
Jindong Li
Yali Fu
Jiahong Liu
Linxiao Cao
Wei Ji
Menglin Yang
Irwin King
Ming-Hsuan Yang
OffRL
156
3
0
21 Jul 2025
Quantizing Text-attributed Graphs for Semantic-Structural Integration
Knowledge Discovery and Data Mining (KDD), 2025
Jianyuan Bo
Hao Wu
Yuan Fang
296
2
0
20 Jul 2025
Aesthetics is Cheap, Show me the Text: An Empirical Evaluation of State-of-the-Art Generative Models for OCR
Peirong Zhang
Haowei Xu
Jiaxin Zhang
Guitao Xu
Xuhan Zheng
Zhenhua Yang
Junle Liu
Yuyi Zhang
Lianwen Jin
EGVM
296
2
0
20 Jul 2025
Advances in Feed-Forward 3D Reconstruction and View Synthesis: A Survey
Jiahui Zhang
Yuelei Li
Anpei Chen
Muyu Xu
Kunhao Liu
...
Hanspeter Pfister
Paul Liang
Shijian Lu
Fangneng Zhan
638
8
0
19 Jul 2025
DynFaceRestore: Balancing Fidelity and Quality in Diffusion-Guided Blind Face Restoration with Dynamic Blur-Level Mapping and Guidance
Huu-Phu Do
Yu-Wei Chen
Yi-Cheng Liao
Chi-Wei Hsiao
Han-Yang Wang
Wei-Chen Chiu
Ching-Chun Huang
DiffM
268
0
0
18 Jul 2025
Implementing Adaptations for Vision AutoRegressive Model
Kaif Shaikh
Franziska Boenisch
Adam Dziedzic
212
0
0
15 Jul 2025
Latent Diffusion Models with Masked AutoEncoders
Junho Lee
Jeongwoo Shin
Hyungwook Choi
Joonseok Lee
DiffM
203
4
0
14 Jul 2025
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation
Anlin Zheng
Xin Wen
Xuanyang Zhang
Chuofan Ma
Tiancai Wang
Gang Yu
Xiangyu Zhang
Xiaojuan Qi
VLM
191
4
0
11 Jul 2025
Kernel Density Steering: Inference-Time Scaling via Mode Seeking for Image Restoration
Yuyang Hu
Kangfu Mei
Mojtaba Sahraee-Ardakan
Ulugbek S. Kamilov
P. Milanfar
M. Delbracio
DiffM
306
3
0
08 Jul 2025
ICAS: Detecting Training Data from Autoregressive Image Generative Models
Hongyao Yu
Yixiang Qiu
Y. Yang
Hao Fang
Tianqu Zhuang
Jiaxin Hong
Bin Chen
Hao Wu
Shu-Tao Xia
135
5
0
07 Jul 2025