Taming Transformers for High-Resolution Image Synthesis

Computer Vision and Pattern Recognition (CVPR), 2020
17 December 2020
Patrick Esser
Robin Rombach
Björn Ommer
    ViT
ArXiv (abs) | PDF | HTML | GitHub (6185★)

Papers citing "Taming Transformers for High-Resolution Image Synthesis"

50 / 2,374 papers shown
Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation
Zhiyang Xu
Jiuhai Chen
Zhaojiang Lin
Xichen Pan
Lifu Huang
...
Di Jin
Michihiro Yasunaga
Lili Yu
Xi Lin
Shaoliang Nie
286
4
0
12 Jun 2025
SpectralAR: Spectral Autoregressive Visual Generation
Yuanhui Huang
Weiliang Chen
Wenzhao Zheng
Yueqi Duan
Jie Zhou
Jiwen Lu
DiffM
VGen
248
3
0
12 Jun 2025
SAGE: Exploring the Boundaries of Unsafe Concept Domain with Semantic-Augment Erasing
Hongguang Zhu
Y. X. Wei
Mengyu Wang
Siyu Jiao
Yan Fang
Jiannan Huang
Yao Zhao
193
0
0
11 Jun 2025
DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning
Dongxu Liu
Yuang Peng
Haomiao Tang
Yuwei Chen
Chunrui Han
Zheng Ge
Daxin Jiang
Mingxue Liao
DiffM
206
1
0
11 Jun 2025
ScoreMix: Synthetic Data Generation by Score Composition in Diffusion Models Improves Recognition
Parsa Rahimi
Sébastien Marcel
DiffM
194
1
0
11 Jun 2025
Vision Generalist Model: A Survey
International Journal of Computer Vision (IJCV), 2025
Ziyi Wang
Yongming Rao
Shuofeng Sun
Xinrun Liu
Yi Wei
...
Zuyan Liu
Yanbo Wang
Hongmin Liu
Jie Zhou
Jiwen Lu
249
0
0
11 Jun 2025
Revisiting Diffusion Models: From Generative Pre-training to One-Step Generation
Bowen Zheng
Tianming Yang
VLM
187
2
0
11 Jun 2025
From Pixels to Graphs: using Scene and Knowledge Graphs for HD-EPIC VQA Challenge
Agnese Taluzzi
Davide Gesualdi
Riccardo Santambrogio
Chiara Plizzari
Francesca Palermo
S. Mentasti
Matteo Matteucci
GNN
218
2
0
10 Jun 2025
Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better
Dianyi Wang
Wei Song
Yikun Wang
Siyuan Wang
Kaicheng Yu
Zhongyu Wei
Jiaqi Wang
178
3
0
10 Jun 2025
Revolutionizing Clinical Trials: A Manifesto for AI-Driven Transformation
M. Schaar
Richard W. Peck
E. McKinney
Jim Weatherall
Stuart Bailey
...
Rafik Salama
Christina Gunther
Francesca Frau
Antoine Pugeat
Ramon Hernandez
MedIm
208
0
0
10 Jun 2025
SUDER: Self-Improving Unified Large Multimodal Models for Understanding and Generation with Dual Self-Rewards
Jixiang Hong
Yiran Zhang
Guanzhong Wang
Yi Liu
Ji-Rong Wen
Rui Yan
LRM
154
0
0
09 Jun 2025
VIVAT: Virtuous Improving VAE Training through Artifact Mitigation
Lev Novitskiy
Viacheslav Vasilev
Maria Kovaleva
V. Arkhipkin
Denis Dimitrov
VGen
104
1
0
09 Jun 2025
Highly Compressed Tokenizer Can Generate Without Training
Lukas Lao Beyer
T. Li
X. Chen
S. Karaman
K. He
DiffM
VLM
121
3
0
09 Jun 2025
LeVo: High-Quality Song Generation with Multi-Preference Alignment
Shun Lei
Yaoxun Xu
Zhiwei Lin
Huaicheng Zhang
Wei Tan
...
Chenyu Yang
Haina Zhu
Shuai Wang
Zhiyong Wu
Dong Yu
167
9
0
09 Jun 2025
Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces
Kevin Rojas
Yuchen Zhu
Sichen Zhu
Felix X.-F. Ye
Molei Tao
DiffM
186
10
0
09 Jun 2025
Generative Modeling of Weights: Generalization or Memorization?
Boya Zeng
Yida Yin
Zhiqiu Xu
Zhuang Liu
DiffM
220
2
0
09 Jun 2025
Audio-Sync Video Generation with Multi-Stream Temporal Control
Shuchen Weng
Haojie Zheng
Zheng Chang
Si Li
Boxin Shi
Xinlong Wang
DiffM
VGen
153
4
0
09 Jun 2025
LaTtE-Flow: Layerwise Timestep-Expert Flow-based Transformer
Ying Shen
Zhiyang Xu
Jiuhai Chen
Shizhe Diao
Jiaxin Zhang
Yuguang Yao
Joy Rimchala
Ismini Lourentzou
Lifu Huang
OffRL
156
1
0
08 Jun 2025
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Jiatao Gu
Tianrong Chen
David Berthelot
Huangjie Zheng
Yuyang Wang
Ruixiang Zhang
Laurent Dinh
Miguel Angel Bautista
Josh Susskind
Shuangfei Zhai
207
13
0
06 Jun 2025
Heartcare Suite: Multi-dimensional Understanding of ECG with Raw Multi-lead Signal Modeling
Yihan Xie
Sijing Li
Tianwei Lin
Zhuonan Wang
Chenglin Yang
...
Tai-wei Chang
Qishan Chen
Jun Xiao
Yueting Zhuang
Beng Chin Ooi
179
2
0
06 Jun 2025
RecGPT: A Foundation Model for Sequential Recommendation
Yangqin Jiang
Xubin Ren
Lianghao Xia
Da Luo
Kangyi Lin
Chao Huang
LRM
287
0
0
06 Jun 2025
Gen-n-Val: Agentic Image Data Generation and Validation
Jing-En Huang
I-Sheng Fang
Tzuhsuan Huang
Chih-Yu Wang
Jun-Cheng Chen
VLM
250
0
0
05 Jun 2025
FocusDiff: Advancing Fine-Grained Text-Image Alignment for Autoregressive Visual Generation through RL
Kaihang Pan
Wendong Bu
Y. Wu
Yang Wu
Kai Shen
Yunfei Li
Hang Zhao
Juncheng Billy Li
Siliang Tang
Yueting Zhuang
194
8
0
05 Jun 2025
Physics Informed Capsule Enhanced Variational AutoEncoder for Underwater Image Enhancement
N. Martinel
Rita Pucci
211
0
0
05 Jun 2025
Resolving Task Objective Conflicts in Unified Model via Task-Aware Mixture-of-Experts
Jiaxing Zhang
Xinyi Zeng
201
0
0
04 Jun 2025
HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Hermann Kumbong
Xian Liu
Tsung-Yi Lin
Ming-Yu Liu
Xihui Liu
Ziwei Liu
Daniel Y. Fu
Christopher Ré
David W. Romero
DiffM
168
7
0
04 Jun 2025
One-Step Diffusion-based Real-World Image Super-Resolution with Visual Perception Distillation
Xue Wu
J. Xin
Zhijun Tu
Jie Hu
Jie Li
N. Wang
Xinbo Gao
235
1
0
03 Jun 2025
EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models
Mingzhe Li
Gehao Zhang
Zhenting Wang
Guanhong Tao
Siqi Pan
Richard Cartwright
Juan Zhai
Shiqing Ma
DiffM
195
0
0
03 Jun 2025
Hyperspectral Image Generation with Unmixing Guided Diffusion Model
Shiyu Shen
Bin Pan
Ziye Zhang
Zhenwei Shi
DiffM
175
0
0
03 Jun 2025
Data Pruning by Information Maximization
International Conference on Learning Representations (ICLR), 2025
Haoru Tan
Sitong Wu
Wei Huang
Shizhen Zhao
Xiaojuan Qi
291
7
0
02 Jun 2025
Efficiency without Compromise: CLIP-aided Text-to-Image GANs with Increased Diversity
Yuya Kobayashi
Yuhta Takida
Takashi Shibuya
Yuki Mitsufuji
DiffM
166
0
0
02 Jun 2025
Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning
Yijun Yang
Zhao-Yang Wang
Qiuping Liu
Shuwen Sun
Kang Wang
...
Zongwei Zhou
Alan Yuille
Lei Zhu
Yu Zhang
Jieneng Chen
131
10
0
02 Jun 2025
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
160
2
0
02 Jun 2025
Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Yuan Gan
Jiaxu Miao
Yunze Wang
Yi Yang
AAML
DiffM
130
1
0
02 Jun 2025
Concept-Centric Token Interpretation for Vector-Quantized Generative Models
Tianze Yang
Yucheng Shi
Mengnan Du
Xuansheng Wu
Qiaoyu Tan
Jin Sun
Ninghao Liu
193
1
0
31 May 2025
Optimizing Sensory Neurons: Nonlinear Attention Mechanisms for Accelerated Convergence in Permutation-Invariant Neural Networks for Reinforcement Learning
Junaid Muzaffar
Ahsan Adeel
K. Ahmed
Ingo Frommholz
Zeeshan Pervez
230
0
0
31 May 2025
DLM-One: Diffusion Language Models for One-Step Sequence Generation
Tianqi Chen
Shujian Zhang
Mingyuan Zhou
134
5
0
30 May 2025
DarkDiff: Advancing Low-Light Raw Enhancement by Retasking Diffusion Models for Camera ISP
Amber Yijia Zheng
Yu Zhang
Jun Hu
Raymond A. Yeh
Chen Chen
DiffM
140
1
0
29 May 2025
PacTure: Efficient PBR Texture Generation on Packed Views with Visual Autoregressive Models
Fan Fei
Jiajun Tang
Fei-Peng Tian
Boxin Shi
P. Tan
DiffM
158
1
0
28 May 2025
LeDiFlow: Learned Distribution-guided Flow Matching to Accelerate Image Generation
Pascal Zwick
Nils Friederich
Maximilian Beichter
Lennart Hilbert
Ralf Mikut
Oliver Bringmann
MedIm
117
0
0
27 May 2025
Generative Image Compression by Estimating Gradients of the Rate-variable Feature Distribution
Minghao Han
Weiyi You
Jinhua Zhang
Leheng Zhang
Ce Zhu
Shuhang Gu
DiffM
198
0
0
27 May 2025
MultLFG: Training-free Multi-LoRA composition using Frequency-domain Guidance
Aniket Roy
Maitreya Suin
Ketul Shah
Rama Chellappa
184
1
0
26 May 2025
LlamaSeg: Image Segmentation via Autoregressive Mask Generation
Jiru Deng
Tengjin Weng
Tianyu Yang
Tong Lu
Zhiheng Li
Wenhao Jiang
VLM
274
0
0
26 May 2025
StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation
Yi Wu
Lingting Zhu
Shengju Qian
Lei Liu
Wandi Qiao
Lequan Yu
Bin Li
179
3
0
26 May 2025
DiSA: Diffusion Step Annealing in Autoregressive Image Generation
Qinyu Zhao
Jaskirat Singh
Ming Xu
Akshay Asthana
Stephen Gould
Liang Zheng
DiffM
146
2
0
26 May 2025
TeViR: Text-to-Video Reward with Diffusion Models for Efficient Reinforcement Learning
Yuhui Chen
Haoran Li
Zhennan Jiang
Haowei Wen
Dongbin Zhao
186
2
0
26 May 2025
ReDDiT: Rehashing Noise for Discrete Visual Generation
Tianren Ma
Xiaosong Zhang
Boyu Yang
Junlan Feng
QiXiang Ye
DiffM
257
2
0
26 May 2025
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
Kunjun Li
Zigeng Chen
Cheng-Yen Yang
Jenq-Neng Hwang
182
3
0
26 May 2025
AniCrafter: Customizing Realistic Human-Centric Animation via Avatar-Background Conditioning in Video Diffusion Models
Muyao Niu
Mingdeng Cao
Yifan Zhan
Qingtian Zhu
Mingze Ma
Jiancheng Zhao
Yanhong Zeng
Zhihang Zhong
Xiao Sun
Yinqiang Zheng
DiffM
VGen
234
5
0
26 May 2025
Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots
Guangting Zheng
Yehao Li
Yingwei Pan
Jiajun Deng
Ting Yao
Yanyong Zhang
Tao Mei
DiffM
189
1
0
26 May 2025