ResearchTrend.AI
Taming Transformers for High-Resolution Image Synthesis

17 December 2020
Patrick Esser
Robin Rombach
Björn Ommer
    ViT

Papers citing "Taming Transformers for High-Resolution Image Synthesis"

50 / 476 papers shown
Visual Concept-driven Image Generation with Text-to-Image Diffusion Model
Tanzila Rahman
Shweta Mahajan
Hsin-Ying Lee
Jian Ren
Sergey Tulyakov
Leonid Sigal
80
4
0
18 Feb 2024
CoLLaVO: Crayon Large Language and Vision mOdel
Byung-Kwan Lee
Beomchan Park
Chae Won Kim
Yonghyun Ro
VLM
MLLM
24
16
0
17 Feb 2024
Data-efficient Large Vision Models through Sequential Autoregression
Jianyuan Guo
Zhiwei Hao
Chengcheng Wang
Yehui Tang
Han Wu
Han Hu
Kai Han
Chang Xu
VLM
21
10
0
07 Feb 2024
Neural Language of Thought Models
Yi-Fu Wu
Minseung Lee
Sungjin Ahn
MLLM
VLM
48
6
0
02 Feb 2024
Deconstructing Denoising Diffusion Models for Self-Supervised Learning
Xinlei Chen
Zhuang Liu
Saining Xie
Kaiming He
DiffM
30
52
0
25 Jan 2024
Image Synthesis with Graph Conditioning: CLIP-Guided Diffusion Models for Scene Graphs
Rameshwar Mishra
A. V. Subramanyam
DiffM
14
2
0
25 Jan 2024
Dream360: Diverse and Immersive Outdoor Virtual Scene Creation via Transformer-Based 360 Image Outpainting
Hao Ai
Zidong Cao
H. Lu
Chen Chen
Jiancang Ma
Pengyuan Zhou
Tae-Kyun Kim
Pan Hui
Lin Wang
35
3
0
19 Jan 2024
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke
Bert De Brabandere
DiffM
22
11
0
18 Jan 2024
PIXAR: Auto-Regressive Language Modeling in Pixel Space
Yintao Tai
Xiyang Liao
Alessandro Suglia
Antonio Vergari
MLLM
19
7
0
06 Jan 2024
Discrete Distribution Networks
Lei Yang
23
1
0
29 Dec 2023
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models
Hayk Manukyan
Andranik Sargsyan
Barsegh Atanyan
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
DiffM
28
28
0
21 Dec 2023
Novel View Synthesis with View-Dependent Effects from a Single Image
J. P. Bello
Munchurl Kim
17
2
0
13 Dec 2023
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Lijun Yu
Kihyuk Sohn
Xiuye Gu
Meera Hahn
Fei-Fei Li
Irfan Essa
Lu Jiang
José Lezama
VGen
39
174
0
11 Dec 2023
4M: Massively Multimodal Masked Modeling
David Mizrahi
Roman Bachmann
Oğuzhan Fatih Kar
Teresa Yeo
Mingfei Gao
Afshin Dehghan
Amir Zamir
MLLM
39
62
0
11 Dec 2023
Iterative Token Evaluation and Refinement for Real-World Super-Resolution
Chaofeng Chen
Shangchen Zhou
Liang Liao
Haoning Wu
Wenxiu Sun
Qiong Yan
Weisi Lin
13
6
0
09 Dec 2023
Towards Context-Stable and Visual-Consistent Image Inpainting
Yikai Wang
Chenjie Cao
Yanwei Fu
DiffM
43
2
0
08 Dec 2023
GenDeF: Learning Generative Deformation Field for Video Generation
Wen Wang
Kecheng Zheng
Qiuyu Wang
Hao Chen
Zifan Shi
Ceyuan Yang
Yujun Shen
Chunhua Shen
VGen
DiffM
41
2
0
07 Dec 2023
Free3D: Consistent Novel View Synthesis without 3D Representation
Chuanxia Zheng
Andrea Vedaldi
3DV
37
48
0
07 Dec 2023
FitDiff: Robust monocular 3D facial shape and reflectance estimation using Diffusion Models
Stathis Galanakis
Alexandros Lattas
Stylianos Moschoglou
S. Zafeiriou
19
2
0
07 Dec 2023
Understanding (Un)Intended Memorization in Text-to-Image Generative Models
Ali Naseh
Jaechul Roh
Amir Houmansadr
DiffM
20
6
0
06 Dec 2023
DiffusionSat: A Generative Foundation Model for Satellite Imagery
Samar Khanna
Patrick Liu
Linqi Zhou
Chenlin Meng
Robin Rombach
Marshall Burke
David B. Lobell
Stefano Ermon
17
57
0
06 Dec 2023
MMM: Generative Masked Motion Model
Ekkasit Pinyoanuntapong
Pu Wang
Minwoo Lee
C. L. P. Chen
DiffM
VGen
27
43
0
06 Dec 2023
Kandinsky 3.0 Technical Report
V.Ya. Arkhipkin
Andrei Filatov
Viacheslav Vasilev
Anastasia Maltseva
Said Azizov
Igor Pavlov
Julia Agafonova
Andrey Kuznetsov
Denis Dimitrov
DiffM
25
10
0
06 Dec 2023
SVQ: Sparse Vector Quantization for Spatiotemporal Forecasting
Chao Chen
Tian Zhou
Yanjun Zhao
Hui Liu
Liang Sun
Rong Jin
25
0
0
06 Dec 2023
Identifying Spurious Correlations using Counterfactual Alignment
Joseph Paul Cohen
Louis Blankemeier
Akshay S. Chaudhari
CML
49
1
0
01 Dec 2023
InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation
Rongyao Fang
Shilin Yan
Zhaoyang Huang
Jingqiu Zhou
Hao Tian
Jifeng Dai
Hongsheng Li
MLLM
28
8
0
30 Nov 2023
Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding
Jin-Chuan Shi
Miao Wang
Hao-Bin Duan
Shao-Hua Guan
3DGS
25
84
0
30 Nov 2023
Image Inpainting via Tractable Steering of Diffusion Models
Anji Liu
Mathias Niepert
Guy Van den Broeck
DiffM
TPM
19
16
0
28 Nov 2023
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
Yutong Feng
Biao Gong
Di Chen
Yujun Shen
Yu Liu
Jingren Zhou
DiffM
21
43
0
28 Nov 2023
Unsupervised Multimodal Deepfake Detection Using Intra- and Cross-Modal Inconsistencies
Mulin Tian
Mahyar Khayatkhoei
Joe Mathai
Wael AbdAlmageed
21
6
0
28 Nov 2023
Text-Driven Image Editing via Learnable Regions
Yuanze Lin
Yi-Wen Chen
Yi-Hsuan Tsai
Lu Jiang
Ming-Hsuan Yang
DiffM
21
16
0
28 Nov 2023
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
Kai Yang
Jian Tao
Jiafei Lyu
Chunjiang Ge
Jiaxin Chen
Qimai Li
Weihan Shen
Xiaolong Zhu
Xiu Li
EGVM
19
87
0
22 Nov 2023
Event Camera Data Dense Pre-training
Yan Yang
Liyuan Pan
Liu Liu
25
4
0
20 Nov 2023
MDFL: Multi-domain Diffusion-driven Feature Learning
Daixun Li
Weiying Xie
Jiaqing Zhang
Yunsong Li
DiffM
35
8
0
16 Nov 2023
3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models
Haibo Yang
Yang Chen
Yingwei Pan
Ting Yao
Zhineng Chen
Tao Mei
19
19
0
09 Nov 2023
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
Shiwei Zhang
Jiayu Wang
Yingya Zhang
Kang Zhao
Hangjie Yuan
Z. Qin
Xiang Wang
Deli Zhao
Jingren Zhou
DiffM
VGen
26
198
0
07 Nov 2023
Scene Graph Conditioning in Latent Diffusion
Frank Fundel
DiffM
25
0
0
16 Oct 2023
Effortless Cross-Platform Video Codec: A Codebook-Based Method
Kuan Tian
Yonghang Guan
Jin-Peng Xiang
Jun Zhang
Xiao Han
Wei Yang
32
1
0
16 Oct 2023
MAC: ModAlity Calibration for Object Detection
Yutian Lei
Jun Liu
Dong Huang
ObjD
13
0
0
14 Oct 2023
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
Xian Liu
Jian Ren
Aliaksandr Siarohin
Ivan Skorokhodov
Yanyu Li
Dahua Lin
Xihui Liu
Ziwei Liu
Sergey Tulyakov
32
57
0
12 Oct 2023
Causal Unsupervised Semantic Segmentation
Junho Kim
Byung-Kwan Lee
Yonghyun Ro
31
18
0
11 Oct 2023
Improving Compositional Text-to-image Generation with Large Vision-Language Models
Song Wen
Guian Fang
Renrui Zhang
Peng Gao
Hao Dong
Dimitris N. Metaxas
21
17
0
10 Oct 2023
Memory-Consistent Neural Networks for Imitation Learning
Kaustubh Sridhar
Souradeep Dutta
Dinesh Jayaraman
James Weimer
Insup Lee
36
8
0
09 Oct 2023
FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing
Yuren Cong
Mengmeng Xu
Christian Simon
Shoufa Chen
Jiawei Ren
Yanping Xie
Juan-Manuel Perez-Rua
Bodo Rosenhahn
Tao Xiang
Sen He
DiffM
VGen
22
74
0
09 Oct 2023
KV Inversion: KV Embeddings Learning for Text-Conditioned Real Image Action Editing
Jiarui Yao
Yifan Liu
Simon S. Du
Shifeng Chen
DiffM
16
24
0
28 Sep 2023
Jointly Training Large Autoregressive Multimodal Models
Emanuele Aiello
L. Yu
Yixin Nie
Armen Aghajanyan
Barlas Oğuz
11
29
0
27 Sep 2023
Diffusion-based Holistic Texture Rectification and Synthesis
Guoqing Hao
S. Iizuka
Kensho Hara
E. Simo-Serra
Hirokatsu Kataoka
Kazuhiro Fukui
DiffM
10
5
0
26 Sep 2023
Neural Image Compression Using Masked Sparse Visual Representation
Wei Jiang
Wei Wang
Yuewei Chen
13
7
0
20 Sep 2023
CoNeS: Conditional neural fields with shift modulation for multi-sequence MRI translation
Yunjie Chen
Marius Staring
O. M. Neve
Stephan R. Romeijn
Erik F. Hensen
Berit M. Verbist
J. Wolterink
Qian Tao
DiffM
MedIm
16
3
0
06 Sep 2023
VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation
Xin Li
Wenqing Chu
Ye Wu
Weihang Yuan
Fanglong Liu
Qi Zhang
Fu Li
Haocheng Feng
Errui Ding
Jingdong Wang
VGen
40
51
0
01 Sep 2023