2012.09841
Cited By
Taming Transformers for High-Resolution Image Synthesis
17 December 2020
Patrick Esser
Robin Rombach
Björn Ommer
ViT
Papers citing
"Taming Transformers for High-Resolution Image Synthesis"
50 / 476 papers shown
Title
Visual Concept-driven Image Generation with Text-to-Image Diffusion Model
Tanzila Rahman
Shweta Mahajan
Hsin-Ying Lee
Jian Ren
Sergey Tulyakov
Leonid Sigal
80
4
0
18 Feb 2024
CoLLaVO: Crayon Large Language and Vision mOdel
Byung-Kwan Lee
Beomchan Park
Chae Won Kim
Yonghyun Ro
VLM
MLLM
24
16
0
17 Feb 2024
Data-efficient Large Vision Models through Sequential Autoregression
Jianyuan Guo
Zhiwei Hao
Chengcheng Wang
Yehui Tang
Han Wu
Han Hu
Kai Han
Chang Xu
VLM
21
10
0
07 Feb 2024
Neural Language of Thought Models
Yi-Fu Wu
Minseung Lee
Sungjin Ahn
MLLM
VLM
48
6
0
02 Feb 2024
Deconstructing Denoising Diffusion Models for Self-Supervised Learning
Xinlei Chen
Zhuang Liu
Saining Xie
Kaiming He
DiffM
30
52
0
25 Jan 2024
Image Synthesis with Graph Conditioning: CLIP-Guided Diffusion Models for Scene Graphs
Rameshwar Mishra
A. V. Subramanyam
DiffM
14
2
0
25 Jan 2024
Dream360: Diverse and Immersive Outdoor Virtual Scene Creation via Transformer-Based 360 Image Outpainting
Hao Ai
Zidong Cao
H. Lu
Chen Chen
Jiancang Ma
Pengyuan Zhou
Tae-Kyun Kim
Pan Hui
Lin Wang
35
3
0
19 Jan 2024
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke
Bert De Brabandere
DiffM
22
11
0
18 Jan 2024
PIXAR: Auto-Regressive Language Modeling in Pixel Space
Yintao Tai
Xiyang Liao
Alessandro Suglia
Antonio Vergari
MLLM
19
7
0
06 Jan 2024
Discrete Distribution Networks
Lei Yang
23
1
0
29 Dec 2023
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models
Hayk Manukyan
Andranik Sargsyan
Barsegh Atanyan
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
DiffM
28
28
0
21 Dec 2023
Novel View Synthesis with View-Dependent Effects from a Single Image
J. P. Bello
Munchurl Kim
17
2
0
13 Dec 2023
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Lijun Yu
Kihyuk Sohn
Xiuye Gu
Meera Hahn
Fei-Fei Li
Irfan Essa
Lu Jiang
José Lezama
VGen
39
174
0
11 Dec 2023
4M: Massively Multimodal Masked Modeling
David Mizrahi
Roman Bachmann
Oğuzhan Fatih Kar
Teresa Yeo
Mingfei Gao
Afshin Dehghan
Amir Zamir
MLLM
39
62
0
11 Dec 2023
Iterative Token Evaluation and Refinement for Real-World Super-Resolution
Chaofeng Chen
Shangchen Zhou
Liang Liao
Haoning Wu
Wenxiu Sun
Qiong Yan
Weisi Lin
13
6
0
09 Dec 2023
Towards Context-Stable and Visual-Consistent Image Inpainting
Yikai Wang
Chenjie Cao
Yanwei Fu
DiffM
43
2
0
08 Dec 2023
GenDeF: Learning Generative Deformation Field for Video Generation
Wen Wang
Kecheng Zheng
Qiuyu Wang
Hao Chen
Zifan Shi
Ceyuan Yang
Yujun Shen
Chunhua Shen
VGen
DiffM
41
2
0
07 Dec 2023
Free3D: Consistent Novel View Synthesis without 3D Representation
Chuanxia Zheng
Andrea Vedaldi
3DV
37
48
0
07 Dec 2023
FitDiff: Robust monocular 3D facial shape and reflectance estimation using Diffusion Models
Stathis Galanakis
Alexandros Lattas
Stylianos Moschoglou
S. Zafeiriou
19
2
0
07 Dec 2023
Understanding (Un)Intended Memorization in Text-to-Image Generative Models
Ali Naseh
Jaechul Roh
Amir Houmansadr
DiffM
20
6
0
06 Dec 2023
DiffusionSat: A Generative Foundation Model for Satellite Imagery
Samar Khanna
Patrick Liu
Linqi Zhou
Chenlin Meng
Robin Rombach
Marshall Burke
David B. Lobell
Stefano Ermon
17
57
0
06 Dec 2023
MMM: Generative Masked Motion Model
Ekkasit Pinyoanuntapong
Pu Wang
Minwoo Lee
C. L. P. Chen
DiffM
VGen
27
43
0
06 Dec 2023
Kandinsky 3.0 Technical Report
V.Ya. Arkhipkin
Andrei Filatov
Viacheslav Vasilev
Anastasia Maltseva
Said Azizov
Igor Pavlov
Julia Agafonova
Andrey Kuznetsov
Denis Dimitrov
DiffM
25
10
0
06 Dec 2023
SVQ: Sparse Vector Quantization for Spatiotemporal Forecasting
Chao Chen
Tian Zhou
Yanjun Zhao
Hui Liu
Liang Sun
Rong Jin
25
0
0
06 Dec 2023
Identifying Spurious Correlations using Counterfactual Alignment
Joseph Paul Cohen
Louis Blankemeier
Akshay S. Chaudhari
CML
49
1
0
01 Dec 2023
InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation
Rongyao Fang
Shilin Yan
Zhaoyang Huang
Jingqiu Zhou
Hao Tian
Jifeng Dai
Hongsheng Li
MLLM
28
8
0
30 Nov 2023
Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding
Jin-Chuan Shi
Miao Wang
Hao-Bin Duan
Shao-Hua Guan
3DGS
25
84
0
30 Nov 2023
Image Inpainting via Tractable Steering of Diffusion Models
Anji Liu
Mathias Niepert
Guy Van den Broeck
DiffM
TPM
19
16
0
28 Nov 2023
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
Yutong Feng
Biao Gong
Di Chen
Yujun Shen
Yu Liu
Jingren Zhou
DiffM
21
43
0
28 Nov 2023
Unsupervised Multimodal Deepfake Detection Using Intra- and Cross-Modal Inconsistencies
Mulin Tian
Mahyar Khayatkhoei
Joe Mathai
Wael AbdAlmageed
21
6
0
28 Nov 2023
Text-Driven Image Editing via Learnable Regions
Yuanze Lin
Yi-Wen Chen
Yi-Hsuan Tsai
Lu Jiang
Ming-Hsuan Yang
DiffM
21
16
0
28 Nov 2023
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
Kai Yang
Jian Tao
Jiafei Lyu
Chunjiang Ge
Jiaxin Chen
Qimai Li
Weihan Shen
Xiaolong Zhu
Xiu Li
EGVM
19
87
0
22 Nov 2023
Event Camera Data Dense Pre-training
Yan Yang
Liyuan Pan
Liu Liu
25
4
0
20 Nov 2023
MDFL: Multi-domain Diffusion-driven Feature Learning
Daixun Li
Weiying Xie
Jiaqing Zhang
Yunsong Li
DiffM
35
8
0
16 Nov 2023
3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models
Haibo Yang
Yang Chen
Yingwei Pan
Ting Yao
Zhineng Chen
Tao Mei
19
19
0
09 Nov 2023
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
Shiwei Zhang
Jiayu Wang
Yingya Zhang
Kang Zhao
Hangjie Yuan
Z. Qin
Xiang Wang
Deli Zhao
Jingren Zhou
DiffM
VGen
26
198
0
07 Nov 2023
Scene Graph Conditioning in Latent Diffusion
Frank Fundel
DiffM
25
0
0
16 Oct 2023
Effortless Cross-Platform Video Codec: A Codebook-Based Method
Kuan Tian
Yonghang Guan
Jin-Peng Xiang
Jun Zhang
Xiao Han
Wei Yang
32
1
0
16 Oct 2023
MAC: ModAlity Calibration for Object Detection
Yutian Lei
Jun Liu
Dong Huang
ObjD
13
0
0
14 Oct 2023
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
Xian Liu
Jian Ren
Aliaksandr Siarohin
Ivan Skorokhodov
Yanyu Li
Dahua Lin
Xihui Liu
Ziwei Liu
Sergey Tulyakov
32
57
0
12 Oct 2023
Causal Unsupervised Semantic Segmentation
Junho Kim
Byung-Kwan Lee
Yonghyun Ro
31
18
0
11 Oct 2023
Improving Compositional Text-to-image Generation with Large Vision-Language Models
Song Wen
Guian Fang
Renrui Zhang
Peng Gao
Hao Dong
Dimitris N. Metaxas
21
17
0
10 Oct 2023
Memory-Consistent Neural Networks for Imitation Learning
Kaustubh Sridhar
Souradeep Dutta
Dinesh Jayaraman
James Weimer
Insup Lee
36
8
0
09 Oct 2023
FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing
Yuren Cong
Mengmeng Xu
Christian Simon
Shoufa Chen
Jiawei Ren
Yanping Xie
Juan-Manuel Perez-Rua
Bodo Rosenhahn
Tao Xiang
Sen He
DiffM
VGen
22
74
0
09 Oct 2023
KV Inversion: KV Embeddings Learning for Text-Conditioned Real Image Action Editing
Jiarui Yao
Yifan Liu
Simon S. Du
Shifeng Chen
DiffM
16
24
0
28 Sep 2023
Jointly Training Large Autoregressive Multimodal Models
Emanuele Aiello
L. Yu
Yixin Nie
Armen Aghajanyan
Barlas Oğuz
11
29
0
27 Sep 2023
Diffusion-based Holistic Texture Rectification and Synthesis
Guoqing Hao
S. Iizuka
Kensho Hara
E. Simo-Serra
Hirokatsu Kataoka
Kazuhiro Fukui
DiffM
10
5
0
26 Sep 2023
Neural Image Compression Using Masked Sparse Visual Representation
Wei Jiang
Wei Wang
Yuewei Chen
13
7
0
20 Sep 2023
CoNeS: Conditional neural fields with shift modulation for multi-sequence MRI translation
Yunjie Chen
Marius Staring
O. M. Neve
Stephan R. Romeijn
Erik F. Hensen
Berit M. Verbist
J. Wolterink
Qian Tao
DiffM
MedIm
16
3
0
06 Sep 2023
VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation
Xin Li
Wenqing Chu
Ye Wu
Weihang Yuan
Fanglong Liu
Qi Zhang
Fu Li
Haocheng Feng
Errui Ding
Jingdong Wang
VGen
40
51
0
01 Sep 2023