2206.10789
Cited By
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
Papers citing
"Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"
50 / 1,010 papers shown
Character-Aware Models Improve Visual Text Rendering
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Rosanne Liu
Daniel H Garrette
Chitwan Saharia
William Chan
Adam Roberts
Sharan Narang
Irina Blok
R. Mical
Mohammad Norouzi
Noah Constant
VLM
262
87
0
20 Dec 2022
Benchmarking Spatial Relationships in Text-to-Image Generation
Tejas Gokhale
Hamid Palangi
Besmira Nushi
Vibhav Vineet
Eric Horvitz
Ece Kamar
Chitta Baral
Yezhou Yang
EGVM
361
86
0
20 Dec 2022
Scalable Diffusion Models with Transformers
IEEE International Conference on Computer Vision (ICCV), 2022
William S. Peebles
Saining Xie
GNN
2.3K
4,295
0
19 Dec 2022
Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Alex Nichol
Heewoo Jun
Prafulla Dhariwal
Pamela Mishkin
Mark Chen
DiffM
353
769
0
16 Dec 2022
CLIPPO: Image-and-Language Understanding from Pixels Only
Computer Vision and Pattern Recognition (CVPR), 2022
Michael Tschannen
Basil Mustafa
N. Houlsby
CLIP
VLM
340
72
0
15 Dec 2022
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
Computer Vision and Pattern Recognition (CVPR), 2022
Su Wang
Chitwan Saharia
Ceslee Montgomery
Jordi Pont-Tuset
Shai Noy
...
Radu Soricut
Jason Baldridge
Mohammad Norouzi
Peter Anderson
William Chan
227
252
0
13 Dec 2022
Elixir: Train a Large Language Model on a Small GPU Cluster
Haichen Huang
Jiarui Fang
Hongxin Liu
Shenggui Li
Yang You
VLM
250
10
0
10 Dec 2022
MAGVIT: Masked Generative Video Transformer
Computer Vision and Pattern Recognition (CVPR), 2022
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
294
333
0
10 Dec 2022
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
International Conference on Learning Representations (ICLR), 2022
Weixi Feng
Xuehai He
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
P. Narayana
Sugato Basu
Xinze Wang
William Yang Wang
CoGe
587
382
0
09 Dec 2022
Multi-Concept Customization of Text-to-Image Diffusion
Computer Vision and Pattern Recognition (CVPR), 2022
Nupur Kumari
Bin Zhang
Richard Y. Zhang
Eli Shechtman
Jun-Yan Zhu
731
1,168
0
08 Dec 2022
Diffusion Guided Domain Adaptation of Image Generators
Kunpeng Song
Ligong Han
Bingchen Liu
Dimitris N. Metaxas
Ahmed Elgammal
DiffM
264
40
0
08 Dec 2022
SINE: SINgle Image Editing with Text-to-Image Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Zhixing Zhang
Ligong Han
Arna Ghosh
Dimitris N. Metaxas
Jian Ren
DiffM
458
180
0
08 Dec 2022
X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion
International Conference on Machine Learning (ICML), 2022
Hanqing Zhao
Dianmo Sheng
Jianmin Bao
Dongdong Chen
Dong Chen
...
Ce Liu
Wenbo Zhou
Qi Chu
Weiming Zhang
Neng H. Yu
VLM
DiffM
234
59
0
07 Dec 2022
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2022
Yuchao Gu
Xintao Wang
Yixiao Ge
Ying Shan
Xiaohu Qie
Mike Zheng Shou
DiffM
219
30
0
06 Dec 2022
Image Inpainting via Iteratively Decoupled Probabilistic Modeling
International Conference on Learning Representations (ICLR), 2022
Wenbo Li
Xin Yu
Kun Zhou
Yibing Song
Zhe Lin
Jiaya Jia
DiffM
201
16
0
06 Dec 2022
M-VADER: A Model for Diffusion with Multimodal Context
Samuel Weinbach
Marco Bellagente
C. Eichenberg
Andrew M. Dai
R. Baldock
Souradeep Nanda
Bjorn Deiseroth
Koen Oostermeijer
H. Teufel
Andres Felipe Cruz Salinas
DiffM
337
11
0
06 Dec 2022
3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models
Gimin Nam
Mariem Khlifi
Andrew Rodriguez
Alberto Tono
Linqi Zhou
Paul Guerrero
DiffM
264
79
0
01 Dec 2022
CLIPascene: Scene Sketching with Different Types and Levels of Abstraction
IEEE International Conference on Computer Vision (ICCV), 2022
Yael Vinker
Yuval Alaluf
Daniel Cohen-Or
Ariel Shamir
CLIP
248
77
0
30 Nov 2022
Fast Inference from Transformers via Speculative Decoding
International Conference on Machine Learning (ICML), 2022
Yaniv Leviathan
Matan Kalman
Yossi Matias
LRM
634
1,151
0
30 Nov 2022
High-Fidelity Guided Image Synthesis with Latent Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Jaskirat Singh
Stephen Gould
Liang Zheng
DiffM
195
51
0
30 Nov 2022
Continuous diffusion for categorical data
Sander Dieleman
Laurent Sartran
Arman Roshannai
Nikolay Savinov
Yaroslav Ganin
...
Conor Durkan
Curtis Hawthorne
Rémi Leblond
Will Grathwohl
J. Adler
DiffM
334
144
0
28 Nov 2022
Unified Discrete Diffusion for Simultaneous Vision-Language Generation
International Conference on Learning Representations (ICLR), 2022
Minghui Hu
Chuanxia Zheng
Heliang Zheng
Tat-Jen Cham
Chaoyue Wang
Zuopeng Yang
Dacheng Tao
Ponnuthurai Nagaratnam Suganthan
DiffM
273
36
0
27 Nov 2022
SpaText: Spatio-Textual Representation for Controllable Image Generation
Computer Vision and Pattern Recognition (CVPR), 2022
Omri Avrahami
Thomas Hayes
Oran Gafni
Sonal Gupta
Yaniv Taigman
Devi Parikh
Dani Lischinski
Ohad Fried
Xiaoyue Yin
DiffM
278
247
0
25 Nov 2022
3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models
Gang Li
Heliang Zheng
Chaoyue Wang
Chang Li
C. Zheng
Dacheng Tao
DiffM
517
63
0
25 Nov 2022
Shifted Diffusion for Text-to-image Generation
Computer Vision and Pattern Recognition (CVPR), 2022
Jiuxiang Gu
Bingchen Liu
Yizhe Zhu
Xiao Yang
Changyou Chen
Jinhui Xu
DiffM
295
57
0
24 Nov 2022
Paint by Example: Exemplar-based Image Editing with Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Binxin Yang
Shuyang Gu
Bo Zhang
Ting Zhang
Xuejin Chen
Xiaoyan Sun
Dong Chen
Fang Wen
DiffM
297
546
0
23 Nov 2022
Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors
R. Burgert
Kanchana Ranasinghe
Xiang Li
Michael S. Ryoo
DiffM
VLM
290
41
0
23 Nov 2022
ReCo: Region-Controlled Text-to-Image Generation
Computer Vision and Pattern Recognition (CVPR), 2022
Zhengyuan Yang
Jianfeng Wang
Zhe Gan
Linjie Li
Kevin Qinghong Lin
...
Nan Duan
Zicheng Liu
Ce Liu
Michael Zeng
Lijuan Wang
DiffM
274
188
0
23 Nov 2022
Inversion-Based Style Transfer with Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Yuxin Zhang
Nisha Huang
Fan Tang
Haibin Huang
Chongyang Ma
Weiming Dong
Changsheng Xu
DiffM
311
394
0
23 Nov 2022
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
Computer Vision and Pattern Recognition (CVPR), 2022
Tsu-Jui Fu
Licheng Yu
Ning Zhang
Cheng-Yang Fu
Jong-Chyi Su
William Yang Wang
Sean Bell
VGen
412
40
0
23 Nov 2022
Retrieval-Augmented Multimodal Language Modeling
International Conference on Machine Learning (ICML), 2023
Michihiro Yasunaga
Armen Aghajanyan
Weijia Shi
Rich James
J. Leskovec
Abigail Z. Jacobs
M. Lewis
Luke Zettlemoyer
Anuj Kumar
RALM
252
132
0
22 Nov 2022
Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark
Vitali Petsiuk
Alexander E. Siemenn
Saisamrit Surbehera
Zad Chin
Keith Tyser
...
Ori Kerret
Tonio Buonassisi
Kate Saenko
Armando Solar-Lezama
Iddo Drori
VLM
117
45
0
22 Nov 2022
SceneComposer: Any-Level Semantic Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2022
Yu Zeng
Zhe Lin
Jianming Zhang
Qing Liu
John Collomosse
Jason Kuen
Vishal M. Patel
DiffM
149
53
0
21 Nov 2022
VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Ajay Jain
Amber Xie
Pieter Abbeel
DiffM
214
118
0
21 Nov 2022
MagicVideo: Efficient Video Generation With Latent Diffusion Models
Daquan Zhou
Weimin Wang
Hanshu Yan
Weiwei Lv
Yizhe Zhu
Jiashi Feng
DiffM
VGen
394
467
0
20 Nov 2022
Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Xichen Pan
Pengda Qin
Yuhong Li
Hui Xue
Wenhu Chen
DiffM
244
79
0
20 Nov 2022
Visual Programming: Compositional visual reasoning without training
Computer Vision and Pattern Recognition (CVPR), 2022
Tanmay Gupta
Aniruddha Kembhavi
ReLM
VLM
LRM
436
571
0
18 Nov 2022
Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models
Ninareh Mehrabi
Palash Goyal
Apurv Verma
Jwala Dhamala
Varun Kumar
Qian Hu
Kai-Wei Chang
R. Zemel
Aram Galstyan
Rahul Gupta
200
8
0
17 Nov 2022
Will Large-scale Generative Models Corrupt Future Datasets?
IEEE International Conference on Computer Vision (ICCV), 2022
Ryuichiro Hataya
Han Bao
Hiromi Arai
239
72
0
15 Nov 2022
Cross-Reality Re-Rendering: Manipulating between Digital and Physical Realities
Siddhartha Datta
241
1
0
15 Nov 2022
A Novel Sampling Scheme for Text- and Image-Conditional Image Synthesis in Quantized Latent Spaces
Dominic Rampas
Pablo Pernias
Marc Aubreville
DiffM
140
14
0
14 Nov 2022
Large-Scale Bidirectional Training for Zero-Shot Image Captioning
Taehoon Kim
Mark A Marsden
Pyunghwan Ahn
Sangyun Kim
Sihaeng Lee
Alessandra Sala
S. Kim
VLM
220
5
0
13 Nov 2022
SSGVS: Semantic Scene Graph-to-Video Synthesis
Yuren Cong
Jinhui Yi
Bodo Rosenhahn
M. Yang
242
8
0
11 Nov 2022
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis
IEEE International Conference on Computer Vision (ICCV), 2022
Lukas Struppek
Dominik Hintersdorf
Kristian Kersting
SILM
461
53
0
04 Nov 2022
Evaluating a Synthetic Image Dataset Generated with Stable Diffusion
Andreas Stöckl
212
27
0
03 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLM
MoE
618
983
0
02 Nov 2022
MagicMix: Semantic Mixing with Diffusion Models
Jun Hao Liew
Hanshu Yan
Daquan Zhou
Jiashi Feng
DiffM
369
75
0
28 Oct 2022
UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance
Wei Li
Xue Xu
Xinyan Xiao
Jiacheng Liu
Hu Yang
...
Zhanpeng Wang
Zhifan Feng
Qiaoqiao She
Yajuan Lyu
Hua Wu
504
31
0
28 Oct 2022
Deep Generative Models on 3D Representations: A Survey
Zifan Shi
Sida Peng
Yinghao Xu
Andreas Geiger
Yiyi Liao
Yujun Shen
MedIm
3DV
319
0
0
27 Oct 2022
In-context Reinforcement Learning with Algorithm Distillation
International Conference on Learning Representations (ICLR), 2022
Michael Laskin
Luyu Wang
Junhyuk Oh
Emilio Parisotto
Stephen Spencer
...
Ethan A. Brooks
Maxime Gazeau
Himanshu Sahni
Satinder Singh
Volodymyr Mnih
OffRL
230
167
0
25 Oct 2022