2206.10789
Cited By
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"
50 / 865 papers shown
Title
Vitruvio: 3D Building Meshes via Single Perspective Sketches
Alberto Tono
Heyaojing Huang
Ashwin Agrawal
Martin Fischer
13
5
0
24 Oct 2022
Instance-Aware Image Completion
Ji-Ho Cho
Minguk Kang
Vibhav Vineet
Jaesik Park
ISeg
VLM
15
2
0
22 Oct 2022
SpaBERT: A Pretrained Language Model from Geographic Data for Geo-Entity Representation
Zekun Li
Jina Kim
Yao-Yi Chiang
Muhao Chen
73
28
0
21 Oct 2022
3DALL-E: Integrating Text-to-Image AI in 3D Design Workflows
Vivian Liu
Jo Vermeulen
G. Fitzmaurice
Justin Matejka
HAI
25
116
0
20 Oct 2022
Composing Ensembles of Pre-trained Models via Iterative Consensus
Shuang Li
Yilun Du
J. Tenenbaum
Antonio Torralba
Igor Mordatch
MoMe
19
23
0
20 Oct 2022
Transcending Scaling Laws with 0.1% Extra Compute
Yi Tay
Jason W. Wei
Hyung Won Chung
Vinh Q. Tran
David R. So
...
Donald Metzler
Slav Petrov
N. Houlsby
Quoc V. Le
Mostafa Dehghani
LRM
24
67
0
20 Oct 2022
OCR-VQGAN: Taming Text-within-Image Generation
Juan A. Rodriguez
David Vazquez
I. Laradji
M. Pedersoli
Pau Rodríguez López
19
18
0
19 Oct 2022
Optimizing Hierarchical Image VAEs for Sample Quality
Eric Luhman
Troy Luhman
DRL
19
4
0
18 Oct 2022
Large-scale Text-to-Image Generation Models for Visual Artists' Creative Works
Hyung-Kwon Ko
Gwanmo Park
Hyeon Jeon
Jaemin Jo
Juho Kim
Jinwook Seo
16
138
0
16 Oct 2022
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM
MLLM
CLIP
8
3,233
0
16 Oct 2022
DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models
Zeyang Sha
Zheng Li
Ning Yu
Yang Zhang
DiffM
6
113
0
13 Oct 2022
Underspecification in Scene Description-to-Depiction Tasks
Ben Hutchinson
Jason Baldridge
Vinodkumar Prabhakaran
DiffM
66
32
0
11 Oct 2022
Markup-to-Image Diffusion Models with Scheduled Sampling
Yuntian Deng
Noriyuki Kojima
Alexander M. Rush
DiffM
29
4
0
11 Oct 2022
Can Artificial Intelligence Reconstruct Ancient Mosaics?
Fernando Moral-Andrés
Elena Merino-Gómez
Pedro Reviriego
Fabrizio Lombardi
14
6
0
07 Oct 2022
On Distillation of Guided Diffusion Models
Chenlin Meng
Robin Rombach
Ruiqi Gao
Diederik P. Kingma
Stefano Ermon
Jonathan Ho
Tim Salimans
VLM
DiffM
14
490
0
06 Oct 2022
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning
Aishwarya Kamath
Peter Anderson
Su Wang
Jing Yu Koh
Alexander Ku
Austin Waters
Yinfei Yang
Jason Baldridge
Zarana Parekh
LM&Ro
15
45
0
06 Oct 2022
Phenaki: Variable Length Video Generation From Open Domain Textual Description
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
DiffM
VGen
19
370
0
05 Oct 2022
Imagen Video: High Definition Video Generation with Diffusion Models
Jonathan Ho
William Chan
Chitwan Saharia
Jay Whang
Ruiqi Gao
...
Diederik P. Kingma
Ben Poole
Mohammad Norouzi
David J. Fleet
Tim Salimans
VGen
15
1,467
0
05 Oct 2022
Progressive Text-to-Image Generation
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
70
4
0
05 Oct 2022
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn
Yuan Hao
José Lezama
Luisa F. Polanía
Huiwen Chang
Han Zhang
Irfan Essa
Lu Jiang
VPVLM
VLM
51
81
0
03 Oct 2022
Membership Inference Attacks Against Text-to-image Generation Models
Yixin Wu
Ning Yu
Zheng Li
Michael Backes
Yang Zhang
DiffM
6
65
0
03 Oct 2022
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
17
288
0
30 Sep 2022
Understanding Pure CLIP Guidance for Voxel Grid NeRF Models
Han-Hung Lee
Angel X. Chang
14
63
0
30 Sep 2022
DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole
Ajay Jain
Jonathan T. Barron
B. Mildenhall
47
2,302
0
29 Sep 2022
Make-A-Video: Text-to-Video Generation without Text-Video Data
Uriel Singer
Adam Polyak
Thomas Hayes
Xiaoyue Yin
Jie An
...
Oron Ashual
Oran Gafni
Devi Parikh
Sonal Gupta
Yaniv Taigman
DiffM
VGen
22
1,339
0
29 Sep 2022
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Wenhu Chen
Hexiang Hu
Chitwan Saharia
William W. Cohen
VLM
114
161
0
29 Sep 2022
Learning to Learn with Generative Models of Neural Network Checkpoints
William S. Peebles
Ilija Radosavovic
Tim Brooks
Alexei A. Efros
Jitendra Malik
UQCV
73
64
0
26 Sep 2022
All are Worth Words: A ViT Backbone for Diffusion Models
Fan Bao
Shen Nie
Kaiwen Xue
Yue Cao
Chongxuan Li
Hang Su
Jun Zhu
VLM
11
312
0
25 Sep 2022
Extremely Simple Activation Shaping for Out-of-Distribution Detection
Andrija Djurisic
Nebojsa Bozanic
Arjun Ashok
Rosanne Liu
OODD
158
148
0
20 Sep 2022
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis
Lukas Struppek
Dominik Hintersdorf
Felix Friedrich
Manuel Brack
P. Schramowski
Kristian Kersting
66
26
0
19 Sep 2022
Does CLIP Know My Face?
Dominik Hintersdorf
Lukas Struppek
Manuel Brack
Felix Friedrich
P. Schramowski
Kristian Kersting
VLM
13
9
0
15 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
13
561
0
07 Sep 2022
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz
Yuanzhen Li
Varun Jampani
Yael Pritch
Michael Rubinstein
Kfir Aberman
14
2,688
0
25 Aug 2022
Text to Image Generation: Leaving no Language Behind
Pedro Reviriego
Elena Merino-Gómez
VLM
8
13
0
19 Aug 2022
Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines
Patrick Flynn
T. Vanderbruggen
C. Liao
Pei-Hung Lin
M. Emani
Xipeng Shen
11
4
0
11 Aug 2022
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP
Thao Nguyen
Gabriel Ilharco
Mitchell Wortsman
Sewoong Oh
Ludwig Schmidt
CLIP
VLM
27
97
0
10 Aug 2022
Adversarial Attacks on Image Generation With Made-Up Words
Raphael Milliere
23
38
0
04 Aug 2022
DALLE-URBAN: Capturing the urban design expertise of large text to image transformers
Sachith Seneviratne
Damith A. Senanayake
Sanka Rasnayaka
Rajith Vidanaarachchi
Jason Thompson
ViT
8
17
0
03 Aug 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
17
1,680
0
02 Aug 2022
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Rinon Gal
Yuval Alaluf
Y. Atzmon
Or Patashnik
Amit H. Bermano
Gal Chechik
Daniel Cohen-Or
34
1,777
0
02 Aug 2022
Lighting (In)consistency of Paint by Text
Hany Farid
6
31
0
27 Jul 2022
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models
Robin Rombach
A. Blattmann
Bjorn Ommer
DiffM
14
68
0
26 Jul 2022
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
Chenfei Wu
Jian Liang
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Lijuan Wang
Zicheng Liu
Yuejian Fang
Nan Duan
VGen
10
72
0
20 Jul 2022
Perspective (In)consistency of Paint by Text
Hany Farid
DiffM
19
36
0
27 Jun 2022
Worldwide AI Ethics: a review of 200 guidelines and recommendations for AI governance
N. Corrêa
Camila Galvão
J. Santos
C. Pino
Edson Pontes Pinto
...
Diogo Massmann
Rodrigo Mambrini
Luiza Galvao
Edmund Terem
Nythamar Fernandes de Oliveira
24
83
0
23 Jun 2022
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
Jiasen Lu
Christopher Clark
Rowan Zellers
Roozbeh Mottaghi
Aniruddha Kembhavi
ObjD
VLM
MLLM
45
391
0
17 Jun 2022
Write and Paint: Generative Vision-Language Models are Unified Modal Learners
Shizhe Diao
Wangchunshu Zhou
Xinsong Zhang
Jiawei Wang
MLLM
AI4CE
14
15
0
15 Jun 2022
Blended Latent Diffusion
Omri Avrahami
Ohad Fried
Dani Lischinski
DiffM
50
374
0
06 Jun 2022
Parallel Synthesis for Autoregressive Speech Generation
Po-Chun Hsu
Da-Rong Liu
Andy T. Liu
Hung-yi Lee
26
5
0
25 Apr 2022
Opal: Multimodal Image Generation for News Illustration
Vivian Liu
Han Qiao
Lydia B. Chilton
11
98
0
19 Apr 2022