2206.10789
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
Papers citing
"Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"
50 / 1,011 papers shown
Looking at words and points with attention: a benchmark for text-to-shape coherence
Andrea Amaduzzi
Giuseppe Lisanti
Samuele Salti
Luigi Di Stefano
149
3
0
14 Sep 2023
Masked Generative Modeling with Enhanced Sampling Scheme
Daesoo Lee
Erlend Aune
Sara Malacarne
DiffM
193
4
0
14 Sep 2023
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation
International Conference on Learning Representations (ICLR), 2023
Xingchao Liu
Xiwen Zhang
Jianzhu Ma
Jian Peng
Qiang Liu
595
312
0
12 Sep 2023
PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models
Li Chen
Mengyi Zhao
Yiheng Liu
Mingxu Ding
Yangyang Song
...
Xu Wang
Hao Yang
Jing Liu
Kang Du
Min Zheng
DiffM
166
75
0
11 Sep 2023
ITI-GEN: Inclusive Text-to-Image Generation
IEEE International Conference on Computer Vision (ICCV), 2023
Cheng Zhang
Xuanbai Chen
Siqi Chai
Chen Henry Wu
Dmitry Lagun
Thabo Beeler
Fernando de la Torre
VLM
251
79
0
11 Sep 2023
NExT-GPT: Any-to-Any Multimodal LLM
International Conference on Machine Learning (ICML), 2023
Shengqiong Wu
Hao Fei
Leigang Qu
Wei Ji
Tat-Seng Chua
MLLM
379
717
0
11 Sep 2023
Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization
International Conference on Learning Representations (ICLR), 2023
Yang Jin
Kun Xu
Kun Xu
Liwei Chen
Chao Liao
...
Xiaoqiang Lei
Chen Zhang
Wenwu Ou
Kun Gai
Yadong Mu
MLLM
VLM
240
76
0
09 Sep 2023
MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask
International Journal of Computer Vision (IJCV), 2023
Yupeng Zhou
Daquan Zhou
Zuo-Liang Zhu
Yaxing Wang
Qibin Hou
Jiashi Feng
171
13
0
08 Sep 2023
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2023
Jiapeng Zhu
Ceyuan Yang
Kecheng Zheng
Yinghao Xu
Zifan Shi
Yujun Shen
MoE
262
14
0
07 Sep 2023
Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation
Jiaxi Gu
Shicong Wang
Haoyu Zhao
Tianyi Lu
Xing Zhang
Zuxuan Wu
Songcen Xu
Wei Zhang
Yu-Gang Jiang
Hang Xu
DiffM
VGen
224
57
0
07 Sep 2023
Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
L. Yu
Bowen Shi
Ramakanth Pasunuru
Benjamin Muller
O. Yu. Golovneva
...
Yaniv Taigman
Maryam Fazel-Zarandi
Asli Celikyilmaz
Luke Zettlemoyer
Armen Aghajanyan
MLLM
277
164
0
05 Sep 2023
Breaking Barriers to Creative Expression: Co-Designing and Implementing an Accessible Text-to-Image Interface
Atieh Taheri
Mohammad Izadi
Gururaj Shriram
Negar Rostamzadeh
Shaun Kane
DiffM
168
3
0
05 Sep 2023
MAGMA: Music Aligned Generative Motion Autodecoder
Sohan Anisetty
Amit Raj
James Hays
151
0
0
03 Sep 2023
Bridge Diffusion Model: bridge non-English language-native text-to-image diffusion model with English communities
Shanyuan Liu
Dawei Leng
Yuhui Yin
DiffM
140
9
0
02 Sep 2023
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Fengxiang Bie
Jianlong Wu
Zhongzhu Zhou
Adam Ghanem
Minjia Zhang
...
Pareesa Ameneh Golnari
David A. Clifton
Yuxiong He
Dacheng Tao
Shuaiwen Leon Song
EGVM
254
56
0
02 Sep 2023
VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation
Xin Li
Wenqing Chu
Ye Wu
Weihang Yuan
Fanglong Liu
Tao Gui
Fu Li
Haocheng Feng
Errui Ding
Jingdong Wang
VGen
332
70
0
01 Sep 2023
Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images
IEEE International Conference on Computer Vision (ICCV), 2023
Cuican Yu
Guansong Lu
Yihan Zeng
Jian Sun
Xiaodan Liang
Huibin Li
Zongben Xu
Songcen Xu
Wei Zhang
Hang Xu
231
19
0
31 Aug 2023
Priority-Centric Human Motion Generation in Discrete Latent Space
IEEE International Conference on Computer Vision (ICCV), 2023
Hanyang Kong
Kehong Gong
Dongze Lian
Michael Bi Mi
Xinchao Wang
DiffM
448
69
0
28 Aug 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Lin Geng Foo
Hossein Rahmani
Jing Liu
760
47
0
27 Aug 2023
Dense Text-to-Image Generation with Attention Modulation
IEEE International Conference on Computer Vision (ICCV), 2023
Yunji Kim
Jiyoung Lee
Jin-Hwa Kim
Jung-Woo Ha
Jun-Yan Zhu
DiffM
281
181
0
24 Aug 2023
Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages
International Conference on Learning Representations (ICLR), 2023
Jinyi Hu
Yuan Yao
Chong Wang
Shanonan Wang
Yinxu Pan
...
Yankai Lin
Jiao Xue
Dahai Li
Zhiyuan Liu
Maosong Sun
MLLM
VLM
276
77
0
23 Aug 2023
StoryBench: A Multifaceted Benchmark for Continuous Story Visualization
Neural Information Processing Systems (NeurIPS), 2023
Emanuele Bugliarello
Hernan Moraldo
Ruben Villegas
Mohammad Babaeizadeh
M. Saffar
Han Zhang
D. Erhan
V. Ferrari
Pieter-Jan Kindermans
P. Voigtlaender
VGen
332
15
0
22 Aug 2023
DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment
IEEE International Conference on Computer Vision (ICCV), 2023
Xujie Zhang
Binbin Yang
Michael C. Kampffmeyer
Wenqing Zhang
Shiyue Zhang
Guansong Lu
Liang Lin
Hang Xu
Xiaodan Liang
DiffM
396
17
0
22 Aug 2023
Backdooring Textual Inversion for Concept Censorship
Yutong Wu
Jiehan Zhang
Florian Kerschbaum
Tianwei Zhang
DiffM
269
12
0
21 Aug 2023
SimDA: Simple Diffusion Adapter for Efficient Video Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Zhen Xing
Jingdong Sun
Hang-Rui Hu
Zuxuan Wu
Yu-Gang Jiang
VGen
DiffM
268
107
0
18 Aug 2023
Edit Temporal-Consistent Videos with Image Diffusion Model
Yuan-Zheng Wang
Yong Li
Xiaoya Zhang
Xin Liu
Anbo Dai
Antoni B. Chan
Zhen Cui
DiffM
260
12
0
17 Aug 2023
Likelihood-Based Text-to-Image Evaluation with Patch-Level Perceptual and Semantic Credit Assignment
Qi Chen
Chaorui Deng
Zixiong Huang
Bowen Zhang
Zhuliang Yu
Qi Wu
EGVM
205
0
0
16 Aug 2023
Painter: Teaching Auto-regressive Language Models to Draw Sketches
Reza Pourreza
Apratim Bhattacharyya
Sunny Panchal
Mingu Lee
Pulkit Madan
Roland Memisevic
174
6
0
16 Aug 2023
Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation
ACM Multimedia (ACM MM), 2023
Alexander Martin
Haitian Zheng
Jie An
Jiebo Luo
VLM
DiffM
173
1
0
14 Aug 2023
MarkovGen: Structured Prediction for Efficient Text-to-Image Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Sadeep Jayasumana
Daniel Glasner
Srikumar Ramalingam
Andreas Veit
Ayan Chakrabarti
Surinder Kumar
DiffM
284
0
0
14 Aug 2023
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
Hu Ye
Jun Zhang
Siyi Liu
Xiao Han
Wei Yang
DiffM
323
1,264
0
13 Aug 2023
White-box Membership Inference Attacks against Diffusion Models
Proceedings on Privacy Enhancing Technologies (PoPETs), 2023
Yan Pang
Tianhao Wang
Xu Kang
Mengdi Huai
Yang Zhang
AAML
DiffM
289
38
0
11 Aug 2023
LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation
ACM Multimedia (ACM MM), 2023
Leigang Qu
Shengqiong Wu
Hao Fei
Liqiang Nie
Tat-Seng Chua
LM&Ro
DiffM
MLLM
324
130
0
09 Aug 2023
Circumventing Concept Erasure Methods For Text-to-Image Generative Models
International Conference on Learning Representations (ICLR), 2023
Minh Pham
Kelly O. Marshall
Niv Cohen
Govind Mittal
Chinmay Hegde
DiffM
238
67
0
03 Aug 2023
Guiding Image Captioning Models Toward More Specific Captions
IEEE International Conference on Computer Vision (ICCV), 2023
Simon Kornblith
Lala Li
Zirui Wang
Thao Nguyen
320
21
0
31 Jul 2023
Visual Instruction Inversion: Image Editing via Visual Prompting
Thao Nguyen
Yuheng Li
Utkarsh Ojha
Yong Jae Lee
DiffM
142
31
0
26 Jul 2023
TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition
IEEE International Conference on Computer Vision (ICCV), 2023
Shilin Lu
Yanzhu Liu
A. Kong
622
185
0
24 Jul 2023
Divide & Bind Your Attention for Improved Generative Semantic Nursing
British Machine Vision Conference (BMVC), 2023
Yumeng Li
Margret Keuper
Dan Zhang
Anna Khoreva
DiffM
339
77
0
20 Jul 2023
Text2Layer: Layered Image Generation using Latent Diffusion Model
Xinyang Zhang
Wentian Zhao
Xin Lu
J. Chien
DiffM
193
27
0
19 Jul 2023
Beyond the ML Model: Applying Safety Engineering Frameworks to Text-to-Image Development
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2023
Shalaleh Rismani
Renee Shelby
A. Smart
Renelito Delos Santos
AJung Moon
Negar Rostamzadeh
228
13
0
19 Jul 2023
Complexity Matters: Rethinking the Latent Space for Generative Modeling
Neural Information Processing Systems (NeurIPS), 2023
Tianyang Hu
Fei Chen
Hong Wang
Jiawei Li
Wei Cao
Jiacheng Sun
Hao Sun
DiffM
320
17
0
17 Jul 2023
Zero-Shot Image Harmonization with Generative Model Prior
IEEE Transactions on Multimedia (IEEE TMM), 2023
Jianqi Chen
Yilan Zhang
Zhengxia Zou
Keyan Chen
Z. Shi
DiffM
302
9
0
17 Jul 2023
Can Pre-Trained Text-to-Image Models Generate Visual Goals for Reinforcement Learning?
Neural Information Processing Systems (NeurIPS), 2023
Jialu Gao
Kaizhe Hu
Guowei Xu
Huazhe Xu
LM&Ro
197
21
0
15 Jul 2023
HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models
Computer Vision and Pattern Recognition (CVPR), 2023
Nataniel Ruiz
Yuanzhen Li
Varun Jampani
Wei Wei
Tingbo Hou
Yael Pritch
Neal Wadhwa
Michael Rubinstein
Kfir Aberman
DiffM
212
228
0
13 Jul 2023
Emu: Generative Pretraining in Multimodality
International Conference on Learning Representations (ICLR), 2023
Quan-Sen Sun
Qiying Yu
Yufeng Cui
Fan Zhang
Xiaosong Zhang
Yueze Wang
Hongcheng Gao
Jingjing Liu
Tiejun Huang
Xinlong Wang
MLLM
359
155
0
11 Jul 2023
Diffusion idea exploration for art generation
N. Verma
DiffM
225
1
0
11 Jul 2023
Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback
Neural Information Processing Systems (NeurIPS), 2023
Jaskirat Singh
Liang Zheng
299
37
0
10 Jul 2023
DIFF-NST: Diffusion Interleaving For deFormable Neural Style Transfer
Dan Ruta
Gemma Canet Tarrés
Andrew Gilbert
Eli Shechtman
Nicholas I. Kolkin
John Collomosse
DiffM
348
8
0
09 Jul 2023
Text-Guided Synthesis of Eulerian Cinemagraphs
ACM Transactions on Graphics (TOG), 2023
Aniruddha Mahapatra
Aliaksandr Siarohin
Hsin-Ying Lee
Sergey Tulyakov
Sitong Su
DiffM
VGen
206
24
0
06 Jul 2023
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
International Conference on Learning Representations (ICLR), 2023
Dustin Podell
Zion English
Kyle Lacey
A. Blattmann
Tim Dockhorn
Jonas Muller
Joe Penna
Robin Rombach
1.7K
3,833
0
04 Jul 2023