Communities
Connect sessions
AI calendar
Organizations
Join Slack
Contact Sales
Search
Open menu
Home
Papers
2206.10789
Cited By
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
Re-assign community
ArXiv (abs)
PDF
HTML
HuggingFace (4 upvotes)
Papers citing
"Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"
50 / 1,010 papers shown
Exploring the Potentials and Challenges of Deep Generative Models in Product Design Conception
Phillip Mueller
Lars Mikelsons
AI4CE
390
5
0
15 Jul 2024
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A Survey
Chenyu Zhang
Mingwang Hu
Wenhui Li
Lanjun Wang
213
45
0
10 Jul 2024
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis
Wanggui He
Siming Fu
Mushui Liu
Xierui Wang
Wenyi Xiao
...
Zhelun Yu
Haoyuan Li
Ziwei Huang
Yaoyao Yu
Hao Jiang
DiffM
238
42
0
10 Jul 2024
ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction
Shaozhe Hao
Kai Han
Zhengyao Lv
Shihao Zhao
Kwan-Yee K. Wong
DiffM
CoGe
341
13
0
09 Jul 2024
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
Guian Fang
Wenbiao Yan
Yuanfan Guo
J. N. Han
Zutao Jiang
Hang Xu
Shengcai Liao
Xiaodan Liang
207
12
0
09 Jul 2024
Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning
Fanyue Wei
Wei Zeng
Zhenyang Li
Dawei Yin
Lixin Duan
Wen Li
EGVM
245
10
0
09 Jul 2024
An Improved Method for Personalizing Diffusion Models
Yan Zeng
Masanori Suganuma
Takayuki Okatani
DiffM
217
1
0
07 Jul 2024
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents
Yilun Xu
Gabriele Corso
Tommi Jaakkola
Arash Vahdat
Karsten Kreis
318
22
0
03 Jul 2024
Meta 3D Gen
Raphael Bensadoun
Tom Monnier
Yanir Kleiman
Filippos Kokkinos
Yawar Siddiqui
...
Antoine Toisoul
David Novotny
Oran Gafni
Natalia Neverova
Andrea Vedaldi
242
4
0
02 Jul 2024
Magic Insert: Style-Aware Drag-and-Drop
Nataniel Ruiz
Yuanzhen Li
Neal Wadhwa
Yael Pritch
Michael Rubinstein
David E. Jacobs
Shlomi Fruchter
DiffM
285
16
0
02 Jul 2024
Meta 3D TextureGen: Fast and Consistent Texture Generation for 3D Objects
Raphael Bensadoun
Yanir Kleiman
Idan Azuri
Omri Harosh
Andrea Vedaldi
Natalia Neverova
Oran Gafni
282
44
0
02 Jul 2024
SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules
Suyi Li
Lingyun Yang
Xiaoxiao Jiang
Hanfeng Lu
Zhipeng Di
...
Tao Lan
Guodong Yang
Lin Qu
Liping Zhang
Wei Wang
167
6
0
02 Jul 2024
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
Seyedmorteza Sadat
Manuel Kansy
Otmar Hilliges
Romann M. Weber
374
37
0
02 Jul 2024
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis
Dewei Zhou
Yuchen Ren
Fan Ma
Zongxin Yang
Yue Yang
411
28
0
02 Jul 2024
Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space
Core Francisco Park
Maya Okawa
Andrew Lee
Ekdeep Singh Lubana
Hidenori Tanaka
343
28
0
27 Jun 2024
On Discrete Prompt Optimization for Diffusion Models
Ruochen Wang
Ting Liu
Cho-Jui Hsieh
Boqing Gong
DiffM
269
20
0
27 Jun 2024
MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data
William Berman
A. Peysakhovich
280
5
0
26 Jun 2024
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Shenghai Yuan
Jinfa Huang
Yongqi Xu
Yaoyang Liu
Shaofeng Zhang
Yujun Shi
Ruijie Zhu
Xinhua Cheng
Jiebo Luo
Li Yuan
EGVM
VGen
368
53
0
26 Jun 2024
Aligning Diffusion Models with Noise-Conditioned Perception
Alexander Gambashidze
Anton Kulikov
Yuriy Sosnin
Ilya Makarov
325
10
0
25 Jun 2024
Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation
Katherine M. Collins
Najoung Kim
Yonatan Bitton
Verena Rieser
Shayegan Omidshafiei
...
Gang Li
Adrian Weller
Junfeng He
Deepak Ramachandran
Krishnamurthy Dvijotham
EGVM
188
3
0
24 Jun 2024
EVALALIGN: Supervised Fine-Tuning Multimodal LLMs with Human-Aligned Data for Evaluating Text-to-Image Models
Zhiyu Tan
Xiaomeng Yang
Luozheng Qin
Mengping Yang
Cheng Zhang
Hao Li
385
13
0
24 Jun 2024
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
Yuang Peng
Yuxin Cui
Haomiao Tang
Zekun Qi
Runpei Dong
Jing Bai
Chunrui Han
Zheng Ge
Xiangyu Zhang
Shu-Tao Xia
EGVM
459
90
0
24 Jun 2024
Reading Is Believing: Revisiting Language Bottleneck Models for Image Classification
Honori Udo
Takafumi Koshinaka
VLM
184
0
0
22 Jun 2024
MetaGreen: Meta-Learning Inspired Transformer Selection for Green Semantic Communication
Shubhabrata Mukherjee
Cory Beard
Sejun Song
153
0
0
22 Jun 2024
Evaluating Numerical Reasoning in Text-to-Image Models
Ivana Kajić
Olivia Wiles
Isabela Albuquerque
Matthias Bauer
Su Wang
Jordi Pont-Tuset
Aida Nematzadeh
EGVM
ReLM
457
8
0
20 Jun 2024
GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation
Baiqi Li
Zhiqiu Lin
Deepak Pathak
Jiayao Li
Yixin Fei
...
Tiffany Ling
Xide Xia
Pengchuan Zhang
Graham Neubig
Deva Ramanan
EGVM
362
70
0
19 Jun 2024
Leveraging Large Language Models for Patient Engagement: The Power of Conversational AI in Digital Health
International Conference on Digital Health (ICDH), 2024
Bo Wen
R. Norel
Julia Liu
Thaddeus Stappenbeck
F. Zulkernine
Huamin Chen
AI4MH
LM&MA
203
17
0
19 Jun 2024
ARTIST: Improving the Generation of Text-rich Images by Disentanglement
Jianyi Zhang
Jiuxiang Gu
Jiuxiang Gu
Curtis Wigington
Tong Yu
Yiran Chen
Tong Sun
Ruiyi Zhang
255
1
0
17 Jun 2024
Large Scale Transfer Learning for Tabular Data via Language Modeling
Josh Gardner
Juan C. Perdomo
Ludwig Schmidt
LMTD
277
49
0
17 Jun 2024
Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models
Alireza Ganjdanesh
Reza Shirkavand
Shangqian Gao
Heng Huang
DiffM
VLM
441
10
0
17 Jun 2024
A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing
Ming Meng
Yufei Zhao
Bo Zhang
Yonggui Zhu
Weimin Shi
Maxwell Wen
Zhaoxin Fan
VGen
350
6
0
15 Jun 2024
Composing Parts for Expressive Object Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Harsh Rangwani
Aishwarya Agarwal
Kuldeep Kulkarni
R. Venkatesh Babu
Srikrishna Karanam
DiffM
369
2
0
14 Jun 2024
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
Roman Bachmann
Oğuzhan Fatih Kar
David Mizrahi
Ali Garjani
Mingfei Gao
David Griffiths
Jiaming Hu
Afshin Dehghan
Amir Zamir
MoE
VLM
MLLM
271
33
0
13 Jun 2024
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Junke Wang
Yi Jiang
Zehuan Yuan
Binyue Peng
Zuxuan Wu
Yu-Gang Jiang
ViT
VGen
307
81
0
13 Jun 2024
Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation
Jiuxiang Gu
Ruiyi Zhang
Kaizhi Zheng
Nanxuan Zhao
Jiuxiang Gu
Zichao Wang
Xin Eric Wang
Tong Sun
DiffM
198
4
0
13 Jun 2024
TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation
Weixi Feng
Jiachen Li
Michael Stephen Saxon
Tsu-Jui Fu
Wenhu Chen
William Yang Wang
EGVM
VGen
238
30
0
12 Jun 2024
FakeInversion: Learning to Detect Images from Unseen Text-to-Image Models by Inverting Stable Diffusion
George Cazenavette
Avneesh Sud
Thomas Leung
Ben Usman
DiffM
268
52
0
12 Jun 2024
What If We Recaption Billions of Web Images with LLaMA-3?
Xianhang Li
Haoqin Tu
Mude Hui
Zeyu Wang
Bingchen Zhao
...
Jieru Mei
Qing Liu
Huangjie Zheng
Yuyin Zhou
Cihang Xie
VLM
MLLM
289
67
0
12 Jun 2024
PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences
Daiwei Chen
Yi Chen
Aniket Rege
Ramya Korlakai Vinayak
294
36
0
12 Jun 2024
An Image is Worth 32 Tokens for Reconstruction and Generation
Qihang Yu
Mark Weber
XueQing Deng
Xiaohui Shen
Daniel Cremers
Liang-Chieh Chen
VLM
ViT
391
187
0
11 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
543
540
0
10 Jun 2024
Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization
Yi Gu
Zhendong Wang
Yueqin Yin
Yujia Xie
Mingyuan Zhou
240
30
0
10 Jun 2024
OmniControlNet: Dual-stage Integration for Conditional Image Generation
Yilin Wang
Haiyang Xu
Xiang Zhang
Zeyuan Chen
Zhizhou Sha
Zirui Wang
Zhuowen Tu
VLM
292
22
0
09 Jun 2024
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2024
Zanlin Ni
Yulin Wang
Renping Zhou
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Shiji Song
Yuan Yao
Gao Huang
284
28
0
08 Jun 2024
AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation
Neural Information Processing Systems (NeurIPS), 2024
Lianyu Pang
Jian Yin
Baoquan Zhao
Feize Wu
Fu Lee Wang
Qing Li
Xudong Mao
DiffM
295
7
0
07 Jun 2024
MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2024
Sanjoy Chowdhury
Sayan Nag
K. J. Joseph
Balaji Vasan Srinivasan
Dinesh Manocha
DiffM
234
17
0
07 Jun 2024
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
Neural Information Processing Systems (NeurIPS), 2024
Yang Sui
Yanyu Li
Vidit Goel
Yerlan Idelbayev
Junli Cao
Ju Hu
Dhritiman Sagar
Bo Yuan
Sergey Tulyakov
Jian Ren
MQ
269
38
0
06 Jun 2024
ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization
Neural Information Processing Systems (NeurIPS), 2024
L. Eyring
Shyamgopal Karthik
Karsten Roth
Alexey Dosovitskiy
Zeynep Akata
318
67
0
06 Jun 2024
Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis
Marianna Ohanyan
Hayk Manukyan
Zinan Lin
Shant Navasardyan
Humphrey Shi
DiffM
238
10
0
06 Jun 2024
A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
Zicheng Zhang
H. Wu
Chunyi Li
Yingjie Zhou
Wei Sun
Xiongkuo Min
Zijian Chen
Xiaohong Liu
Weisi Lin
Guangtao Zhai
EGVM
387
38
0
05 Jun 2024
Previous
1
2
3
...
7
8
9
...
19
20
21
Next