arXiv:2206.10789
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
Papers citing
"Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"
50 / 1,010 papers shown
Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2024
Boming Miao
Xuefei Liu
Xiaobei Wang
Andi Zhang
Rui Sun
Zizhe Wang
Yao Zhu
DiffM
486
3
0
25 Nov 2024
SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE
Computer Vision and Pattern Recognition (CVPR), 2024
Yongwei Chen
Yushi Lan
Shangchen Zhou
Tengfei Wang
Xingang Pan
652
20
0
25 Nov 2024
Visual-Word Tokenizer: Beyond Fixed Sets of Tokens in Vision Transformers
Leonidas Gee
Wing Yan Li
V. Sharmanska
Novi Quadrianto
ViT
681
0
0
23 Nov 2024
TPIE: Topology-Preserved Image Editing With Text Instructions
Nivetha Jayakumar
Srivardhan Reddy Gadila
Tonmoy Hossain
Yangfeng Ji
Miaomiao Zhang
DiffM
MedIm
442
1
0
22 Nov 2024
Text Embedding is Not All You Need: Attention Control for Text-to-Image Semantic Alignment with Text Self-Attention Maps
Computer Vision and Pattern Recognition (CVPR), 2024
Jeeyung Kim
Erfan Esmaeili
Qiang Qiu
DiffM
296
5
0
21 Nov 2024
Is this Generated Person Existed in Real-world? Fine-grained Detecting and Calibrating Abnormal Human-body
Computer Vision and Pattern Recognition (CVPR), 2024
Zeqing Wang
Qingyang Ma
Wentao Wan
Haojie Li
Keze Wang
Yonghong Tian
DiffM
246
9
0
21 Nov 2024
How to Defend Against Large-scale Model Poisoning Attacks in Federated Learning: A Vertical Solution
Jinbo Wang
Ruijin Wang
Fengli Zhang
FedML
AAML
248
0
0
16 Nov 2024
GSEditPro: 3D Gaussian Splatting Editing with Attention-based Progressive Localization
Yanhao Sun
RunZe Tian
Xiao Han
XinYao Liu
Yan Zhang
Kai Xu
3DGS
DiffM
179
6
0
15 Nov 2024
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
402
3
0
12 Nov 2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
Neural Information Processing Systems (NeurIPS), 2024
Zanlin Ni
Yulin Wang
Renping Zhou
Yizeng Han
Jiayi Guo
Zhiyuan Liu
Xingtai Lv
Gao Huang
273
10
0
11 Nov 2024
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement
Zhennan Chen
Yajie Li
Haofan Wang
Zheyu Chen
Zhengkai Jiang
Jun Yu Li
Qian Wang
Zhiqiang Wang
Ying Tai
DiffM
358
25
0
10 Nov 2024
Hardware-Friendly Diffusion Models with Fixed-Size Reusable Structures for On-Device Image Generation
Sanchar Palit
Sathya Veera Reddy Dendi
Mallikarjuna Talluri
Raj Narayana Gadde
247
0
0
09 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
494
38
0
08 Nov 2024
Clustering in Causal Attention Masking
Neural Information Processing Systems (NeurIPS), 2024
Nikita Karagodin
Yury Polyanskiy
Philippe Rigollet
318
20
0
07 Nov 2024
AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
Neural Information Processing Systems (NeurIPS), 2024
Vidit Goel
Huseyin Coskun
Jierun Chen
Junli Cao
Willi Menapace
Aliaksandr Siarohin
Sergey Tulyakov
Jian Ren
247
6
0
07 Nov 2024
DomainGallery: Few-shot Domain-driven Image Generation by Attribute-centric Finetuning
Neural Information Processing Systems (NeurIPS), 2024
Yuxuan Duan
Y. Hong
Bo Zhang
Jun Lan
Huijia Zhu
Weiqiang Wang
Jianfu Zhang
Li Niu
Guang Dai
DiffM
234
2
0
07 Nov 2024
Image Understanding Makes for A Good Tokenizer for Image Generation
Neural Information Processing Systems (NeurIPS), 2024
Luting Wang
Yang Zhao
Zijian Zhang
Jiashi Feng
Si Liu
Bingyi Kang
VLM
203
9
0
07 Nov 2024
ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Ashutosh Srivastava
Tarun Ram Menta
Abhinav Java
Avadhoot Jadhav
Silky Singh
Surgan Jandial
Balaji Krishnamurthy
DiffM
225
3
0
06 Nov 2024
SEE-DPO: Self Entropy Enhanced Direct Preference Optimization
Shivanshu Shekhar
Shreyas Singh
Tong Zhang
270
6
0
06 Nov 2024
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
Yongxin Zhu
Bing Li
Yifei Xin
Zhihua Xia
Linli Xu
525
42
0
04 Nov 2024
Randomized Autoregressive Visual Generation
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VGen
DiffM
326
84
1
01 Nov 2024
Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective
Shenghao Xie
Wenqiang Zu
Mingyang Zhao
Duo Su
Shilong Liu
Ruohua Shi
Guoqi Li
Shanghang Zhang
Lei Ma
LRM
449
11
0
29 Oct 2024
Benchmarking Human and Automated Prompting in the Segment Anything Model
BigData Congress [Services Society] (BSS), 2024
Jorge Quesada
Zoe Fowler
Mohammad Alotaibi
Mohit Prabhushankar
Ghassan AlRegib
VLM
225
4
0
29 Oct 2024
MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding
Yuan Wang
Di Huang
Yaqi Zhang
Wanli Ouyang
J. Jiao
Xuetao Feng
Yan Zhou
Pengfei Wan
Weizhen He
Dan Xu
VGen
245
37
0
29 Oct 2024
Adapting Diffusion Models for Improved Prompt Compliance and Controllable Image Synthesis
Neural Information Processing Systems (NeurIPS), 2024
Deepak Sridhar
Abhishek Peri
Rohith Rachala
Nuno Vasconcelos
DiffM
241
2
0
29 Oct 2024
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
MQ
225
4
0
28 Oct 2024
Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion
Emiel Hoogeboom
Thomas Mensink
Jonathan Heek
Kay Lamerigts
Ruiqi Gao
Tim Salimans
1.0K
38
0
25 Oct 2024
FairQueue: Rethinking Prompt Learning for Fair Text-to-Image Generation
Neural Information Processing Systems (NeurIPS), 2024
Christopher T. H. Teo
Milad Abdollahzadeh
Xinda Ma
Ngai-Man Cheung
DiffM
281
3
0
24 Oct 2024
Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances
International Conference on Learning Representations (ICLR), 2024
Shilin Lu
Zihan Zhou
Jiayou Lu
Yuanzhi Zhu
A. Kong
WIGM
564
79
0
24 Oct 2024
Fast constrained sampling in pre-trained diffusion models
Alexandros Graikos
Nebojsa Jojic
Dimitris Samaras
DiffM
398
2
0
24 Oct 2024
How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?
Neural Information Processing Systems (NeurIPS), 2024
Jiahua Dong
Wenqi Liang
Hongliu Li
Duzhen Zhang
Meng Cao
Henghui Ding
Salman Khan
Fahad Shahbaz Khan
DiffM
232
27
0
23 Oct 2024
DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization
Neural Information Processing Systems (NeurIPS), 2024
Haowei Zhu
Dehua Tang
Ji Liu
Mingjie Lu
Jintu Zheng
...
Spandan Tiwari
Ashish Sirasao
Jun-Hai Yong
Bin Wang
E. Barsoum
DiffM
168
24
0
22 Oct 2024
Elucidating the design space of language models for image generation
Xuantong Liu
Shaozhe Hao
Xianbiao Qi
Tianyang Hu
Jun Wang
Rong Xiao
Xingtai Lv
VLM
199
6
0
21 Oct 2024
Opportunities and Challenges of Generative-AI in Finance
BigData Congress [Services Society] (BSS), 2024
Akshar Prabhu Desai
Ganesh Satish Mallya
Mohammad Luqman
Tejasvi Ravi
Nithya Kota
Pranjul Yadav
AIFin
445
11
0
21 Oct 2024
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities
International Conference on Learning Representations (ICLR), 2024
Shaozhe Hao
Xuantong Liu
Xianbiao Qi
Shihao Zhao
Bojia Zi
Rong Xiao
Kai Han
Kwan-Yee K. Wong
546
4
0
18 Oct 2024
Assessing Open-world Forgetting in Generative Image Model Customization
Héctor Laria
Alex Gomez-Villa
Imad Eddine Marouf
Bogdan Raducanu
VLM
DiffM
301
1
0
18 Oct 2024
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
International Conference on Learning Representations (ICLR), 2024
Lijie Fan
Tianhong Li
Siyang Qin
Yuanzhen Li
Chen Sun
Michael Rubinstein
Deqing Sun
Kaiming He
Yonglong Tian
VLM
DiffM
330
110
0
17 Oct 2024
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Donghao Zhou
Jiancheng Huang
J. Bai
Jiaze Wang
Hao Chen
Guangyong Chen
Xiaowei Hu
Ge Liu
439
12
0
17 Oct 2024
Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement
Neural Information Processing Systems (NeurIPS), 2024
Zhi Wang
Li Zhang
Wenhao Wu
Yuanheng Zhu
Dongbin Zhao
C. L. Philip Chen
OffRL
261
16
0
15 Oct 2024
Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling
Wenze Liu
Le Zhuo
Yi Xin
Sheng Xia
Peng Gao
Xiangyu Yue
226
16
0
14 Oct 2024
Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective
International Conference on Learning Representations (ICLR), 2024
Xiangru Zhu
Yixiang Chen
Yaoxian Song
Yanghua Xiao
Zhixu Li
Chengyu Wang
Jun Huang
Bei Yang
Xiaoxiao Xu
EGVM
1.1K
2
0
14 Oct 2024
Generating Intermediate Representations for Compositional Text-To-Image Generation
Ran Galun
Sagie Benaim
217
0
0
13 Oct 2024
Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment
International Conference on Learning Representations (ICLR), 2024
Huayu Chen
Hang Su
Peize Sun
Jun Zhu
VLM
219
10
0
12 Oct 2024
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation
Jiatao Gu
Yuyang Wang
Yizhe Zhang
Qihang Zhang
Dinghuai Zhang
Navdeep Jaitly
Josh Susskind
Shuangfei Zhai
DiffM
377
27
0
10 Oct 2024
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai
Tian-Chun Ye
Wei Chow
Enxin Song
Qing-Guo Chen
Hefei Ling
Zhen Dong
Lei Zhu
525
43
0
10 Oct 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
Neural Information Processing Systems (NeurIPS), 2024
Rui Zhao
Hangjie Yuan
Yujie Wei
Shiwei Zhang
Yuchao Gu
...
Xiang Wang
Zhangjie Wu
Junhao Zhang
Yingya Zhang
Mike Zheng Shou
DiffM
VLM
328
7
0
09 Oct 2024
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow
International Conference on Learning Representations (ICLR), 2024
Fu-Yun Wang
Ling Yang
Zhaoyang Huang
Mengdi Wang
Hongsheng Li
249
46
0
09 Oct 2024
Decouple-Then-Merge: Finetune Diffusion Models as Multi-Task Learning
Computer Vision and Pattern Recognition (CVPR), 2024
Qianli Ma
Xuefei Ning
Dongrui Liu
Li Niu
Linfeng Zhang
MoMe
301
2
0
09 Oct 2024
Diversity-Rewarded CFG Distillation
International Conference on Learning Representations (ICLR), 2024
Geoffrey Cideron
A. Agostinelli
Johan Ferret
Sertan Girgin
Romuald Elie
Olivier Bachem
Sarah Perrin
Alexandre Ramé
245
5
0
08 Oct 2024
Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models
Theo Putterman
Derek Lim
Yoav Gelberg
Stefanie Jegelka
Haggai Maron
AI4CE
277
12
0
05 Oct 2024