2206.10789
Cited By
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
Papers citing
"Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"
50 / 1,010 papers shown
DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing
Zihan Zhou
Shilin Lu
Shuli Leng
Shaocong Zhang
Zhuming Lian
Xinlei Yu
A. Kong
DiffM
306
7
0
02 Oct 2025
JEPA-T: Joint-Embedding Predictive Architecture with Text Fusion for Image Generation
Siheng Wan
Zhengtao Yao
Zhengdao Li
Junhao Dong
Yanshu Li
...
Haoyan Xu
Yijiang Li
Zhikang Dong
Huacan Wang
Jifeng Shen
DiffM
104
0
0
01 Oct 2025
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance
Jiayi Guo
Chuanhao Yan
Xingqian Xu
Yulin Wang
Kai Wang
Gao Huang
Humphrey Shi
143
1
0
30 Sep 2025
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing
Keming Wu
Sicong Jiang
Max Ku
Ping Nie
Minghao Liu
Wenhu Chen
116
9
0
30 Sep 2025
EchoGen: Generating Visual Echoes in Any Scene via Feed-Forward Subject-Driven Auto-Regressive Model
Ruixiao Dong
Z. Wang
Keli Liu
Li Li
Ying Chen
Kai Li
Daowen Li
Houqiang Li
DiffM
VGen
142
0
0
30 Sep 2025
Free Lunch Alignment of Text-to-Image Diffusion Models without Preference Image Pairs
Jia Jun Cheng Xian
Muchen Li
Haotian Yang
Xin Tao
Pengfei Wan
Leonid Sigal
Renjie Liao
133
1
0
30 Sep 2025
Go with Your Gut: Scaling Confidence for Autoregressive Image Generation
Harold Haodong Chen
Xianfeng Wu
Wen-Jie Shu
Rongjin Guo
Disen Lan
Harry Yang
Ying-Cong Chen
136
1
0
30 Sep 2025
Aligning Visual Foundation Encoders to Tokenizers for Diffusion Models
Bowei Chen
Sai Bi
Hao Tan
Chentao Song
Tianyuan Zhang
Zhengqi Li
Yuanjun Xiong
Jianming Zhang
Kai Zhang
212
4
0
29 Sep 2025
STAGE: Stable and Generalizable GRPO for Autoregressive Image Generation
Xiaoxiao Ma
Haibo Qiu
Guohui Zhang
Zhixiong Zeng
Siqi Yang
Lin Ma
Feng Zhao
122
4
0
29 Sep 2025
GLASS Flows: Transition Sampling for Alignment of Flow and Diffusion Models
Peter Holderrieth
Uriel Singer
Tommi Jaakkola
Ricky T. Q. Chen
Y. Lipman
Brian Karrer
DiffM
177
1
0
29 Sep 2025
Token Painter: Training-Free Text-Guided Image Inpainting via Mask Autoregressive Models
Longtao Jiang
Mingfei Han
Lei Chen
Yongqiang Yu
Feng Zhao
Xiaojun Chang
Zhihui Li
DiffM
116
0
0
28 Sep 2025
Towards Fine-Grained Text-to-3D Quality Assessment: A Benchmark and A Two-Stage Rank-Learning Metric
Bingyang Cui
Yujie Zhang
Qi Yang
Zhu Li
Yiling Xu
234
0
0
28 Sep 2025
HiGS: History-Guided Sampling for Plug-and-Play Enhancement of Diffusion Models
Seyedmorteza Sadat
Farnood Salehi
Romann M. Weber
DiffM
164
0
0
26 Sep 2025
Pushing Toward the Simplex Vertices: A Simple Remedy for Code Collapse in Smoothed Vector Quantization
Takashi Morita
MQ
178
0
0
26 Sep 2025
SD3.5-Flash: Distribution-Guided Distillation of Generative Flows
Hmrishav Bandyopadhyay
Rahim Entezari
Jim Scott
Reshinth Adithyan
Yi-Zhe Song
Varun Jampani
313
2
0
25 Sep 2025
LayoutAgent: A Vision-Language Agent Guided Compositional Diffusion for Spatial Layout Planning
Zezhong Fan
Xiaohan Li
Luyi Ma
Kai Zhao
Liang Peng
Topojoy Biswas
Evren Körpeoglu
Kaushiki Nag
Kannan Achan
DiffM
174
0
0
24 Sep 2025
MEF: A Systematic Evaluation Framework for Text-to-Image Models
Xiaojing Dong
Weilin Huang
Liang Li
Y. Li
Shu Liu
Tongtong Ou
Shuang Ouyang
Yu Tian
Fengxuan Zhao
EGVM
158
0
0
22 Sep 2025
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer
Yanghao Li
Rui Qian
Bowen Pan
Haotian Zhang
Haoshuo Huang
...
Zhengdong Zhang
Chen Chen
Yang Zhao
Ruoming Pang
Zhifeng Chen
MLLM
205
4
0
19 Sep 2025
PolyJuice Makes It Real: Black-Box, Universal Red Teaming for Synthetic Image Detectors
Sepehr Dehdashtian
Mashrur M. Morshed
Jacob H. Seidman
Gaurav Bharaj
Vishnu Boddeti
AAML
DiffM
192
0
0
19 Sep 2025
Maestro: Self-Improving Text-to-Image Generation via Agent Orchestration
Xingchen Wan
Han Zhou
Ruoxi Sun
Hootan Nakhost
Ke Jiang
Rajarishi Sinha
Sercan Ö. Arık
246
4
0
12 Sep 2025
FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark
Rongyao Fang
Aldrich Yu
Chengqi Duan
Linjiang Huang
S. Bai
Yuxuan Cai
Kun Wang
Si Liu
Xihui Liu
Xue Yang
EGVM
VGen
ReLM
LRM
230
14
0
11 Sep 2025
Discovering Divergent Representations between Text-to-Image Models
Lisa Dunlap
Joseph E. Gonzalez
Trevor Darrell
Fabian Caba Heilbron
Josef Sivic
Bryan C. Russell
EGVM
126
0
0
10 Sep 2025
SuMa: A Subspace Mapping Approach for Robust and Effective Concept Erasure in Text-to-Image Diffusion Models
K. Nguyen
Anh Tran
Cuong Pham
124
0
0
06 Sep 2025
FICGen: Frequency-Inspired Contextual Disentanglement for Layout-driven Degraded Image Generation
Wenzhuang Wang
Yifan Zhao
Mingcan Ma
Ming-Yuan Liu
Zhonglin Jiang
Yong Chen
Jia Li
DiffM
142
1
0
01 Sep 2025
T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation
Kaiyue Sun
Rongyao Fang
Chengqi Duan
Xian Liu
Xihui Liu
167
14
0
24 Aug 2025
Single-Reference Text-to-Image Manipulation with Dual Contrastive Denoising Score
Syed Muhmmad Israr
Feng Zhao
DiffM
152
0
0
18 Aug 2025
DeCoT: Decomposing Complex Instructions for Enhanced Text-to-Image Generation with Large Language Models
Xiaochuan Lin
Xiangyong Chen
Xuan Li
Yichen Su
MLLM
VLM
149
0
0
17 Aug 2025
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale
NextStep Team
Chunrui Han
Guopeng Li
J. Wu
Quan Sun
...
Ziyang Meng
Binxing Jiao
Daxin Jiang
X. Zhang
Yibo Zhu
DiffM
201
22
0
14 Aug 2025
OneVAE: Joint Discrete and Continuous Optimization Helps Discrete Video VAE Train Better
Yupeng Zhou
Zhen Li
Ziheng Ouyang
Yuming Chen
Ruoyi Du
...
Bin Fu
Yihao Liu
Peng Gao
Ming-Ming Cheng
Qibin Hou
204
1
0
13 Aug 2025
Spatial-Temporal Multi-Scale Quantization for Flexible Motion Generation
Zan Wang
Jingze Zhang
Yixin Chen
Baoxiong Jia
Wei Liang
Siyuan Huang
MQ
159
1
0
12 Aug 2025
Per-Query Visual Concept Learning
Ori Malca
Dvir Samuel
Gal Chechik
DiffM
VLM
114
0
0
12 Aug 2025
Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing
Joonghyuk Shin
Alchan Hwang
Yujin Kim
Daneul Kim
Jaesik Park
DiffM
121
4
0
11 Aug 2025
Grouped Speculative Decoding for Autoregressive Image Generation
Junhyuk So
Juncheol Shin
Hyunho Kook
Eunhyeok Park
DiffM
100
3
0
11 Aug 2025
Consistent and Controllable Image Animation with Motion Linear Diffusion Transformers
Xin Ma
Yaohui Wang
Genyun Jia
Xinyuan Chen
Tien-Tsin Wong
C. L. P. Chen
VGen
160
0
0
10 Aug 2025
AR-GRPO: Training Autoregressive Image Generation Models via Reinforcement Learning
Shihao Yuan
Yahui Liu
Yang Yue
Jingyuan Zhang
Wangmeng Zuo
Qi Wang
Fuzheng Zhang
Guorui Zhou
EGVM
VLM
148
11
0
09 Aug 2025
NEP: Autoregressive Image Editing via Next Editing Token Prediction
Huimin Wu
Xiaojian Ma
Haozhe Zhao
Yanpeng Zhao
Qing Li
DiffM
144
2
0
08 Aug 2025
Towards Robust Red-Green Watermarking for Autoregressive Image Generators
Denis Lukovnikov
Andreas Müller
Erwin Quiring
Asja Fischer
WIGM
215
0
0
08 Aug 2025
Zero-Residual Concept Erasure via Progressive Alignment in Text-to-Image Model
Hongxu Chen
Zhen Wang
Taoran Mei
Lin Li
Bowei Zhu
Runshi Li
L. Chen
DiffM
163
0
0
06 Aug 2025
AuthPrint: Fingerprinting Generative Models Against Malicious Model Providers
Kai Yao
Marc Juarez
WIGM
301
2
0
06 Aug 2025
Diffusion Models with Adaptive Negative Sampling Without External Resources
Alakh Desai
Nuno Vasconcelos
DiffM
162
0
0
05 Aug 2025
LumiGen: An LVLM-Enhanced Iterative Framework for Fine-Grained Text-to-Image Generation
Xiaoqi Dong
Xiangyu Zhou
Nicholas Evans
Yujia Lin
MLLM
103
0
0
05 Aug 2025
ROVI: A VLM-LLM Re-Captioned Dataset for Open-Vocabulary Instance-Grounded Text-to-Image Generation
Cihang Peng
Qiming Hou
Zhong Ren
Kun Zhou
ObjD
159
0
0
01 Aug 2025
Steering Guidance for Personalized Text-to-Image Diffusion Models
S. Park
Seokeon Choi
Hyoungwoo Park
Sungrack Yun
195
1
0
01 Aug 2025
LLMControl: Grounded Control of Text-to-Image Diffusion-based Synthesis with Multimodal LLMs
Jiaze Wang
Rui Chen
Haowang Cui
181
0
0
26 Jul 2025
Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment
Ying Ba
Tianyu Zhang
Yalong Bai
Wenyi Mo
Tao Liang
Bing Su
Ji-Rong Wen
EGVM
240
6
0
25 Jul 2025
Identifying Prompted Artist Names from Generated Images
Grace Su
Sheng-Yu Wang
Aaron Hertzmann
Eli Shechtman
Jun-Yan Zhu
Richard Zhang
VLM
174
0
0
24 Jul 2025
Adversarial Distribution Matching for Diffusion Distillation Towards Efficient Image and Video Synthesis
Yanzuo Lu
Yuxi Ren
Xin Xia
Shanchuan Lin
Xing Wang
Xuefeng Xiao
Andy J. Ma
Xiaohua Xie
Jian-Huang Lai
DiffM
272
11
0
24 Jul 2025
Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling
Yi Xin
Juncheng Yan
Qi Qin
Ge Wang
Dongyang Liu
...
Jiaming Song
Guangtao Zhai
Xiaohong Liu
Botian Shi
Peng Gao
208
24
0
23 Jul 2025
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation
Anlin Zheng
Xin Wen
Xuanyang Zhang
Chuofan Ma
Tiancai Wang
Gang Yu
Xiangyu Zhang
Xiaojuan Qi
VLM
191
4
0
11 Jul 2025
Divergence Minimization Preference Optimization for Diffusion Model Alignment
Binxu Li
Minkai Xu
Jiaqi Han
Meihua Dang
Stefano Ermon
EGVM
269
2
0
10 Jul 2025