arXiv: 2206.10789
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
    EGVM

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 1,010 papers shown
A Training-Free Style-Personalization via SVD-Based Feature Decomposition
Kyoungmin Lee
Jihun Park
Jongmin Gim
Wonhyeok Choi
K. Hwang
Jaeyeul Kim
Sunghoon Im
DiffM
156
0
0
06 Jul 2025
CooT: Learning to Coordinate In-Context with Coordination Transformers
Huai-Chih Wang
Hsiang-Chun Chuang
Hsi-Chun Cheng
Dai-Jie Wu
Shao-Hua Sun
OffRL
153
0
0
30 Jun 2025
How to Train your Text-to-Image Model: Evaluating Design Choices for Synthetic Training Captions
Manuel Brack
Sudeep Katakol
Felix Friedrich
P. Schramowski
Hareesh Ravi
Kristian Kersting
Ajinkya Kale
178
1
0
20 Jun 2025
Reward-Agnostic Prompt Optimization for Text-to-Image Diffusion Models
Semin Kim
Yeonwoo Cha
Jaehoon Yoo
Seunghoon Hong
EGVM
238
3
0
20 Jun 2025
Watermarking Autoregressive Image Generation
Nikola Jovanović
Ismail Labiad
Tomáš Souček
Martin Vechev
Pierre Fernandez
WIGM
447
3
0
19 Jun 2025
Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model
Anirud Aggarwal
Abhinav Shrivastava
M. Gwilliam
415
0
0
18 Jun 2025
FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space
Black Forest Labs
Stephen Batifol
A. Blattmann
Frederic Boesel
Saksham Consul
...
Dustin Podell
Robin Rombach
Harry Saini
Axel Sauer
Luke Smith
DiffM
352
343
0
17 Jun 2025
ASMR: Augmenting Life Scenario using Large Generative Models for Robotic Action Reflection
Shang-Chi Tsai
Seiya Kawano
Angel García Contreras
Koichiro Yoshino
Yun-Nung Chen
LM&Ro
247
2
0
16 Jun 2025
SpectralAR: Spectral Autoregressive Visual Generation
Yuanhui Huang
Weiliang Chen
Wenzhao Zheng
Yueqi Duan
Jie Zhou
Jiwen Lu
DiffM
VGen
296
5
0
12 Jun 2025
LeVo: High-Quality Song Generation with Multi-Preference Alignment
Shun Lei
Yaoxun Xu
Zhiwei Lin
Huaicheng Zhang
Wei Tan
...
Chenyu Yang
Haina Zhu
Shuai Wang
Zhiyong Wu
Dong Yu
279
14
0
09 Jun 2025
CuRe: Cultural Gaps in the Long Tail of Text-to-Image Systems
Aniket Rege
Zinnia Nie
Mahesh Ramesh
Unmesh Raskar
Zhuoran Yu
Aditya Kusupati
Yong Jae Lee
Ramya Korlakai Vinayak
260
4
0
09 Jun 2025
OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
Jingjing Chang
Yixiao Fang
Peng Xing
Shuhan Wu
Wei Cheng
Rui Wang
Xianfang Zeng
Gang Yu
H. Chen
EGVM
VLM
450
21
0
09 Jun 2025
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Jiatao Gu
Tianrong Chen
David Berthelot
Huangjie Zheng
Yuyang Wang
Ruixiang Zhang
Laurent Dinh
Miguel Angel Bautista
Josh Susskind
Shuangfei Zhai
246
13
0
06 Jun 2025
Improving AI-generated music with user-guided training
Vishwa Mohan Singh
Sai Anirudh Aryasomayajula
Ahan Chatterjee
Beste Aydemir
Rifat Mehreen Amin
203
0
0
05 Jun 2025
HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Hermann Kumbong
Xian Liu
Tsung-Yi Lin
Ming-Yu Liu
Xihui Liu
Ziwei Liu
Daniel Y. Fu
Christopher Ré
David W. Romero
DiffM
217
8
0
04 Jun 2025
How Far Are We from Generating Missing Modalities with Foundation Models?
Guanzhou Ke
Yi Xie
Xiaoli Wang
Guoqing Chao
Bo Wang
VLM
303
0
0
04 Jun 2025
Native-Resolution Image Synthesis
Zidong Wang
Mengwei He
Xiangyu Yue
Xuming He
Yiyuan Zhang
315
3
0
03 Jun 2025
EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models
Mingzhe Li
Gehao Zhang
Zhenting Wang
Guanhong Tao
Siqi Pan
Richard Cartwright
Juan Zhai
Shiqing Ma
DiffM
236
0
0
03 Jun 2025
Flexiffusion: Training-Free Segment-Wise Neural Architecture Search for Efficient Diffusion Models
Hongtao Huang
Xiaojun Chang
Weitong Chen
MedIm
307
0
0
03 Jun 2025
Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences
Yunhong Lu
Qichao Wang
H. Cao
Xiaoyin Xu
Min Zhang
332
5
0
03 Jun 2025
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
225
2
0
02 Jun 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Hyojin Bahng
Caroline Chan
F. Durand
Phillip Isola
EGVM
412
7
0
02 Jun 2025
One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2025
S. Li
Lei Wang
Kai Wang
Tao Liu
J. Xie
Joost van de Weijer
Fahad Shahbaz Khan
Shiqi Yang
Yaxing Wang
Zhiqiang Wang
259
4
0
28 May 2025
MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models
Hang Hua
Ziyun Zeng
Yizhi Song
Yunlong Tang
Liu He
Daniel G. Aliaga
Wei Xiong
Jiebo Luo
EGVM
394
2
0
26 May 2025
Harnessing the Power of Training-Free Techniques in Text-to-2D Generation for Text-to-3D Generation via Score Distillation Sampling
Junhong Lee
Seungwook Kim
Minsu Cho
DiffM
285
0
0
26 May 2025
LlamaSeg: Image Segmentation via Autoregressive Mask Generation
Jiru Deng
Tengjin Weng
Tianyu Yang
Tong Lu
Zhiheng Li
Wenhao Jiang
VLM
364
0
0
26 May 2025
Align Beyond Prompts: Evaluating World Knowledge Alignment in Text-to-Image Generation
Wenchao Zhang
Jiahe Tian
Runze He
Jizhong Han
Jiao Dai
Miaomiao Feng
Wei Mi
Xiaodan Zhang
271
0
0
24 May 2025
Rethinking Direct Preference Optimization in Diffusion Models
Junyong Kang
Seohyun Lim
Kyungjune Baek
Hyunjung Shim
1.0K
0
0
24 May 2025
A Minimalist Method for Fine-tuning Text-to-Image Diffusion Models
Yanting Miao
William Loh
Suraj Kothawade
Pascal Poupart
268
0
0
23 May 2025
Conditional Panoramic Image Generation via Masked Autoregressive Modeling
Chaoyang Wang
Xiangtai Li
Lu Qi
X. Lin
Jinbin Bai
Qianyu Zhou
Yunhai Tong
DiffM
322
3
0
22 May 2025
DetailMaster: Can Your Text-to-Image Model Handle Long Prompts?
Qirui Jiao
Daoyuan Chen
Yilun Huang
Xika Lin
Ying Shen
Yaliang Li
VLM
223
2
0
22 May 2025
MARché: Fast Masked Autoregressive Image Generation with Cache-Aware Attention
Chaoyi Jiang
Sungwoo Kim
Lei Gao
Hossein Entezari Zarch
Won Woo Ro
Murali Annavaram
222
0
0
22 May 2025
Replace in Translation: Boost Concept Alignment in Counterfactual Text-to-Image
Sifan Li
Ming Tao
Hao Zhao
Ling Shao
Hao Tang
DiffM
359
0
0
20 May 2025
MSDformer: Multi-scale Discrete Transformer For Time Series Generation
Zhicheng Chen
Shibo Feng
Xi Xiao
Zhong Zhang
Qing Li
Xingyu Gao
Peilin Zhao
249
2
0
20 May 2025
AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings
Yilin Ye
Junchao Huang
Xingchen Zeng
Jiazhi Xia
Wei Zeng
406
0
0
20 May 2025
Few-Step Diffusion via Score identity Distillation
Mingyuan Zhou
Yi Gu
Zhendong Wang
340
5
0
19 May 2025
Context-Aware Autoregressive Models for Multi-Conditional Image Generation
Yixiao Chen
Zhiyuan Ma
Guoli Jia
Che Jiang
Jianjun Li
Bowen Zhou
DiffM
272
3
0
18 May 2025
LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation
Jiarui Wang
Huiyu Duan
Ziheng Jia
Yu Zhao
Woo Yi Yang
...
Zhongfu Chen
Juntong Wang
Yuke Xing
Guangtao Zhai
Xiongkuo Min
VGen
456
5
0
17 May 2025
One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework
Feiran Li
Qianqian Xu
Shilong Bao
Zhiyong Yang
Xiaochun Cao
Qingming Huang
DiffM
532
4
0
16 May 2025
Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration
Computer Vision and Pattern Recognition (CVPR), 2025
Haipeng Fang
Sheng Tang
Juan Cao
Enshuo Zhang
Fan Tang
Tong-Yee Lee
316
4
0
16 May 2025
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Jae-Won Chung
Jiachen Liu
Jeff J. Ma
Jiachen Liu
Oh Jun Kweon
Yuxuan Xia
Zhiyu Wu
Mosharaf Chowdhury
665
8
0
09 May 2025
Diffusion Model Quantization: A Review
Qian Zeng
Chenggong Hu
Weilong Dai
Jie Song
MQ
373
4
0
08 May 2025
A Preliminary Study on GPT-Image Generation Model for Image Restoration
Hao Yang
Yiran Yang
Ruikun Zhang
Liyuan Pan
379
2
0
08 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
1.1K
31
0
05 May 2025
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
386
1
0
01 May 2025
The Dual Power of Interpretable Token Embeddings: Jailbreaking Attacks and Defenses for Diffusion Model Unlearning
Siyi Chen
Yimeng Zhang
Sijia Liu
Q. Qu
AAML
1.0K
0
0
30 Apr 2025
A Survey of Interactive Generative Video
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xinyu Wang
Pengfei Wan
Di Zhang
Kun Gai
Hao Chen
Xihui Liu
VGen
432
16
0
30 Apr 2025
Masked Language Prompting for Generative Data Augmentation in Few-shot Fashion Style Recognition
Yuki Hirakawa
Ryotaro Shimizu
275
0
0
28 Apr 2025
Open-set Anomaly Segmentation in Complex Scenarios
Song Xia
Yi Yu
Henghui Ding
Wenhan Yang
Shixuan Liu
Alex C. Kot
Xudong Jiang
DiffM
244
1
0
28 Apr 2025
Fast Autoregressive Models for Continuous Latent Generation
Tiankai Hang
Jianmin Bao
Fangyun Wei
Dong Chen
DiffM
248
3
0
24 Apr 2025