Scaling Autoregressive Models for Content-Rich Text-to-Image Generation


22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
    EGVM

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 1,010 papers shown
Inst-Inpaint: Instructing to Remove Objects with Diffusion Models
Ahmet Burak Yildirim
Vedat Baday
Erkut Erdem
Aykut Erdem
Aysegül Dündar
DiffM
310
81
0
06 Apr 2023
Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models
Xuhui Jia
Yang Zhao
Kelvin C. K. Chan
Yandong Li
Han-Ying Zhang
Boqing Gong
Tingbo Hou
Jian Shu
Yu-Chuan Su
DiffM
219
123
0
05 Apr 2023
GINA-3D: Learning to Generate Implicit Neural Assets in the Wild
Computer Vision and Pattern Recognition (CVPR), 2023
Bokui Shen
Xinchen Yan
C. Qi
Mahyar Najibi
Boyang Deng
Leonidas Guibas
Yin Zhou
Drago Anguelov
3DV
327
30
0
04 Apr 2023
Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Mayu Otani
Riku Togashi
Yu Sawai
Ryosuke Ishigami
Yuta Nakashima
Esa Rahtu
J. Heikkilä
Shin'ichi Satoh
224
78
0
04 Apr 2023
Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models
IEEE International Conference on Computer Vision (ICCV), 2023
Jaewoong Lee
Sang-Sub Jang
Jaehyeong Jo
Jaehong Yoon
Yunji Kim
Jin-Hwa Kim
Jung-Woo Ha
Sung Ju Hwang
DiffM
239
7
0
04 Apr 2023
Scientists' Perspectives on the Potential for Generative AI in their Fields
Meredith Ringel Morris
AI4CE
157
47
0
04 Apr 2023
Subject-driven Text-to-Image Generation via Apprenticeship Learning
Neural Information Processing Systems (NeurIPS), 2023
Wenhu Chen
Hexiang Hu
Yandong Li
Nataniel Ruiz
Xuhui Jia
Ming-Wei Chang
William W. Cohen
DiffM
921
227
0
01 Apr 2023
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
Eric Zhang
Kai Wang
Xingqian Xu
Zinan Lin
Humphrey Shi
DiffM
351
260
0
30 Mar 2023
Discriminative Class Tokens for Text-to-Image Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Idan Schwartz
Vésteinn Snaebjarnarson
Hila Chefer
Robert Bamler
Serge Belongie
Lior Wolf
Sagie Benaim
399
12
0
30 Mar 2023
Qualitative Failures of Image Generation Models and Their Application in Detecting Deepfakes
Image and Vision Computing (IVC), 2023
Ali Borji
504
42
0
29 Mar 2023
Planning with Sequence Models through Iterative Energy Minimization
International Conference on Learning Representations (ICLR), 2023
Hongyi Chen
Yilun Du
Yiye Chen
J. Tenenbaum
Patricio A. Vela
167
8
0
28 Mar 2023
Variational Distribution Learning for Unsupervised Text-to-Image Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Minsoo Kang
Doyup Lee
Jiseob Kim
Saehoon Kim
Bohyung Han
DRL, OOD
194
4
0
28 Mar 2023
StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing
Senmao Li
Joost van de Weijer
Taihang Hu
Fahad Shahbaz Khan
Qibin Hou
Yaxing Wang
Jian Yang
DiffM
398
75
0
28 Mar 2023
Anti-DreamBooth: Protecting users from personalized text-to-image synthesis
IEEE International Conference on Computer Vision (ICCV), 2023
T. Le
Hao Phung
Thuan Hoang Nguyen
Quan Dao
Ngoc N. Tran
Anh Tran
359
134
0
27 Mar 2023
Text-to-Image Diffusion Models are Zero-Shot Classifiers
Neural Information Processing Systems (NeurIPS), 2023
Kevin Clark
P. Jaini
DiffM, VLM
381
149
0
27 Mar 2023
Seer: Language Instructed Video Prediction with Latent Diffusion Models
International Conference on Learning Representations (ICLR), 2023
Xianfan Gu
Chuan Wen
Weirui Ye
Jiaming Song
Yang Gao
DiffM, VGen
307
55
0
27 Mar 2023
Equivariant Similarity for Vision-Language Foundation Models
IEEE International Conference on Computer Vision (ICCV), 2023
Tan Wang
Kevin Qinghong Lin
Linjie Li
Chung-Ching Lin
Zhengyuan Yang
Hanwang Zhang
Zicheng Liu
Lijuan Wang
CoGe
282
63
0
25 Mar 2023
Freestyle Layout-to-Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2023
Han Xue
Z. Huang
Qianru Sun
Li Song
Wenjun Zhang
DiffM
327
82
0
25 Mar 2023
High Fidelity Image Synthesis With Deep VAEs In Latent Space
Troy Luhman
Eric Luhman
DRL, 3DV
139
12
0
23 Mar 2023
Ablating Concepts in Text-to-Image Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Nupur Kumari
Bin Zhang
Sheng-Yu Wang
Eli Shechtman
Richard Y. Zhang
Jun-Yan Zhu
VLM
482
283
0
23 Mar 2023
DreamBooth3D: Subject-Driven Text-to-3D Generation
IEEE International Conference on Computer Vision (ICCV), 2023
Amit Raj
S. Kaza
Ben Poole
Michael Niemeyer
Nataniel Ruiz
...
Kfir Aberman
Michael Rubinstein
Jonathan T. Barron
Yuanzhen Li
Varun Jampani
DiffM
317
268
0
23 Mar 2023
CoBIT: A Contrastive Bi-directional Image-Text Generation Model
International Conference on Learning Representations (ICLR), 2023
Haoxuan You
Mandy Guo
Zhecan Wang
Kai-Wei Chang
Jason Baldridge
Jiahui Yu
DiffM
210
14
0
23 Mar 2023
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
IEEE International Conference on Computer Vision (ICCV), 2023
Levon Khachatryan
A. Movsisyan
Vahram Tadevosyan
Roberto Henschel
Zinan Lin
Shant Navasardyan
Humphrey Shi
VGen
308
733
0
23 Mar 2023
TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision
Computer Vision and Pattern Recognition (CVPR), 2023
Jiacheng Wei
Hao Wang
Jiashi Feng
Guosheng Lin
Kim-Hui Yap
150
38
0
23 Mar 2023
A Word is Worth a Thousand Pictures: Prompts as AI Design Material
Chinmay Kulkarni
Stefania Druga
Minsuk Chang
Alexander J. Fiannaca
Carrie J. Cai
Michael Terry
3DV
141
42
0
22 Mar 2023
NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Sheng-Siang Yin
Chenfei Wu
Huan Yang
Jianfeng Wang
Xiaodong Wang
...
Gong Ming
Lijuan Wang
Zicheng Liu
Houqiang Li
Nan Duan
VGen
245
178
0
22 Mar 2023
The Prompt Artists
Creativity & Cognition (C&C), 2023
Minsuk Chang
Stefania Druga
Alexander J. Fiannaca
P. Vergani
Chinmay Kulkarni
Carrie J. Cai
Michael Terry
162
89
0
22 Mar 2023
MAGVLT: Masked Generative Vision-and-Language Transformer
Computer Vision and Pattern Recognition (CVPR), 2023
Sungwoong Kim
DaeJin Jo
Donghoon Lee
Jongmin Kim
VLM
129
16
0
21 Mar 2023
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
IEEE International Conference on Computer Vision (ICCV), 2023
Yushi Hu
Benlin Liu
Jungo Kasai
Yizhong Wang
Mari Ostendorf
Ranjay Krishna
Noah A. Smith
EGVM
337
344
0
21 Mar 2023
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?
Chaoning Zhang
Chenshuang Zhang
Sheng Zheng
Yu Qiao
Chenghao Li
...
Lik-Hang Lee
Yang Yang
Heng Tao Shen
In So Kweon
Choong Seon Hong
303
199
0
21 Mar 2023
Large AI Models in Health Informatics: Applications, Challenges, and the Future
IEEE Journal of Biomedical and Health Informatics (IEEE JBHI), 2023
Jianing Qiu
Lin Li
Jiankai Sun
Jiachuan Peng
Peilun Shi
...
Bo Xiao
Wu Yuan
Ningli Wang
Dong Xu
Benny Lo
AI4MH, LM&MA
284
185
0
21 Mar 2023
Localizing Object-level Shape Variations with Text-to-Image Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Or Patashnik
Daniel Garibi
Idan Azuri
Hadar Averbuch-Elor
Daniel Cohen-Or
DiffM
399
143
0
20 Mar 2023
Retrieving Multimodal Information for Augmented Generation: A Survey
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ruochen Zhao
Hailin Chen
Weishi Wang
Fangkai Jiao
Do Xuan Long
...
Bosheng Ding
Xiaobao Guo
Minzhi Li
Xingxuan Li
Shafiq Joty
416
128
0
20 Mar 2023
Deep Image Fingerprint: Towards Low Budget Synthetic Image Detection and Model Lineage Analysis
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Sergey Sinitsa
Ohad Fried
243
27
0
19 Mar 2023
IRGen: Generative Modeling for Image Retrieval
European Conference on Computer Vision (ECCV), 2023
Yidan Zhang
Ting Zhang
Dong Chen
Yujing Wang
Qi Chen
...
Tao Gui
Fan Yang
Mao Yang
Q. Liao
B. Guo
3DV, VLM
326
21
0
17 Mar 2023
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
IEEE International Conference on Computer Vision (ICCV), 2023
Can Qin
Ning Yu
Chen Xing
Shu Zhen Zhang
Zeyuan Chen
Stefano Ermon
Yun Fu
Caiming Xiong
Ran Xu
DiffM
378
27
0
17 Mar 2023
HIVE: Harnessing Human Feedback for Instructional Visual Editing
Computer Vision and Pattern Recognition (CVPR), 2023
Shu Zhen Zhang
Xinyi Yang
Yihao Feng
Can Qin
Chia-Chih Chen
...
Haiquan Wang
Silvio Savarese
Stefano Ermon
Caiming Xiong
Ran Xu
327
163
0
16 Mar 2023
StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model
IEEE International Conference on Computer Vision (ICCV), 2023
Zipeng Xu
E. Sangineto
Andrii Zadaianchuk
DiffM
285
14
0
16 Mar 2023
Text-to-image Diffusion Models in Generative AI: A Survey
Chenshuang Zhang
Chaoning Zhang
Mengchun Zhang
In So Kweon
VLM
315
380
0
14 Mar 2023
One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
International Conference on Machine Learning (ICML), 2023
Fan Bao
Shen Nie
Kaiwen Xue
Chongxuan Li
Shiliang Pu
Yaole Wang
Gang Yue
Yue Cao
Hang Su
Jun Zhu
DiffM
534
177
0
12 Mar 2023
Scaling up GANs for Text-to-Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2023
Minguk Kang
Jun-Yan Zhu
Richard Y. Zhang
Jaesik Park
Eli Shechtman
Sylvain Paris
Taesung Park
328
601
0
09 Mar 2023
Cones: Concept Neurons in Diffusion Models for Customized Generation
International Conference on Machine Learning (ICML), 2023
Zhiheng Liu
Ruili Feng
Kai Zhu
Yifei Zhang
Kecheng Zheng
Yu Liu
Deli Zhao
Jingren Zhou
Yang Cao
DiffM
300
152
0
09 Mar 2023
disco: a toolkit for Distributional Control of Generative Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Germán Kruszewski
Jos Rozen
Marc Dymetman
213
4
0
08 Mar 2023
Video-P2P: Video Editing with Cross-attention Control
Computer Vision and Pattern Recognition (CVPR), 2023
Shaoteng Liu
Yuechen Zhang
Wenbo Li
Zhe Lin
Jiaya Jia
DiffM, VGen
391
308
0
08 Mar 2023
Vector Quantized Time Series Generation with a Bidirectional Prior Model
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Daesoo Lee
Sara Malacarne
Erlend Aune
BDL
310
42
0
08 Mar 2023
A Prompt Log Analysis of Text-to-Image Generation Systems
The Web Conference (WWW), 2023
Yutong Xie
Zhaoying Pan
Jing Ma
Jie Luo
Qiaozhu Mei
DiffM
295
56
0
08 Mar 2023
ELODIN: Naming Concepts in Embedding Spaces
Rodrigo Mello
Filipe Calegario
Geber Ramalho
DiffM
310
1
0
07 Mar 2023
Lformer: Text-to-Image Generation with L-shape Block Parallel Decoding
Jiacheng Li
Longhui Wei
Zongyuan Zhan
Xinfu He
Siliang Tang
Qi Tian
Yueting Zhuang
157
5
0
07 Mar 2023
A Complete Recipe for Diffusion Generative Models
IEEE International Conference on Computer Vision (ICCV), 2023
Kushagra Pandey
Stephan Mandt
DiffM
207
12
0
03 Mar 2023
A Pathway Towards Responsible AI Generated Content
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Chen Chen
Jie Fu
Lingjuan Lyu
344
82
0
02 Mar 2023