ResearchTrend.AI
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
    EGVM

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 864 papers shown
DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models
Shwetha Ram
T. Neiman
Qianli Feng
Andrew Stuart
S. D. Tran
Trishul M. Chilimbi
72
1
0
28 Nov 2024
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
74
0
0
28 Nov 2024
Enhancing MMDiT-Based Text-to-Image Models for Similar Subject Generation
Tianyi Wei
Dongdong Chen
Yifan Zhou
Xingang Pan
EGVM
77
2
0
27 Nov 2024
ModeDreamer: Mode Guiding Score Distillation for Text-to-3D Generation using Reference Image Prompts
Uy Dieu Tran
Minh Luu
P. Nguyen
K. Nguyen
Binh-Son Hua
DiffM
71
1
0
27 Nov 2024
Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis
Boming Miao
C. Li
X. U. Wang
Andi Zhang
Rui Sun
Zizhe Wang
Yao Zhu
DiffM
61
0
0
25 Nov 2024
SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE
Yongwei Chen
Yushi Lan
Shangchen Zhou
Tengfei Wang
Xingang Pan
100
5
0
25 Nov 2024
Efficient Online Inference of Vision Transformers by Training-Free Tokenization
Leonidas Gee
Wing Yan Li
V. Sharmanska
Novi Quadrianto
ViT
85
0
0
23 Nov 2024
TPIE: Topology-Preserved Image Editing With Text Instructions
Nivetha Jayakumar
Srivardhan Reddy Gadila
Tonmoy Hossain
Yangfeng Ji
Miaomiao Zhang
DiffM
MedIm
71
0
0
22 Nov 2024
Text Embedding is Not All You Need: Attention Control for Text-to-Image Semantic Alignment with Text Self-Attention Maps
Jeeyung Kim
Erfan Esmaeili
Qiang Qiu
DiffM
81
1
0
21 Nov 2024
How to Defend Against Large-scale Model Poisoning Attacks in Federated Learning: A Vertical Solution
Jinbo Wang
Ruijin Wang
Fengli Zhang
FedML
AAML
24
0
0
16 Nov 2024
GSEditPro: 3D Gaussian Splatting Editing with Attention-based Progressive Localization
Yanhao Sun
RunZe Tian
Xiao Han
XinYao Liu
Yan Zhang
Kai Xu
3DGS
DiffM
35
2
0
15 Nov 2024
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
55
1
0
12 Nov 2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Yizeng Han
Jiayi Guo
Zhiyuan Liu
Yuan Yao
Gao Huang
48
4
0
11 Nov 2024
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement
Zhennan Chen
Yajie Li
Haofan Wang
Z. Chen
Zhengkai Jiang
Jun Yu Li
Qian Wang
Jian Yang
Ying Tai
DiffM
47
8
0
10 Nov 2024
Scalable, Tokenization-Free Diffusion Model Architectures with Efficient Initial Convolution and Fixed-Size Reusable Structures for On-Device Image Generation
Sanchar Palit
Sathya Veera Reddy Dendi
Mallikarjuna Talluri
Raj Narayana Gadde
26
0
0
09 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
M. Zhang
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
46
9
0
08 Nov 2024
Clustering in Causal Attention Masking
Nikita Karagodin
Yury Polyanskiy
Philippe Rigollet
52
5
0
07 Nov 2024
AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
Anil Kag
Huseyin Coskun
Jierun Chen
Junli Cao
Willi Menapace
Aliaksandr Siarohin
Sergey Tulyakov
Jian Ren
46
2
0
07 Nov 2024
DomainGallery: Few-shot Domain-driven Image Generation by Attribute-centric Finetuning
Yuxuan Duan
Y. Hong
Bo Zhang
Jun Lan
Huijia Zhu
Weiqiang Wang
Jianfu Zhang
Li Niu
L. Zhang
DiffM
44
0
0
07 Nov 2024
Image Understanding Makes for A Good Tokenizer for Image Generation
Luting Wang
Yang Zhao
Zijian Zhang
Jiashi Feng
Si Liu
Bingyi Kang
VLM
21
4
0
07 Nov 2024
ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models
Ashutosh Srivastava
Tarun Ram Menta
Abhinav Java
Avadhoot Jadhav
Silky Singh
Surgan Jandial
Balaji Krishnamurthy
DiffM
32
1
0
06 Nov 2024
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
Yongxin Zhu
B. Li
Yifei Xin
Linli Xu
36
10
0
04 Nov 2024
Randomized Autoregressive Visual Generation
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VGen
DiffM
50
28
1
01 Nov 2024
Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective
Shenghao Xie
Wenqiang Zu
Mingyang Zhao
Duo Su
Shilong Liu
Ruohua Shi
Guoqi Li
Shanghang Zhang
Lei Ma
LRM
40
3
0
29 Oct 2024
Benchmarking Human and Automated Prompting in the Segment Anything Model
Jorge Quesada
Zoe Fowler
Mohammad Alotaibi
M. Prabhushankar
Ghassan AlRegib
VLM
21
1
0
29 Oct 2024
MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding
Yuan Wang
Di Huang
Yaqi Zhang
Wanli Ouyang
J. Jiao
Xuetao Feng
Yan Zhou
Pengfei Wan
Shixiang Tang
Dan Xu
VGen
25
11
0
29 Oct 2024
Adapting Diffusion Models for Improved Prompt Compliance and Controllable Image Synthesis
Deepak Sridhar
Abhishek Peri
Rohith Rachala
Nuno Vasconcelos
DiffM
25
0
0
29 Oct 2024
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
MQ
28
0
0
28 Oct 2024
Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion
Emiel Hoogeboom
Thomas Mensink
Jonathan Heek
Kay Lamerigts
Ruiqi Gao
Tim Salimans
40
6
0
25 Oct 2024
FairQueue: Rethinking Prompt Learning for Fair Text-to-Image Generation
Christopher T. H. Teo
Milad Abdollahzadeh
Xinda Ma
Ngai-man Cheung
DiffM
16
1
0
24 Oct 2024
Fast constrained sampling in pre-trained diffusion models
Alexandros Graikos
Nebojsa Jojic
Dimitris Samaras
DiffM
23
1
0
24 Oct 2024
Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances
Shilin Lu
Zihan Zhou
Jiayou Lu
Yuanzhi Zhu
A. Kong
WIGM
78
10
0
24 Oct 2024
How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?
Jiahua Dong
Wenqi Liang
Hongliu Li
Duzhen Zhang
Meng Cao
Henghui Ding
Salman Khan
F. Khan
DiffM
46
9
0
23 Oct 2024
DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization
Haowei Zhu
Dehua Tang
Ji Liu
Mingjie Lu
Jintu Zheng
...
Spandan Tiwari
Ashish Sirasao
Jun-Hai Yong
Bin Wang
E. Barsoum
DiffM
24
4
0
22 Oct 2024
Elucidating the design space of language models for image generation
Xuantong Liu
Shaozhe Hao
Xianbiao Qi
Tianyang Hu
Jun Wang
Rong Xiao
Yuan Yao
VLM
30
3
0
21 Oct 2024
Opportunities and Challenges of Generative-AI in Finance
Akshar Prabhu Desai
Ganesh Satish Mallya
Mohammad Luqman
Tejasvi Ravi
Nithya Kota
Pranjul Yadav
AIFin
31
2
0
21 Oct 2024
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities
Shaozhe Hao
Xuantong Liu
Xianbiao Qi
Shihao Zhao
Bojia Zi
Rong Xiao
Kai Han
Kwan-Yee K. Wong
41
3
0
18 Oct 2024
Assessing Open-world Forgetting in Generative Image Model Customization
Héctor Laria
Alex Gomez-Villa
Imad Eddine Marouf
Bogdan Raducanu
VLM
DiffM
30
0
0
18 Oct 2024
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
Lijie Fan
Tianhong Li
Siyang Qin
Yuanzhen Li
Chen Sun
Michael Rubinstein
Deqing Sun
Kaiming He
Yonglong Tian
VLM
DiffM
27
40
0
17 Oct 2024
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models
Donghao Zhou
Jiancheng Huang
J. Bai
Jiaze Wang
Hao Chen
Guangyong Chen
Xiaowei Hu
Pheng Ann Heng
35
5
0
17 Oct 2024
Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement
Zhi Wang
Li Lyna Zhang
Wenhao Wu
Yuanheng Zhu
Dongbin Zhao
C. L. Philip Chen
OffRL
33
6
0
15 Oct 2024
Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling
Wenze Liu
Le Zhuo
Yi Xin
Sheng Xia
Peng Gao
Xiangyu Yue
29
6
0
14 Oct 2024
Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective
Xiangru Zhu
Penglei Sun
Yaoxian Song
Yanghua Xiao
Zhixu Li
Chengyu Wang
Jun Huang
Bei Yang
Xiaoxiao Xu
EGVM
93
1
0
14 Oct 2024
Generating Intermediate Representations for Compositional Text-To-Image Generation
Ran Galun
Sagie Benaim
13
0
0
13 Oct 2024
Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment
Huayu Chen
Hang Su
Peize Sun
J. Zhu
VLM
36
3
0
12 Oct 2024
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation
Jiatao Gu
Yuyang Wang
Yizhe Zhang
Qihang Zhang
Dinghuai Zhang
Navdeep Jaitly
Josh Susskind
Shuangfei Zhai
DiffM
31
12
0
10 Oct 2024
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai
Tian-Chun Ye
Wei Chow
Enxin Song
Qing-Guo Chen
Xiangtai Li
Zhen Dong
Lei Zhu
50
13
0
10 Oct 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
Rui Zhao
Hangjie Yuan
Yujie Wei
Shiwei Zhang
Yuchao Gu
...
Xiang Wang
Zhangjie Wu
Junhao Zhang
Yingya Zhang
Mike Zheng Shou
DiffM
VLM
50
4
0
09 Oct 2024
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow
Fu-Yun Wang
Ling Yang
Zhaoyang Huang
Mengdi Wang
Hongsheng Li
27
12
0
09 Oct 2024
Decouple-Then-Merge: Finetune Diffusion Models as Multi-Task Learning
Qianli Ma
Xuefei Ning
Dongrui Liu
Li Niu
Linfeng Zhang
MoMe
44
0
0
09 Oct 2024