ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2206.10789
  4. Cited By
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
    EGVM
ArXivPDFHTML

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 864 papers shown
Title
Diversity-Rewarded CFG Distillation
Diversity-Rewarded CFG Distillation
Geoffrey Cideron
A. Agostinelli
Johan Ferret
Sertan Girgin
Romuald Elie
Olivier Bachem
Sarah Perrin
Alexandre Ramé
34
2
0
08 Oct 2024
Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces
  for Large Finetuned Models
Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models
Theo Putterman
Derek Lim
Yoav Gelberg
Stefanie Jegelka
Haggai Maron
AI4CE
43
5
0
05 Oct 2024
Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization
Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization
Zichen Miao
Zhengyuan Yang
Kevin Lin
Ze Wang
Zicheng Liu
Lijuan Wang
Qiang Qiu
40
3
0
04 Oct 2024
LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
Doohyuk Jang
Sihwan Park
J. Yang
Yeonsung Jung
Jihun Yun
Souvik Kundu
Sung-Yub Kim
Eunho Yang
40
7
0
04 Oct 2024
Eliminating Oversaturation and Artifacts of High Guidance Scales in
  Diffusion Models
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models
Seyedmorteza Sadat
Otmar Hilliges
Romann M. Weber
DiffM
18
8
0
03 Oct 2024
CaLMFlow: Volterra Flow Matching using Causal Language Models
CaLMFlow: Volterra Flow Matching using Causal Language Models
Sizhuang He
Daniel Levine
Ivan Vrkic
Marco Francesco Bressana
David Zhang
S. Rizvi
Yangtian Zhang
E. Zappala
David van Dijk
17
0
0
03 Oct 2024
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
46
23
0
03 Oct 2024
ControlAR: Controllable Image Generation with Autoregressive Models
ControlAR: Controllable Image Generation with Autoregressive Models
Zongming Li
Tianheng Cheng
Shoufa Chen
Peize Sun
Haocheng Shen
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
DiffM
132
14
0
03 Oct 2024
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive
  Transformer for Efficient Finegrained Image Generation
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
Liang Chen
Sinan Tan
Zefan Cai
Weichu Xie
Haozhe Zhao
Yichi Zhang
Junyang Lin
Jinze Bai
Tianyu Liu
Baobao Chang
ViT
50
3
0
02 Oct 2024
Data Extrapolation for Text-to-image Generation on Small Datasets
Data Extrapolation for Text-to-image Generation on Small Datasets
Senmao Ye
Fei Liu
21
0
0
02 Oct 2024
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Yao Teng
Han Shi
Xian Liu
Xuefei Ning
Guohao Dai
Yu Wang
Zhenguo Li
Xihui Liu
48
10
0
02 Oct 2024
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing
Haotian Sun
Tao Lei
Bowen Zhang
Yanghao Li
Haoshuo Huang
Ruoming Pang
Bo Dai
Nan Du
DiffM
MoE
73
5
0
02 Oct 2024
MCGM: Mask Conditional Text-to-Image Generative Model
MCGM: Mask Conditional Text-to-Image Generative Model
Rami Skaik
Leonardo Rossi
Tomaso Fontanini
Andrea Prati
DiffM
28
0
0
01 Oct 2024
CusConcept: Customized Visual Concept Decomposition with Diffusion
  Models
CusConcept: Customized Visual Concept Decomposition with Diffusion Models
Zhi Xu
Shaozhe Hao
Kai Han
DiffM
25
4
0
01 Oct 2024
Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We
  Learn How Vision-Language Models Function
Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function
Chenyi Zhuang
Ying Hu
Pan Gao
DiffM
VLM
36
12
0
30 Sep 2024
MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation
MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation
Wenchao Chen
Liqiang Niu
Ziyao Lu
Fandong Meng
Jie Zhou
Mamba
30
4
0
30 Sep 2024
Emu3: Next-Token Prediction is All You Need
Emu3: Next-Token Prediction is All You Need
Xinlong Wang
Xiaosong Zhang
Zhengxiong Luo
Quan-Sen Sun
Yufeng Cui
...
Xi Yang
Jingjing Liu
Yonghua Lin
Tiejun Huang
Zhongyuan Wang
MLLM
34
147
0
27 Sep 2024
Trustworthy Text-to-Image Diffusion Models: A Timely and Focused Survey
Trustworthy Text-to-Image Diffusion Models: A Timely and Focused Survey
Yi Zhang
Zhen Chen
Chih-Hong Cheng
Wenjie Ruan
Xiaowei Huang
Dezong Zhao
David Flynn
Siddartha Khastgir
Xingyu Zhao
MedIm
30
3
0
26 Sep 2024
Flexiffusion: Segment-wise Neural Architecture Search for Flexible
  Denoising Schedule
Flexiffusion: Segment-wise Neural Architecture Search for Flexible Denoising Schedule
Hongtao Huang
Xiaojun Chang
Lina Yao
23
0
0
26 Sep 2024
MonoFormer: One Transformer for Both Diffusion and Autoregression
MonoFormer: One Transformer for Both Diffusion and Autoregression
Chuyang Zhao
Yuxing Song
Wenhao Wang
Haocheng Feng
Errui Ding
Yifan Sun
Xinyan Xiao
Jingdong Wang
DiffM
26
17
0
24 Sep 2024
MaskBit: Embedding-free Image Generation via Bit Tokens
MaskBit: Embedding-free Image Generation via Bit Tokens
Mark Weber
Lijun Yu
Qihang Yu
XueQing Deng
Xiaohui Shen
Daniel Cremers
Liang-Chieh Chen
DiffM
49
27
0
24 Sep 2024
TFG: Unified Training-Free Guidance for Diffusion Models
TFG: Unified Training-Free Guidance for Diffusion Models
Haotian Ye
Haowei Lin
Jiaqi Han
Minkai Xu
Sheng Liu
Yitao Liang
Jianzhu Ma
James Zou
Stefano Ermon
24
13
0
24 Sep 2024
JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images
JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images
Zhecan Wang
Junzhang Liu
Chia-Wei Tang
Hani Alomari
Anushka Sivakumar
...
Haoxuan You
A. Ishmam
Kai-Wei Chang
Shih-Fu Chang
Chris Thomas
CoGe
VLM
59
2
0
19 Sep 2024
Generalizing Alignment Paradigm of Text-to-Image Generation with
  Preferences through $f$-divergence Minimization
Generalizing Alignment Paradigm of Text-to-Image Generation with Preferences through fff-divergence Minimization
Haoyuan Sun
Bo Xia
Yongzhe Chang
Xueqian Wang
EGVM
35
2
0
15 Sep 2024
TextureDiffusion: Target Prompt Disentangled Editing for Various Texture Transfer
TextureDiffusion: Target Prompt Disentangled Editing for Various Texture Transfer
Zihan Su
Junhao Zhuang
Chun Yuan
DiffM
34
0
0
15 Sep 2024
MFCLIP: Multi-modal Fine-grained CLIP for Generalizable Diffusion Face Forgery Detection
MFCLIP: Multi-modal Fine-grained CLIP for Generalizable Diffusion Face Forgery Detection
Yaning Zhang
Tianyi Wang
Zitong Yu
Zan Gao
Linlin Shen
Shengyong Chen
DiffM
65
3
0
15 Sep 2024
SongCreator: Lyrics-based Universal Song Generation
SongCreator: Lyrics-based Universal Song Generation
Shun Lei
Yixuan Zhou
Boshi Tang
Max W. Y. Lam
Feng Liu
Hangyu Liu
Jingcheng Wu
Shiyin Kang
Zhiyong Wu
Helen Meng
36
4
0
09 Sep 2024
Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors
Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors
Haiyu Wu
Jaskirat Singh
Sicong Tian
Liang Zheng
Kevin W. Bowyer
CVBM
42
3
0
04 Sep 2024
Accurate Compression of Text-to-Image Diffusion Models via Vector
  Quantization
Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization
Vage Egiazarian
Denis Kuznedelev
Anton Voronov
Ruslan Svirschevski
Michael Goin
Daniil Pavlov
Dan Alistarh
Dmitry Baranchuk
MQ
31
0
0
31 Aug 2024
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni
Yulin Wang
Renping Zhou
Rui Lu
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Yuan Yao
Gao Huang
25
7
0
31 Aug 2024
One-Shot Learning Meets Depth Diffusion in Multi-Object Videos
One-Shot Learning Meets Depth Diffusion in Multi-Object Videos
Anisha Jain
VGen
DiffM
MDE
24
1
0
29 Aug 2024
Are Pose Estimators Ready for the Open World? STAGE: Synthetic Data
  Generation Toolkit for Auditing 3D Human Pose Estimators
Are Pose Estimators Ready for the Open World? STAGE: Synthetic Data Generation Toolkit for Auditing 3D Human Pose Estimators
Nikita Kister
István Sárándi
Anna Khoreva
Gerard Pons-Moll
47
0
0
28 Aug 2024
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its
  Teacher
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
T. Dao
Thuan Hoang Nguyen
T. Le
D. Vu
Khoi Nguyen
Cuong Pham
Anh Tran
DiffM
29
11
0
26 Aug 2024
Transfusion: Predict the Next Token and Diffuse Images with One
  Multi-Modal Model
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Chunting Zhou
Lili Yu
Arun Babu
Kushal Tirumala
Michihiro Yasunaga
Leonid Shamis
Jacob Kahn
Xuezhe Ma
Luke Zettlemoyer
Omer Levy
DiffM
23
145
0
20 Aug 2024
Can Large Language Models Understand Symbolic Graphics Programs?
Can Large Language Models Understand Symbolic Graphics Programs?
Zeju Qiu
Weiyang Liu
Haiwen Feng
Zhen Liu
Tim Z. Xiao
Katherine M. Collins
J. Tenenbaum
Adrian Weller
Michael J. Black
Bernhard Schölkopf
46
11
0
15 Aug 2024
Multitask and Multimodal Neural Tuning for Large Models
Multitask and Multimodal Neural Tuning for Large Models
Hao Sun
Yu Song
Jihong Hu
Yen-Wei Chen
Lanfen Lin
VLM
16
0
0
06 Aug 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Yu Qiao
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
60
48
0
05 Aug 2024
LEGO: Self-Supervised Representation Learning for Scene Text Images
LEGO: Self-Supervised Representation Learning for Scene Text Images
Yujin Ren
Jiaxin Zhang
Lianwen Jin
SSL
26
0
0
04 Aug 2024
Autonomous LLM-Enhanced Adversarial Attack for Text-to-Motion
Autonomous LLM-Enhanced Adversarial Attack for Text-to-Motion
Honglei Miao
Fan Ma
Ruijie Quan
Kun Zhan
Yi Yang
AAML
34
0
0
01 Aug 2024
Fine-gained Zero-shot Video Sampling
Fine-gained Zero-shot Video Sampling
Dengsheng Chen
Jie Hu
Javier Segovia-Aguas
Enhua Wu
VGen
DiffM
16
0
0
31 Jul 2024
MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented
  Generation via Knowledge-enhanced Reranking and Noise-injected Training
MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced Reranking and Noise-injected Training
Rivik Setty
Chengjin Xu
Vinay Setty
Jian Guo
27
12
0
31 Jul 2024
Contrasting Deepfakes Diffusion via Contrastive Learning and
  Global-Local Similarities
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
Lorenzo Baraldi
Federico Cocchi
Marcella Cornia
Lorenzo Baraldi
Alessandro Nicolosi
Rita Cucchiara
23
7
0
29 Jul 2024
Diffusion Models for Multi-Task Generative Modeling
Diffusion Models for Multi-Task Generative Modeling
Changyou Chen
Han Ding
Bunyamin Sisman
Yi Tian Xu
Ouye Xie
Benjamin Z. Yao
Son Dinh Tran
Belinda Zeng
DiffM
35
4
0
24 Jul 2024
Stretching Each Dollar: Diffusion Training from Scratch on a
  Micro-Budget
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
Vikash Sehwag
Xianghao Kong
Jingtao Li
Michael Spranger
Lingjuan Lyu
DiffM
32
8
0
22 Jul 2024
HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal
  Reasoning
HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning
Zhecan Wang
Garrett Bingham
Adams Wei Yu
Quoc V. Le
Thang Luong
Golnaz Ghiasi
MLLM
LRM
35
9
0
22 Jul 2024
LSReGen: Large-Scale Regional Generator via Backward Guidance Framework
LSReGen: Large-Scale Regional Generator via Backward Guidance Framework
Bowen Zhang
Cheng Yang
Xuanhui Liu
DiffM
14
0
0
21 Jul 2024
Recent Advances in Generative AI and Large Language Models: Current
  Status, Challenges, and Perspectives
Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives
D. Hagos
Rick Battle
Danda B. Rawat
LM&MA
OffRL
20
21
0
20 Jul 2024
Safe-SD: Safe and Traceable Stable Diffusion with Text Prompt Trigger
  for Invisible Generative Watermarking
Safe-SD: Safe and Traceable Stable Diffusion with Text Prompt Trigger for Invisible Generative Watermarking
Zhiyuan Ma
Guoli Jia
Biqing Qi
Bowen Zhou
WIGM
58
9
0
18 Jul 2024
Exploring the Potentials and Challenges of Deep Generative Models in
  Product Design Conception
Exploring the Potentials and Challenges of Deep Generative Models in Product Design Conception
Phillip Mueller
Lars Mikelsons
AI4CE
30
1
0
15 Jul 2024
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A
  Survey
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A Survey
Chenyu Zhang
Mingwang Hu
Wenhui Li
Lanjun Wang
37
13
0
10 Jul 2024
Previous
12345...161718
Next