ResearchTrend.AI
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
    EGVM

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 865 papers shown
Title
GenTron: Diffusion Transformers for Image and Video Generation
Shoufa Chen
Mengmeng Xu
Jiawei Ren
Yuren Cong
Sen He
Yanping Xie
Animesh Sinha
Ping Luo
Tao Xiang
Juan-Manuel Perez-Rua
VGen
31
38
0
07 Dec 2023
Generating Illustrated Instructions
Sachit Menon
Ishan Misra
Rohit Girdhar
DiffM
24
4
0
07 Dec 2023
KOALA: Empirical Lessons Toward Memory-Efficient and Fast Diffusion Models for Text-to-Image Synthesis
Youngwan Lee
Kwanyong Park
Yoorhim Cho
Yong-Ju Lee
Sung Ju Hwang
VLM
27
3
0
07 Dec 2023
Diffusion Illusions: Hiding Images in Plain Sight
R. Burgert
Xiang Li
Abe Leite
Kanchana Ranasinghe
Michael S. Ryoo
43
17
0
06 Dec 2023
Language-Informed Visual Concept Learning
Sharon Lee
Yunzhi Zhang
Shangzhe Wu
Jiajun Wu
CoGe
24
9
0
06 Dec 2023
Make-A-Storyboard: A General Framework for Storyboard with Disentangled and Merged Control
Sitong Su
Litao Guo
Lianli Gao
Hengtao Shen
Jingkuan Song
DiffM
33
3
0
06 Dec 2023
Cache Me if You Can: Accelerating Diffusion Models through Block Caching
Felix Wimbauer
Bichen Wu
Edgar Schoenfeld
Xiaoliang Dai
Ji Hou
...
Jonas Kohler
Christian Rupprecht
Daniel Cremers
Peter Vajda
Jialiang Wang
DiffM
30
57
0
06 Dec 2023
FERGI: Automatic Annotation of User Preferences for Text-to-Image Generation from Spontaneous Facial Expression Reaction
Shuangquan Feng
Junhua Ma
Virginia R. de Sa
EGVM
13
0
0
05 Dec 2023
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
Brian Gordon
Yonatan Bitton
Yonatan Shafir
Roopal Garg
Xi Chen
Dani Lischinski
Daniel Cohen-Or
Idan Szpektor
35
11
0
05 Dec 2023
Describing Differences in Image Sets with Natural Language
Lisa Dunlap
Yuhui Zhang
Xiaohan Wang
Ruiqi Zhong
Trevor Darrell
Jacob Steinhardt
Joseph E. Gonzalez
Serena Yeung-Levy
CoGe
VLM
25
30
0
05 Dec 2023
MagicStick: Controllable Video Editing via Control Handle Transformations
Yue Ma
Xiaodong Cun
Yin-Yin He
Chenyang Qi
Xintao Wang
Ying Shan
Xiu Li
Qifeng Chen
VGen
14
24
0
05 Dec 2023
Customization Assistant for Text-to-image Generation
Yufan Zhou
Ruiyi Zhang
Jiuxiang Gu
Tongfei Sun
DiffM
25
11
0
05 Dec 2023
Stable Diffusion Exposed: Gender Bias from Prompt to Image
Yankun Wu
Yuta Nakashima
Noa Garcia
23
16
0
05 Dec 2023
Orthogonal Adaptation for Modular Customization of Diffusion Models
Ryan Po
Guandao Yang
Kfir Aberman
Gordon Wetzstein
DiffM
23
26
0
05 Dec 2023
A Contrastive Compositional Benchmark for Text-to-Image Synthesis: A Study with Unified Text-to-Image Fidelity Metrics
Xiangru Zhu
Penglei Sun
Chengyu Wang
Jingping Liu
Zhixu Li
Yanghua Xiao
Jun Huang
CoGe
100
5
0
04 Dec 2023
InstructBooth: Instruction-following Personalized Text-to-Image Generation
Daewon Chae
Nokyung Park
Jinkyu Kim
Kimin Lee
DiffM
19
11
0
04 Dec 2023
GIVT: Generative Infinite-Vocabulary Transformers
Michael Tschannen
Cian Eastwood
Fabian Mentzer
10
34
0
04 Dec 2023
DiverseDream: Diverse Text-to-3D Synthesis with Augmented Text Embedding
Uy Dieu Tran
Minh Luu
P. Nguyen
K. Nguyen
Binh-Son Hua
30
1
0
02 Dec 2023
Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D
Karran Pandey
Paul Guerrero
Matheus Gadelha
Yannick Hold-Geoffroy
Karan Singh
Niloy Mitra
DiffM
21
32
0
02 Dec 2023
Enhancing Diffusion Models with 3D Perspective Geometry Constraints
Rishi Upadhyay
Howard Zhang
Yunhao Ba
Ethan Yang
Blake Gella
Sicheng Jiang
Alex Wong
A. Kadambi
11
11
0
01 Dec 2023
Segment and Caption Anything
Xiaoke Huang
Jianfeng Wang
Yansong Tang
Zheng Zhang
Han Hu
Jiwen Lu
Lijuan Wang
Zicheng Liu
MLLM
VLM
26
17
0
01 Dec 2023
DeepCache: Accelerating Diffusion Models for Free
Xinyin Ma
Gongfan Fang
Xinchao Wang
22
122
0
01 Dec 2023
SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers
Ioannis Kakogeorgiou
Spyros Gidaris
Konstantinos Karantzalos
N. Komodakis
ViT
OCL
15
12
0
01 Dec 2023
Rethinking FID: Towards a Better Evaluation Metric for Image Generation
Sadeep Jayasumana
Srikumar Ramalingam
Andreas Veit
Daniel Glasner
Ayan Chakrabarti
Sanjiv Kumar
EGVM
11
126
0
30 Nov 2023
One-step Diffusion with Distribution Matching Distillation
Tianwei Yin
Michael Gharbi
Richard Zhang
Eli Shechtman
Frédo Durand
William T. Freeman
Taesung Park
DiffM
124
219
0
30 Nov 2023
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling
Sherwin Bahmani
Ivan Skorokhodov
Victor Rong
Gordon Wetzstein
Leonidas J. Guibas
Peter Wonka
Sergey Tulyakov
Jeong Joon Park
Andrea Tagliasacchi
David B. Lindell
DiffM
41
103
0
29 Nov 2023
Rethinking Image Editing Detection in the Era of Generative AI Revolution
Zhihao Sun
Haipeng Fang
Xinying Zhao
Danding Wang
Juan Cao
22
8
0
29 Nov 2023
DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback
Jiao Sun
Deqing Fu
Yushi Hu
Su Wang
Royi Rassin
...
Dana Alon
Charles Herrmann
Sjoerd van Steenkiste
Ranjay Krishna
Cyrus Rashtchian
EGVM
23
39
0
29 Nov 2023
Adversarial Diffusion Distillation
Axel Sauer
Dominik Lorenz
A. Blattmann
Robin Rombach
138
329
0
28 Nov 2023
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
Yutong Feng
Biao Gong
Di Chen
Yujun Shen
Yu Liu
Jingren Zhou
DiffM
21
43
0
28 Nov 2023
Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis
Xiaohui Chen
Yongfei Liu
Yingxiang Yang
Jianbo Yuan
Quanzeng You
Liping Liu
Hongxia Yang
DiffM
39
11
0
28 Nov 2023
Efficient Multimodal Diffusion Models Using Joint Data Infilling with Partially Shared U-Net
Zizhao Hu
Shaochong Jia
Mohammad Rostami
DiffM
MedIm
19
0
0
28 Nov 2023
Text-Driven Image Editing via Learnable Regions
Yuanze Lin
Yi-Wen Chen
Yi-Hsuan Tsai
Lu Jiang
Ming-Hsuan Yang
DiffM
21
16
0
28 Nov 2023
Manifold Preserving Guided Diffusion
Yutong He
Naoki Murata
Chieh-Hsin Lai
Yuhta Takida
Toshimitsu Uesaka
...
Wei-Hsiang Liao
Yuki Mitsufuji
J. Zico Kolter
Ruslan Salakhutdinov
Stefano Ermon
DiffM
116
64
0
28 Nov 2023
IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers
Chenglin Yang
Siyuan Qiao
Yuan Cao
Yu Zhang
Tao Zhu
Alan L. Yuille
Jiahui Yu
VLM
11
3
0
27 Nov 2023
Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation
Siteng Huang
Biao Gong
Yutong Feng
Xi Chen
Yu Fu
Yu Liu
Donglin Wang
DiffM
21
12
0
27 Nov 2023
Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation
Biao Gong
Siteng Huang
Yutong Feng
Shiwei Zhang
Yuyuan Li
Yu Liu
DiffM
15
11
0
27 Nov 2023
Reinforcement Learning from Diffusion Feedback: Q* for Image Search
Aboli Rajan Marathe
VLM
37
0
0
27 Nov 2023
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation
Yuhui Zhang
Brandon McKinzie
Zhe Gan
Vaishaal Shankar
Alexander Toshev
23
3
0
27 Nov 2023
Paragraph-to-Image Generation with Information-Enriched Diffusion Model
Weijia Wu
Zhuang Li
Yefei He
Mike Zheng Shou
Chunhua Shen
Lele Cheng
Yan Li
Tingting Gao
Di Zhang
VLM
126
24
0
24 Nov 2023
Steal My Artworks for Fine-tuning? A Watermarking Framework for Detecting Art Theft Mimicry in Text-to-Image Models
Ge Luo
Junqiang Huang
Manman Zhang
Zhenxing Qian
Sheng Li
Xinpeng Zhang
WIGM
15
9
0
22 Nov 2023
Diffusion Model Alignment Using Direct Preference Optimization
Bram Wallace
Meihua Dang
Rafael Rafailov
Linqi Zhou
Aaron Lou
Senthil Purushwalkam
Stefano Ermon
Caiming Xiong
Shafiq R. Joty
Nikhil Naik
EGVM
33
220
0
21 Nov 2023
Explainable Time Series Anomaly Detection using Masked Latent Generative Modeling
Daesoo Lee
Sara Malacarne
Erlend Aune
AI4TS
29
9
0
21 Nov 2023
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning
Rohit Girdhar
Mannat Singh
Andrew Brown
Quentin Duval
S. Azadi
Sai Saketh Rambhatla
Akbar Shah
Xi Yin
Devi Parikh
Ishan Misra
DiffM
VGen
35
189
0
17 Nov 2023
Characterizing Tradeoffs in Language Model Decoding with Informational Interpretations
Chung-Ching Chang
William W. Cohen
Yun-hsuan Sung
13
0
0
16 Nov 2023
Intelligent Generation of Graphical Game Assets: A Conceptual Framework and Systematic Review of the State of the Art
Kaisei Fukaya
Damon Daylamani-Zad
Harry Agius
45
2
0
16 Nov 2023
UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
Yanwu Xu
Yang Zhao
Zhisheng Xiao
Tingbo Hou
129
106
0
14 Nov 2023
Holistic Evaluation of Text-To-Image Models
Tony Lee
Michihiro Yasunaga
Chenlin Meng
Yifan Mai
Joon Sung Park
...
Jun-Yan Zhu
Fei-Fei Li
Jiajun Wu
Stefano Ermon
Percy Liang
139
125
0
07 Nov 2023
De-Diffusion Makes Text a Strong Cross-Modal Interface
Chen Wei
Chenxi Liu
Siyuan Qiao
Zhishuai Zhang
Alan Yuille
Jiahui Yu
VLM
DiffM
29
10
0
01 Nov 2023
The Generative AI Paradox: "What It Can Create, It May Not Understand"
Peter West
Ximing Lu
Nouha Dziri
Faeze Brahman
Linjie Li
...
Khyathi Raghavi Chandu
Benjamin Newman
Pang Wei Koh
Allyson Ettinger
Yejin Choi
AIMat
16
67
0
31 Oct 2023