Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

10 June 2024
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
    VLM

Papers citing "Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation"

50 / 177 papers shown
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
114
8
0
19 Dec 2024
Parallelized Autoregressive Visual Generation
Y. Wang
Shuhuai Ren
Zhijie Lin
Yujin Han
Haoyuan Guo
Zhenheng Yang
Difan Zou
Jiashi Feng
Xihui Liu
VGen
84
11
0
19 Dec 2024
E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling
Zhihang Yuan
Yuzhang Shang
H. Zhang
Tongcheng Fang
Rui Xie
Bingxin Xu
Yan Yan
Shengen Yan
Guohao Dai
Yu Wang
DiffM
86
1
0
18 Dec 2024
Self-control: A Better Conditional Mechanism for Masked Autoregressive Model
Qiaoying Qu
Shiyu Shen
DiffM
64
0
0
18 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
H. Chen
Z. Wang
X. Li
X. Sun
Fangyi Chen
Jiang Liu
J. Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
103
6
0
14 Dec 2024
[MASK] is All You Need
Vincent Tao Hu
Bjorn Ommer
DiffM
135
2
0
09 Dec 2024
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts
Ziwei Huang
Wanggui He
Quanyu Long
Yandi Wang
Haoyuan Li
...
Fangxun Shu
Long Chen
Hao Jiang
Leilei Gan
Fei Wu
EGVM
100
3
0
05 Dec 2024
Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression
Junjie Wen
Minjie Zhu
Y. X. Zhu
Zhibin Tang
Jinming Li
...
Chengmeng Li
Xiaoyu Liu
Yaxin Peng
Chaomin Shen
Feifei Feng
85
10
0
04 Dec 2024
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders
Ziqi Pang
Tianyuan Zhang
Fujun Luan
Yunze Man
Hao Tan
Kai Zhang
William T. Freeman
Yu-Xiong Wang
VGen
59
8
0
02 Dec 2024
XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation
X. Li
Kai Qiu
H. Chen
Jason Kuen
Jiuxiang Gu
J. Wang
Zhe-nan Lin
Bhiksha Raj
VLM
112
3
0
02 Dec 2024
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation
Bolin Lai
F. Xu
Miao Liu
Xiaoliang Dai
Nikhil Mehta
...
Zeyi Huang
James M. Rehg
Sangmin Lee
Ning Zhang
Tong Xiao
71
2
0
02 Dec 2024
Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis
Anton Voronov
Denis Kuznedelev
Mikhail Khoroshikh
Valentin Khrulkov
Dmitry Baranchuk
103
2
0
02 Dec 2024
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads
Siqi Kou
Jiachun Jin
Chang Liu
Ye Ma
Jian Jia
Quan Chen
Peng Jiang
Zhijie Deng
DiffM
VGen
VLM
105
5
0
28 Nov 2024
Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient
Zigeng Chen
Xinyin Ma
Gongfan Fang
Xinchao Wang
VLM
87
4
0
26 Nov 2024
Factorized Visual Tokenization and Generation
Zechen Bai
Jianxiong Gao
Ziteng Gao
Pichao Wang
Zheng Zhang
Tong He
Mike Zheng Shou
58
3
0
25 Nov 2024
SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE
Yongwei Chen
Yushi Lan
Shangchen Zhou
Tengfei Wang
Xingang Pan
95
5
0
25 Nov 2024
PanoLlama: Generating Endless and Coherent Panoramas with Next-Token-Prediction LLMs
Teng Zhou
Xiaoyu Zhang
Yongchuan Tang
MLLM
DiffM
76
0
0
24 Nov 2024
Efficient Online Inference of Vision Transformers by Training-Free Tokenization
Leonidas Gee
Wing Yan Li
V. Sharmanska
Novi Quadrianto
ViT
77
0
0
23 Nov 2024
Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer
Shitong Shao
Zikai Zhou
Tian Ye
Lichen Bai
Zhiqiang Xu
Zeke Xie
DiffM
44
0
0
16 Nov 2024
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation
Xiaofeng Wang
Kang Zhao
F. Liu
Jiayu Wang
Guosheng Zhao
Xiaoyi Bao
Zheng Hua Zhu
Yingya Zhang
Xingang Wang
VGen
53
5
0
13 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
M. Zhang
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
46
9
0
08 Nov 2024
Analyzing The Language of Visual Tokens
David M. Chan
Rodolfo Corona
J. S. Park
Cheol Jun Cho
Yutong Bai
Trevor Darrell
21
2
0
07 Nov 2024
Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation
Ke Fan
J. Zhang
Ran Yi
Jingyu Gong
Yabiao Wang
Yating Wang
Xin Tan
Chengjie Wang
Lizhuang Ma
29
2
0
06 Nov 2024
GenXD: Generating Any 3D and 4D Scenes
Yuyang Zhao
Chung-Ching Lin
Kevin Qinghong Lin
Zhiwen Yan
Linjie Li
Z. Yang
Jianfeng Wang
G. Lee
Lijuan Wang
VGen
43
14
0
04 Nov 2024
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
Yongxin Zhu
B. Li
Yifei Xin
Linli Xu
30
10
0
04 Nov 2024
Randomized Autoregressive Visual Generation
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VGen
DiffM
50
28
1
01 Nov 2024
Diffusion Beats Autoregressive: An Evaluation of Compositional Generation in Text-to-Image Models
Arash Marioriyad
Parham Rezaei
M. Baghshah
M. Rohban
CoGe
49
0
0
30 Oct 2024
Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective
Shenghao Xie
Wenqiang Zu
Mingyang Zhao
Duo Su
Shilong Liu
Ruohua Shi
Guoqi Li
Shanghang Zhang
Lei Ma
LRM
40
3
0
29 Oct 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
16
9
0
28 Oct 2024
Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction
Junyi Chen
Di Huang
Weicai Ye
Wanli Ouyang
Tong He
LRM
28
1
0
24 Oct 2024
Elucidating the design space of language models for image generation
Xuantong Liu
Shaozhe Hao
Xianbiao Qi
Tianyang Hu
Jun Wang
Rong Xiao
Yuan Yao
VLM
25
3
0
21 Oct 2024
SEA: State-Exchange Attention for High-Fidelity Physics Based Transformers
Parsa Esmati
Amirhossein Dadashzadeh
Vahid Goodarzi
Nicolas Larrosa
Nicolo Grilli
19
0
0
20 Oct 2024
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities
Shaozhe Hao
Xuantong Liu
Xianbiao Qi
Shihao Zhao
Bojia Zi
Rong Xiao
Kai Han
Kwan-Yee K. Wong
36
3
0
18 Oct 2024
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
Chengyue Wu
Xiaokang Chen
Z. F. Wu
Yiyang Ma
Xingchao Liu
...
Wen Liu
Zhenda Xie
Xingkai Yu
Chong Ruan
Ping Luo
AI4TS
44
70
0
17 Oct 2024
MEV Capture Through Time-Advantaged Arbitrage
Robin Fritsch
Maria Ines Silva
A. Mamageishvili
Benjamin Livshits
E. Felten
21
1
0
14 Oct 2024
Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling
Wenze Liu
Le Zhuo
Yi Xin
Sheng Xia
Peng Gao
Xiangyu Yue
29
6
0
14 Oct 2024
Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment
Huayu Chen
Hang Su
Peize Sun
J. Zhu
VLM
25
3
0
12 Oct 2024
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation
Jiatao Gu
Yuyang Wang
Yizhe Zhang
Qihang Zhang
Dinghuai Zhang
Navdeep Jaitly
Josh Susskind
Shuangfei Zhai
DiffM
31
12
0
10 Oct 2024
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai
Tian-Chun Ye
Wei Chow
Enxin Song
Qing-Guo Chen
Xiangtai Li
Zhen Dong
Lei Zhu
46
13
0
10 Oct 2024
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
Qingwen Bu
Hongyang Li
Li Chen
Jisong Cai
Jia Zeng
Heming Cui
Maoqing Yao
Yu Qiao
34
2
0
10 Oct 2024
MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion
Onkar Susladkar
Jishu Sen Gupta
Chirag Sehgal
Sparsh Mittal
Rekha Singhal
DiffM
VGen
28
0
0
10 Oct 2024
CAR: Controllable Autoregressive Modeling for Visual Generation
Ziyu Yao
Jialin Li
Yifeng Zhou
Yong Liu
Xi Jiang
Chengjie Wang
Feng Zheng
Yuexian Zou
Lei Li
DiffM
35
13
0
07 Oct 2024
Did You Hear That? Introducing AADG: A Framework for Generating Benchmark Data in Audio Anomaly Detection
Ksheeraja Raghavan
Samiran Gode
Ankit Parag Shah
Surabhi Raghavan
Wolfram Burgard
Bhiksha Raj
Rita Singh
25
0
0
04 Oct 2024
LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
Doohyuk Jang
Sihwan Park
J. Yang
Yeonsung Jung
Jihun Yun
Souvik Kundu
Sung-Yub Kim
Eunho Yang
33
7
0
04 Oct 2024
ControlAR: Controllable Image Generation with Autoregressive Models
Zongming Li
Tianheng Cheng
Shoufa Chen
Peize Sun
Haocheng Shen
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
DiffM
132
14
0
03 Oct 2024
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
42
23
0
03 Oct 2024
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
Liang Chen
Sinan Tan
Zefan Cai
Weichu Xie
Haozhe Zhao
Yichi Zhang
Junyang Lin
Jinze Bai
Tianyu Liu
Baobao Chang
ViT
44
3
0
02 Oct 2024
ImageFolder: Autoregressive Image Generation with Folded Tokens
Xiang Li
Kai Qiu
Hao Chen
Jason Kuen
Jiuxiang Gu
Bhiksha Raj
Zhe-nan Lin
VLM
31
17
0
02 Oct 2024
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Yao Teng
Han Shi
Xian Liu
Xuefei Ning
Guohao Dai
Yu Wang
Zhenguo Li
Xihui Liu
46
10
0
02 Oct 2024
MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation
Wenchao Chen
Liqiang Niu
Ziyao Lu
Fandong Meng
Jie Zhou
Mamba
22
4
0
30 Sep 2024