arXiv:2110.04627
Vector-quantized Image Modeling with Improved VQGAN
9 October 2021
Jiahui Yu
Xin Li
Jing Yu Koh
Han Zhang
Ruoming Pang
James Qin
Alexander Ku
Yuanzhong Xu
Jason Baldridge
Yonghui Wu
ViT
VLM
DRL
Papers citing "Vector-quantized Image Modeling with Improved VQGAN"
50 / 372 papers shown
Title
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
76
6
0
27 Feb 2025
Architect of the Bits World: Masked Autoregressive Modeling for Circuit Generation Guided by Truth Table
Haoyuan Wu
Haisheng Zheng
Shoubo Hu
Zhuolun He
Bei Yu
45
0
0
18 Feb 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM
VGen
72
0
0
18 Feb 2025
From Principles to Applications: A Comprehensive Survey of Discrete Tokenizers in Generation, Comprehension, Recommendation, and Information Retrieval
Jian Jia
Jingtong Gao
Ben Xue
Junhao Wang
Qingpeng Cai
Quan Chen
Xiangyu Zhao
Peng Jiang
Kun Gai
OffRL
67
0
0
18 Feb 2025
MARS: Mesh AutoRegressive Model for 3D Shape Detailization
Jingnan Gao
Weizhe Liu
Weixuan Sun
Senbo Wang
Xibin Song
...
Shenzhou Chen
Hongdong Li
X. J. Yang
Yichao Yan
Pan Ji
74
2
0
17 Feb 2025
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
Theodoros Kouzelis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
DRL
67
5
0
17 Feb 2025
UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths
Weijia Mao
Z. Yang
Mike Zheng Shou
MoE
65
0
0
10 Feb 2025
UniDemoiré: Towards Universal Image Demoiréing with Data Generation and Synthesis
Zemin Yang
Yujing Sun
Xidong Peng
S. Yiu
Yuexin Ma
DiffM
71
1
0
10 Feb 2025
VILP: Imitation Learning with Latent Video Planning
Zhengtong Xu
Qiang Qiu
Yu She
VGen
70
1
0
03 Feb 2025
Visual Generation Without Guidance
Huayu Chen
Kai Jiang
Kaiwen Zheng
Jianfei Chen
Hang Su
J. Zhu
55
0
0
28 Jan 2025
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Dongwon Kim
Ju He
Qihang Yu
Chenglin Yang
Xiaohui Shen
Suha Kwak
Liang-Chieh Chen
VLM
46
6
0
13 Jan 2025
EditAR: Unified Conditional Generation with Autoregressive Models
Jiteng Mu
Nuno Vasconcelos
X. Wang
DiffM
38
3
0
08 Jan 2025
Learning the Language of Protein Structure
Benoit Gaujac
Jérémie Donà
Liviu Copoiu
Timothy Atkinson
Thomas Pierrot
Thomas D. Barrett
51
10
0
08 Jan 2025
DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT
Xiaotao Hu
Wei Yin
Mingkai Jia
Junyuan Deng
Xiaoyang Guo
Qian Zhang
Xiaoxiao Long
Ping Tan
VGen
34
10
0
31 Dec 2024
Hierarchical Vector Quantization for Unsupervised Action Segmentation
Federico Spurio
Emad Bahrami
Gianpiero Francesca
Juergen Gall
39
0
0
23 Dec 2024
OLiDM: Object-aware LiDAR Diffusion Models for Autonomous Driving
Tianyi Yan
Junbo Yin
Xianpeng Lang
Ruigang Yang
Cheng-Zhong Xu
Jianbing Shen
AI4CE
31
1
0
23 Dec 2024
VidTwin: Video VAE with Decoupled Structure and Dynamics
Yuchi Wang
Junliang Guo
Xinyi Xie
Tianyu He
Xu Sun
Jiang Bian
DRL
VGen
73
3
0
23 Dec 2024
When Worse is Better: Navigating the compression-generation tradeoff in visual tokenization
Vivek Ramanujan
Kushal Tirumala
Armen Aghajanyan
Luke Zettlemoyer
Ali Farhadi
DiffM
74
2
0
20 Dec 2024
Parallelized Autoregressive Visual Generation
Y. Wang
Shuhuai Ren
Zhijie Lin
Yujin Han
Haoyuan Guo
Zhenheng Yang
Difan Zou
Jiashi Feng
Xihui Liu
VGen
84
11
0
19 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
120
8
0
19 Dec 2024
Incorporating Feature Pyramid Tokenization and Open Vocabulary Semantic Segmentation
J. Zhang
Li Zhang
Shijian Li
VLM
76
0
0
18 Dec 2024
Wonderland: Navigating 3D Scenes from a Single Image
Hanwen Liang
Junli Cao
Vidit Goel
Guocheng Qian
Sergei Korolev
Demetri Terzopoulos
Konstantinos N. Plataniotis
Sergey Tulyakov
Jian Ren
VGen
125
11
0
16 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
H. Chen
Z. Wang
X. Li
X. Sun
Fangyi Chen
Jiang Liu
J. Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
108
6
0
14 Dec 2024
Buster: Implanting Semantic Backdoor into Text Encoder to Mitigate NSFW Content Generation
Xin Zhao
Xiaojun Chen
Yuexin Xuan
Zhendong Zhao
Xiaojun Jia
Xinfeng Li
Xiaofeng Wang
72
0
0
10 Dec 2024
[MASK] is All You Need
Vincent Tao Hu
Bjorn Ommer
DiffM
135
2
0
09 Dec 2024
DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models
Yizhuo Li
Yuying Ge
Yixiao Ge
Ping Luo
Ying Shan
DiffM
VGen
90
0
0
05 Dec 2024
LMDM: Latent Molecular Diffusion Model For 3D Molecule Generation
Xiang Chen
DiffM
69
0
0
05 Dec 2024
MFTF: Mask-free Training-free Object Level Layout Control Diffusion Model
Shan Yang
DiffM
71
0
0
02 Dec 2024
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Zichun Liao
Yusuke Kato
Kazuki Kozuka
Aditya Grover
VGen
90
5
0
02 Dec 2024
Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis
Anton Voronov
Denis Kuznedelev
Mikhail Khoroshikh
Valentin Khrulkov
Dmitry Baranchuk
106
2
0
02 Dec 2024
Scaling Transformers for Low-Bitrate High-Quality Speech Coding
Julian Parker
Anton Smirnov
Jordi Pons
CJ Carr
Zack Zukowski
Zach Evans
Xubo Liu
75
9
0
29 Nov 2024
3D-WAG: Hierarchical Wavelet-Guided Autoregressive Generation for High-Fidelity 3D Shapes
Tejaswini Medi
Arianna Rampini
Pradyumna Reddy
P. Jayaraman
M. Keuper
DiffM
79
0
0
28 Nov 2024
Factorized Visual Tokenization and Generation
Zechen Bai
Jianxiong Gao
Ziteng Gao
Pichao Wang
Zheng Zhang
Tong He
Mike Zheng Shou
64
3
0
25 Nov 2024
Representation Collapsing Problems in Vector Quantization
Wenhao Zhao
Qiran Zou
Rushi Shah
Dianbo Liu
67
1
0
25 Nov 2024
Efficient Online Inference of Vision Transformers by Training-Free Tokenization
Leonidas Gee
Wing Yan Li
V. Sharmanska
Novi Quadrianto
ViT
88
0
0
23 Nov 2024
GFT: Graph Foundation Model with Transferable Tree Vocabulary
Zehong Wang
Zheyuan Zhang
Nitesh V. Chawla
Chuxu Zhang
Yanfang Ye
39
9
0
09 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
M. Zhang
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
46
9
0
08 Nov 2024
Scaling Laws for Pre-training Agents and World Models
Tim Pearce
Tabish Rashid
Dave Bignell
Raluca Georgescu
Sam Devlin
Katja Hofmann
LM&Ro
37
7
0
07 Nov 2024
Image Understanding Makes for A Good Tokenizer for Image Generation
Luting Wang
Yang Zhao
Zijian Zhang
Jiashi Feng
Si Liu
Bingyi Kang
VLM
26
4
0
07 Nov 2024
Adaptive Length Image Tokenization via Recurrent Allocation
Shivam Duggal
Phillip Isola
Antonio Torralba
William T. Freeman
VLM
29
5
0
04 Nov 2024
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
Yongxin Zhu
B. Li
Yifei Xin
Linli Xu
36
10
0
04 Nov 2024
Bootstrapping Top-down Information for Self-modulating Slot Attention
Dongwon Kim
Seoyeon Kim
Suha Kwak
OCL
ObjD
27
0
0
04 Nov 2024
VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization
Yiwei Zhang
Jin Gao
Fudong Ge
Guan Luo
Bing Li
Z. Zhang
Haibin Ling
Weiming Hu
52
0
0
03 Nov 2024
Randomized Autoregressive Visual Generation
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VGen
DiffM
57
30
1
01 Nov 2024
Optimizing Contextual Speech Recognition Using Vector Quantization for Efficient Retrieval
Nikolaos Flemotomos
Roger Hsiao
P. Swietojanski
Takaaki Hori
Dogan Can
Xiaodan Zhuang
44
0
0
01 Nov 2024
Constant Acceleration Flow
Dogyun Park
Sojin Lee
S. Kim
Taehoon Lee
Youngjoon Hong
Hyunwoo J. Kim
50
2
0
01 Nov 2024
Identifying Spatio-Temporal Drivers of Extreme Events
Mohamad Hakam Shams Eddin
Juergen Gall
AI4TS
48
0
0
31 Oct 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
24
9
0
28 Oct 2024
Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models
Liulei Li
Wenguan Wang
Y. Yang
37
7
0
26 Oct 2024
Elucidating the design space of language models for image generation
Xuantong Liu
Shaozhe Hao
Xianbiao Qi
Tianyang Hu
Jun Wang
Rong Xiao
Yuan Yao
VLM
30
3
0
21 Oct 2024