ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.11837
  4. Cited By
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of
  99%

Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%

17 June 2024
Lei Zhu
Fangyun Wei
Yanye Lu
Dong Chen
    VLM
ArXivPDFHTML

Papers citing "Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%"

35 / 35 papers shown
Title
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
X. Zhang
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
60
0
0
05 May 2025
Fast Autoregressive Models for Continuous Latent Generation
Fast Autoregressive Models for Continuous Latent Generation
Tiankai Hang
Jianmin Bao
Fangyun Wei
Dong Chen
DiffM
70
0
0
24 Apr 2025
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens
Kaihang Pan
Wang Lin
Zhongqi Yue
Tenglong Ao
Liyu Jia
Wei Zhao
Juncheng Billy Li
Siliang Tang
Hanwang Zhang
42
1
0
20 Apr 2025
ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling
ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling
Dongchao Yang
Songxiang Liu
Haohan Guo
Jiankun Zhao
Yuanyuan Wang
...
Xubo Liu
Xueyuan Chen
Xu Tan
Xixin Wu
H. Meng
58
0
0
14 Apr 2025
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation
Tianwei Xiong
Jun Hao Liew
Zilong Huang
Jiashi Feng
Xihui Liu
31
0
0
11 Apr 2025
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
Siyuan Li
L. Zhang
Zedong Wang
Juanxi Tian
Cheng Tan
...
Chang Yu
Qingsong Xie
Haonan Lu
Haoqian Wang
Zhen Lei
46
0
0
01 Apr 2025
CODA: Repurposing Continuous VAEs for Discrete Tokenization
CODA: Repurposing Continuous VAEs for Discrete Tokenization
Zeyu Liu
Zanlin Ni
Yeguo Hua
Xin Deng
Xiao Ma
Cheng Zhong
Gao Huang
42
0
0
22 Mar 2025
Tokenize Image as a Set
Tokenize Image as a Set
Zigang Geng
Mengde Xu
Han Hu
Shuyang Gu
DiffM
48
0
0
20 Mar 2025
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
Y. Wang
Zhijie Lin
Yao Teng
Yuanzhi Zhu
Shuhuai Ren
Jiashi Feng
Xihui Liu
46
0
0
20 Mar 2025
Dual Codebook VQ: Enhanced Image Reconstruction with Reduced Codebook Size
Parisa Boodaghi Malidarreh
Jillur Rahman Saurav
T. Pham
Amir Hajighasemi
Anahita Samadi
Saurabh Shrinivas Maydeo
M. Nasr
Jacob M. Luber
46
0
0
13 Mar 2025
Robust Latent Matters: Boosting Image Generation with Sampling Error Synthesis
Robust Latent Matters: Boosting Image Generation with Sampling Error Synthesis
Kai Qiu
X. Li
Jason Kuen
H. Chen
Xiaohao Xu
Jiuxiang Gu
Yinyi Luo
Bhiksha Raj
Zhe-nan Lin
Marios Savvides
55
0
0
11 Mar 2025
SemHiTok: A Unified Image Tokenizer via Semantic-Guided Hierarchical Codebook for Multimodal Understanding and Generation
SemHiTok: A Unified Image Tokenizer via Semantic-Guided Hierarchical Codebook for Multimodal Understanding and Generation
Z. Chen
Chunwei Wang
Xiuwei Chen
Hang Xu
J. Han
Xiandan Liang
VLM
69
1
0
09 Mar 2025
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
Theodoros Kouzelis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
DRL
67
5
0
17 Feb 2025
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
H. Chen
Z. Wang
X. Li
X. Sun
Fangyi Chen
Jiang Liu
J. Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
106
6
0
14 Dec 2024
XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive
  Generation
XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation
X. Li
Kai Qiu
H. Chen
Jason Kuen
Jiuxiang Gu
J. Wang
Zhe-nan Lin
Bhiksha Raj
VLM
117
3
0
02 Dec 2024
Factorized Visual Tokenization and Generation
Factorized Visual Tokenization and Generation
Zechen Bai
Jianxiong Gao
Ziteng Gao
Pichao Wang
Zheng Zhang
Tong He
Mike Zheng Shou
64
3
0
25 Nov 2024
Deep Learning-Based Image Compression for Wireless Communications:
  Impacts on Reliability,Throughput, and Latency
Deep Learning-Based Image Compression for Wireless Communications: Impacts on Reliability,Throughput, and Latency
Mostafa Naseri
Pooya Ashtari
Mohamed Seif
E. De Poorter
H. Poor
A. Shahid
25
0
0
16 Nov 2024
Autoregressive Models in Vision: A Survey
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
M. Zhang
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
46
9
0
08 Nov 2024
Adaptive Length Image Tokenization via Recurrent Allocation
Adaptive Length Image Tokenization via Recurrent Allocation
Shivam Duggal
Phillip Isola
Antonio Torralba
William T. Freeman
VLM
29
4
0
04 Nov 2024
Addressing Representation Collapse in Vector Quantized Models with One
  Linear Layer
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
Yongxin Zhu
B. Li
Yifei Xin
Linli Xu
36
10
0
04 Nov 2024
Elucidating the design space of language models for image generation
Elucidating the design space of language models for image generation
Xuantong Liu
Shaozhe Hao
Xianbiao Qi
Tianyang Hu
Jun Wang
Rong Xiao
Yuan Yao
VLM
30
3
0
21 Oct 2024
Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension
Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension
Yin Xie
Kaicheng Yang
Ninghua Yang
Weimo Deng
Xiangzi Dai
...
Yumeng Wang
Xiang An
Yongle Zhao
Ziyong Feng
Jiankang Deng
MLLM
VLM
40
1
0
18 Oct 2024
ImageFolder: Autoregressive Image Generation with Folded Tokens
ImageFolder: Autoregressive Image Generation with Folded Tokens
Xiang Li
Kai Qiu
Hao Chen
Jason Kuen
Jiuxiang Gu
Bhiksha Raj
Zhe-nan Lin
VLM
34
18
0
02 Oct 2024
Quantised Global Autoencoder: A Holistic Approach to Representing Visual
  Data
Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data
Tim Elsner
Paula Usinger
Victor Czech
Gregor Kobsik
Yanjiang He
I. Lim
Leif Kobbelt
39
1
0
16 Jul 2024
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for
  Interleaved Image-Text Generation
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation
Ethan Chern
Jiadi Su
Yan Ma
Pengfei Liu
MLLM
29
26
0
08 Jul 2024
De-Diffusion Makes Text a Strong Cross-Modal Interface
De-Diffusion Makes Text a Strong Cross-Modal Interface
Chen Wei
Chenxi Liu
Siyuan Qiao
Zhishuai Zhang
Alan Yuille
Jiahui Yu
VLM
DiffM
29
10
0
01 Nov 2023
Not All Image Regions Matter: Masked Vector Quantization for
  Autoregressive Image Generation
Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation
Mengqi Huang
Zhendong Mao
Quang Wang
Yongdong Zhang
VGen
DiffM
68
21
0
23 May 2023
Regularized Vector Quantization for Tokenized Image Synthesis
Regularized Vector Quantization for Tokenized Image Synthesis
Jiahui Zhang
Fangneng Zhan
Christian Theobalt
Shijian Lu
DiffM
MQ
33
30
0
11 Mar 2023
MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation
MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation
Chuanxia Zheng
L. Vuong
Jianfei Cai
Dinh Q. Phung
MQ
58
72
0
19 Sep 2022
Improved Vector Quantized Diffusion Models
Improved Vector Quantized Diffusion Models
Zhicong Tang
Shuyang Gu
Jianmin Bao
Dong Chen
Fang Wen
DiffM
173
63
0
31 May 2022
Autoregressive Image Generation using Residual Quantization
Autoregressive Image Generation using Residual Quantization
Doyup Lee
Chiheon Kim
Saehoon Kim
Minsu Cho
Wook-Shin Han
VGen
168
325
0
03 Mar 2022
Masked Autoencoders Are Scalable Vision Learners
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
A Style-Based Generator Architecture for Generative Adversarial Networks
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
262
10,320
0
12 Dec 2018
Pixel Recurrent Neural Networks
Pixel Recurrent Neural Networks
Aaron van den Oord
Nal Kalchbrenner
Koray Kavukcuoglu
SSeg
GAN
225
2,542
0
25 Jan 2016
U-Net: Convolutional Networks for Biomedical Image Segmentation
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
232
75,445
0
18 May 2015
1