2507.08441
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation
11 July 2025
Anlin Zheng
Xin Wen
Xuanyang Zhang
Chuofan Ma
Tiancai Wang
Gang Yu
Xiangyu Zhang
Xiaojuan Qi
VLM
Papers citing
"Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation"
4 / 4 papers shown
Diffusion Transformers with Representation Autoencoders
Boyang Zheng
Nanye Ma
Shengbang Tong
Saining Xie
DiffM
205
44
0
13 Oct 2025
UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation
Zhengrong Yue
H. Zhang
Xiangyu Zeng
Boyu Chen
Chenting Wang
...
Lu Dong
Kunpeng Du
Yi Wang
Limin Wang
Yali Wang
189
7
0
12 Oct 2025
Aligning Visual Foundation Encoders to Tokenizers for Diffusion Models
Bowei Chen
Sai Bi
Hao Tan
Chentao Song
Tianyuan Zhang
Zhengqi Li
Yuanjun Xiong
Jianming Zhang
Kai Zhang
212
4
0
29 Sep 2025
WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction
Shaobin Zhuang
Yiwei Guo
Canmiao Fu
Z. Huang
Zeyue Tian
Ying Zhang
Ying Zhang
Chen Li
Yali Wang
ViT
223
2
0
07 Aug 2025