arXiv:2403.10030
Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers
15 March 2024
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
ViT
Papers citing "Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers"
5 / 5 papers shown
KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
Xingrui Wang
Jiang-Long Liu
Z. Wang
Xiaodong Yu
Jialian Wu
X. Sun
Yusheng Su
Alan L. Yuille
Zicheng Liu
Emad Barsoum
DiffM
VGen
46
0
0
13 Apr 2025
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
110
3
0
22 Nov 2024
Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information
Yi Chen
Jian Xu
Xu-Yao Zhang
Wen-Zhuo Liu
Yang-Yang Liu
Cheng-Lin Liu
24
3
0
02 Sep 2024
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,604
0
24 Feb 2021