Cited By: Accelerating Transformers with Spectrum-Preserving Token Merging (arXiv 2405.16148)
25 May 2024
Hoai-Chau Tran, D. M. Nguyen, Duy M. Nguyen, Trung Thanh Nguyen, Ngan Le, Pengtao Xie, Daniel Sonntag, James Y. Zou, Binh T. Nguyen, Mathias Niepert
Papers citing "Accelerating Transformers with Spectrum-Preserving Token Merging" (13 of 13 papers shown)
1. Flow Along the K-Amplitude for Generative Modeling (27 Apr 2025)
   Weitao Du, Shuning Chang, Jiasheng Tang, Yu Rong, F. Wang, Shengchao Liu
   44 / 0 / 0

2. Long-VMNet: Accelerating Long-Form Video Understanding via Fixed Memory (17 Mar 2025) [VLM]
   Saket Gurukar, Asim Kadav
   45 / 0 / 0

3. Efficient Online Inference of Vision Transformers by Training-Free Tokenization (23 Nov 2024) [ViT]
   Leonidas Gee, Wing Yan Li, V. Sharmanska, Novi Quadrianto
   77 / 0 / 0

4. I-MPN: Inductive Message Passing Network for Efficient Human-in-the-Loop Annotation of Mobile Eye Tracking Data (10 Jun 2024)
   Hoang H. Le, D. M. Nguyen, Omair Shahzad Bhatti, Laszlo Kopacsi, Thinh P. Ngo, Binh T. Nguyen, Michael Barz, Daniel Sonntag
   31 / 0 / 0

5. Focus on the Core: Efficient Attention via Pruned Token Compression for Document Classification (03 Jun 2024)
   Jungmin Yun, Mihyeon Kim, Youngbin Kim
   60 / 5 / 0

6. What Do Self-Supervised Vision Transformers Learn? (01 May 2023) [SSL]
   Namuk Park, Wonjae Kim, Byeongho Heo, Taekyung Kim, Sangdoo Yun
   62 / 76 / 1

7. BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models (30 Jan 2023) [VLM, MLLM]
   Junnan Li, Dongxu Li, Silvio Savarese, Steven C. H. Hoi
   244 / 4,186 / 0

8. Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering (20 Sep 2022) [ELM, ReLM, LRM]
   Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, A. Kalyan
   198 / 1,089 / 0

9. Hydra Attention: Efficient Attention with Many Heads (15 Sep 2022)
   Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Judy Hoffman
   93 / 75 / 0

10. BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation (28 Jan 2022) [MLLM, BDL, VLM, CLIP]
    Junnan Li, Dongxu Li, Caiming Xiong, S. Hoi
    380 / 4,010 / 0

11. Masked Autoencoders Are Scalable Vision Learners (11 Nov 2021) [ViT, TPM]
    Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick
    258 / 7,337 / 0

12. Combiner: Full Attention Transformer with Sparse Computation Cost (12 Jul 2021)
    Hongyu Ren, H. Dai, Zihang Dai, Mengjiao Yang, J. Leskovec, Dale Schuurmans, Bo Dai
    73 / 66 / 0

13. ImageNet Large Scale Visual Recognition Challenge (01 Sep 2014) [VLM, ObjD]
    Olga Russakovsky, Jia Deng, Hao Su, J. Krause, S. Satheesh, ..., A. Karpathy, A. Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei
    279 / 39,083 / 0