arXiv: 2304.10716
Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers
21 April 2023
Siyuan Wei
Tianzhu Ye
Shen Zhang
Yao Tang
Jiajun Liang
ViT
Github (45★)
Papers citing
"Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers"
47 / 47 papers shown
Title
Dual-Priv Pruning: Efficient Differential Private Fine-Tuning in Multimodal Large Language Models
Qianshan Wei
Jiaqi Li
Zihan You
Yi Zhan
Kecen Li
...
Yi Yu
Bin Cao
Yiwen Xu
Teli Ma
Guilin Qi
AAML
VLM
63
0
0
08 Jun 2025
Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration
Fanhu Zeng
Deli Yu
Zhenglun Kong
Hao Tang
ViT
98
3
0
06 Jun 2025
S2AFormer: Strip Self-Attention for Efficient Vision Transformer
Guoan Xu
Wenfeng Huang
Wenjing Jia
Jiamao Li
Guangwei Gao
Guo-Jun Qi
119
0
0
28 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
Wenyuan Xu
Shibiao Xu
ViT
661
2
0
06 May 2025
Dynamic Vision Mamba
Mengxuan Wu
Zekai Li
Zhiyuan Liang
Moyang Li
Xuanlei Zhao
...
Xiaojiang Peng
Konstantinos N. Plataniotis
Wangbo Zhao
Yang You
Mamba
132
2
0
07 Apr 2025
Efficient Token Compression for Vision Transformer with Spatial Information Preserved
Junzhu Mao
Yang Shen
Jinyang Guo
Yazhou Yao
Xiansheng Hua
ViT
179
0
0
30 Mar 2025
Similarity-Aware Token Pruning: Your VLM but Faster
Ahmadreza Jeddi
Negin Baghbanzadeh
Elham Dolatabadi
Babak Taati
3DV
VLM
151
3
0
14 Mar 2025
Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing
Yudong Liu
Jingwei Sun
Yueqian Lin
Jingyang Zhang
Ming Yin
Qinsi Wang
Jing Zhang
Haoyang Li
Yiran Chen
VLM
271
2
0
13 Mar 2025
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning
Junwei Luo
Yingying Zhang
Xiaoyu Yang
Kang Wu
Qi Zhu
Lei Liang
Jingdong Chen
Yansheng Li
233
3
0
10 Mar 2025
Not Every Patch is Needed: Towards a More Efficient and Effective Backbone for Video-based Person Re-identification
Lanyun Zhu
Tianrun Chen
Deyi Ji
Jieping Ye
Jing Liu
210
3
0
28 Jan 2025
Deploying Foundation Model Powered Agent Services: A Survey
Wenchao Xu
Jinyu Chen
Peirong Zheng
Xiaoquan Yi
Tianyi Tian
...
Quan Wan
Yining Qi
Yunfeng Fan
Qinliang Su
Xuemin Shen
AI4CE
207
4
0
18 Dec 2024
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation
Ao Wang
Hui Chen
Jianchao Tan
Xunliang Cai
Zijia Lin
Jiawei Han
Jungong Han
Guiguang Ding
VLM
225
3
0
04 Dec 2024
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning
Yiwu Zhong
Zhuoming Liu
Yin Li
Liwei Wang
208
13
0
04 Dec 2024
Redundant Queries in DETR-Based 3D Detection Methods: Unnecessary and Prunable
Lizhen Xu
Shanmin Pang
Wenzhao Qiu
Zehao Wu
Xiuxiu Bai
K. Mei
Jianru Xue
189
2
0
03 Dec 2024
Token Cropr: Faster ViTs for Quite a Few Tasks
Benjamin Bergner
Christoph Lippert
Aravindh Mahendran
ViT
VLM
195
3
0
01 Dec 2024
Enhancing Instruction-Following Capability of Visual-Language Models by Reducing Image Redundancy
Te Yang
Jian Jia
Xiangyu Zhu
Weisong Zhao
Bo Wang
...
Shengyuan Liu
Quan Chen
Peng Jiang
Kun Gai
Zhen Lei
102
2
0
23 Nov 2024
Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation
Zixin Wang
Dong Gong
Sen Wang
Zi Huang
Yadan Luo
VLM
139
1
0
16 Oct 2024
Patch Ranking: Efficient CLIP by Learning to Rank Local Patches
Cheng-En Wu
Jinhong Lin
Yu Hen Hu
Pedro Morgado
VLM
72
2
0
22 Sep 2024
Agglomerative Token Clustering
Joakim Bruslund Haurum
Sergio Escalera
Graham W. Taylor
T. Moeslund
131
6
0
18 Sep 2024
Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models
Yiyi Zhou
Qiong Wu
Wenhao Lin
Weihao Ye
VLM
163
25
0
16 Sep 2024
Vote&Mix: Plug-and-Play Token Reduction for Efficient Vision Transformer
Shuai Peng
Di Fu
Baole Wei
Yong Cao
Liangcai Gao
Zhi Tang
ViT
98
3
0
30 Aug 2024
Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning
Shibo Jie
Yehui Tang
Jianyuan Guo
Zhi-Hong Deng
Kai Han
Yunhe Wang
VLM
111
6
0
13 Aug 2024
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
Chen Ju
Haicheng Wang
Haozhe Cheng
Xu Chen
Zhonghua Zhai
Weilin Huang
Jinsong Lan
Shuai Xiao
Bo Zheng
VLM
124
8
0
16 Jul 2024
Pruning One More Token is Enough: Leveraging Latency-Workload Non-Linearities for Vision Transformers on the Edge
Nick Eliopoulos
Purvish Jajal
James Davis
Gaowen Liu
George K. Thiruvathukal
Yung-Hsiang Lu
104
4
0
01 Jul 2024
DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming
Jiaxin Zhang
Wentao Yang
Songxuan Lai
Zecheng Xie
Lianwen Jin
124
25
0
27 Jun 2024
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference
Zhongwei Wan
Ziang Wu
Che Liu
Jinfa Huang
Zhihong Zhu
Peng Jin
Longyue Wang
Li Yuan
VLM
139
48
0
26 Jun 2024
Speeding Up Image Classifiers with Little Companions
Yang Liu
Kowshik Thopalli
Jayaraman J. Thiagarajan
VLM
114
0
0
24 Jun 2024
D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
Zhongwei Wan
Xinjian Wu
Yu Zhang
Yi Xin
Chaofan Tao
...
Xin Wang
Siqi Luo
Jing Xiong
Mi Zhang
160
2
0
18 Jun 2024
ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers
Narges Norouzi
Svetlana Orlova
Daan de Geus
Gijs Dubbelman
ViT
FedML
98
9
0
14 Jun 2024
Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking
Xiangyang Yang
Dan Zeng
Xucheng Wang
You Wu
Hengzhou Ye
Qijun Zhao
Shuiwang Li
161
10
0
12 Jun 2024
Focus on the Core: Efficient Attention via Pruned Token Compression for Document Classification
Jungmin Yun
Mihyeon Kim
Youngbin Kim
147
10
0
03 Jun 2024
Automatic Channel Pruning for Multi-Head Attention
Eunho Lee
Youngbae Hwang
ViT
104
1
0
31 May 2024
Accelerating Transformers with Spectrum-Preserving Token Merging
Hoai-Chau Tran
D. M. Nguyen
Duy M. Nguyen
Trung Thanh Nguyen
Ngan Le
Pengtao Xie
Daniel Sonntag
James Y. Zou
Binh T. Nguyen
Mathias Niepert
134
16
0
25 May 2024
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
Akide Liu
Jing Liu
Zizheng Pan
Yefei He
Gholamreza Haffari
Bohan Zhuang
MQ
127
43
0
23 May 2024
Arena: A Patch-of-Interest ViT Inference Acceleration System for Edge-Assisted Video Analytics
Haosong Peng
Wei Feng
Hao Li
Yufeng Zhan
Qihua Zhou
Yuanqing Xia
81
3
0
14 Apr 2024
MLP Can Be A Good Transformer Learner
Sihao Lin
Pumeng Lyu
Dongrui Liu
Tao Tang
Xiaodan Liang
Andy Song
Xiaojun Chang
ViT
125
14
0
08 Apr 2024
The Need for Speed: Pruning Transformers with One Recipe
Samir Khaki
Konstantinos N. Plataniotis
122
11
0
26 Mar 2024
HCPM: Hierarchical Candidates Pruning for Efficient Detector-Free Matching
Ying Chen
Yong-Jin Liu
Kai Wu
Qiang Nie
Shang Xu
Huifang Ma
Bing Wang
Chengjie Wang
VLM
87
1
0
19 Mar 2024
MPTQ-ViT: Mixed-Precision Post-Training Quantization for Vision Transformer
Y. Tai
An-Yeu Wu
MQ
134
7
0
26 Jan 2024
F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis
Jingkuan Song
Jianzhi Liu
Lianli Gao
Jingkuan Song
DiffM
VGen
85
8
0
06 Dec 2023
Accelerating Vision Transformers Based on Heterogeneous Attention Patterns
Deli Yu
Teng Xi
Jianwei Li
Baopu Li
Gang Zhang
Haocheng Feng
Junyu Han
Jingtuo Liu
Errui Ding
Jingdong Wang
ViT
107
2
0
11 Oct 2023
ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models
Yi-Lin Sung
Jaehong Yoon
Mohit Bansal
VLM
115
15
0
04 Oct 2023
PPT: Token Pruning and Pooling for Efficient Vision Transformers
Xinjian Wu
Fanhu Zeng
Xiudong Wang
Xinghao Chen
ViT
132
32
0
03 Oct 2023
CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
Ao Wang
Hui Chen
Zijia Lin
Sicheng Zhao
Jiawei Han
Guiguang Ding
ViT
76
8
0
27 Sep 2023
RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework
Yuelei Wang
Ting Zhang
Liangjin Zhao
Lin Hu
Zhechao Wang
...
Kaiqiang Chen
Xuan Zeng
Zhirui Wang
Hongqi Wang
Xian Sun
109
6
0
16 Sep 2023
Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers
Hongjie Wang
Bhishma Dedhia
N. Jha
ViT
VLM
183
42
0
27 May 2023
Pruning Self-attentions into Convolutional Layers in Single Path
Haoyu He
Jianfei Cai
Jing Liu
Zizheng Pan
Jing Zhang
Dacheng Tao
Bohan Zhuang
ViT
126
47
0
23 Nov 2021