Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2106.02034
Cited By
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
3 June 2021
Yongming Rao
Wenliang Zhao
Benlin Liu
Jiwen Lu
Jie Zhou
Cho-Jui Hsieh
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification"
50 / 128 papers shown
Title
PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
Qingqing Cao
Bhargavi Paranjape
Hannaneh Hajishirzi
MLLM
VLM
8
20
0
27 May 2023
Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers
Hongjie Wang
Bhishma Dedhia
N. Jha
ViT
VLM
36
26
0
27 May 2023
Do We Really Need a Large Number of Visual Prompts?
Youngeun Kim
Yuhang Li
Abhishek Moitra
Ruokai Yin
Priyadarshini Panda
VLM
VPVLM
40
5
0
26 May 2023
PastNet: Introducing Physical Inductive Biases for Spatio-temporal Video Prediction
Hao Wu
Wei Xion
Fan Xu
Xian-Sheng Hua
C. L. Philip Chen
Xiansheng Hua
AI4TS
26
27
0
19 May 2023
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
Chong Yu
Tao Chen
Zhongxue Gan
Jiayuan Fan
MQ
ViT
25
23
0
18 May 2023
AutoFocusFormer: Image Segmentation off the Grid
Chen Ziwen
K. Patnaik
Shuangfei Zhai
Alvin Wan
Zhile Ren
A. Schwing
Alex Colburn
Li Fuxin
17
9
0
24 Apr 2023
Efficient Video Action Detection with Token Dropout and Context Refinement
Lei Chen
Zhan Tong
Yibing Song
Gangshan Wu
Limin Wang
36
14
0
17 Apr 2023
RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer
Jiahao Wang
Songyang Zhang
Yong Liu
Taiqiang Wu
Yujiu Yang
Xihui Liu
Kai-xiang Chen
Ping Luo
Dahua Lin
30
20
0
12 Apr 2023
DynamicDet: A Unified Dynamic Architecture for Object Detection
Zhi-Hao Lin
Yongtao Wang
Jinhe Zhang
Xiaojie Chu
ObjD
23
30
0
12 Apr 2023
Distilling Token-Pruned Pose Transformer for 2D Human Pose Estimation
Feixiang Ren
ViT
19
2
0
12 Apr 2023
Learning Dynamic Style Kernels for Artistic Style Transfer
Wenju Xu
Chengjiang Long
Yongwei Nie
23
14
0
02 Apr 2023
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen
Zhijian Liu
Haotian Tang
Li Yi
Hang Zhao
Song Han
ViT
21
46
0
30 Mar 2023
SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
Abdelrahman M. Shaker
Muhammad Maaz
H. Rasheed
Salman Khan
Ming Yang
F. Khan
ViT
42
84
0
27 Mar 2023
Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
Peng Jin
Jinfa Huang
Pengfei Xiong
Shangxuan Tian
Chang-rui Liu
Xiang Ji
Li-ming Yuan
Jie Chen
30
49
0
25 Mar 2023
Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting
Gen Li
Jie Ji
Minghai Qin
Wei Niu
Bin Ren
Fatemeh Afghah
Lin Guo
Xiaolong Ma
SupR
95
11
0
15 Mar 2023
Token Sparsification for Faster Medical Image Segmentation
Lei Zhou
Huidong Liu
Joseph Bae
Junjun He
Dimitris Samaras
Prateek Prasanna
MedIm
24
3
0
11 Mar 2023
Efficient Transformer-based 3D Object Detection with Dynamic Token Halting
Mao Ye
Gregory P. Meyer
Yuning Chai
Qiang Liu
32
8
0
09 Mar 2023
Training-Free Acceleration of ViTs with Delayed Spatial Merging
J. Heo
Seyedarmin Azizi
A. Fayyazi
Massoud Pedram
36
3
0
04 Mar 2023
SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing
Weidong Chen
Xiaofen Xing
Xiangmin Xu
Jianxin Pang
Lan Du
30
38
0
27 Feb 2023
Map-and-Conquer: Energy-Efficient Mapping of Dynamic Neural Nets onto Heterogeneous MPSoCs
Halima Bouzidi
Mohanad Odema
Hamza Ouarnoughi
Smail Niar
Mohammad Abdullah Al Faruque
21
8
0
24 Feb 2023
A residual dense vision transformer for medical image super-resolution with segmentation-based perceptual loss fine-tuning
Jin Zhu
Guang Yang
Pietro Lio'
ViT
MedIm
29
5
0
22 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li
M. Wang
Sijia Liu
Pin-Yu Chen
ViT
MLT
35
56
0
12 Feb 2023
GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous Structured Pruning for Vision Transformer
Miao Yin
Burak Uzkent
Yilin Shen
Hongxia Jin
Bo Yuan
ViT
24
13
0
13 Jan 2023
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Shashanka Venkataramanan
Amir Ghodrati
Yuki M. Asano
Fatih Porikli
A. Habibian
ViT
15
25
0
05 Jan 2023
Rethinking Vision Transformers for MobileNet Size and Speed
Yanyu Li
Ju Hu
Yang Wen
Georgios Evangelidis
Kamyar Salahi
Yanzhi Wang
Sergey Tulyakov
Jian Ren
ViT
30
159
0
15 Dec 2022
Dynamic Feature Pruning and Consolidation for Occluded Person Re-Identification
Yuteng Ye
Hang Zhou
Jiale Cai
Chenxing Gao
Youjia Zhang
Junle Wang
Qiang Hu
Junqing Yu
Wei Yang
23
6
0
27 Nov 2022
SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-training
Yuanze Lin
Chen Wei
Huiyu Wang
Alan Yuille
Cihang Xie
3DGS
28
15
0
21 Nov 2022
Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training
Zhenglun Kong
Haoyu Ma
Geng Yuan
Mengshu Sun
Yanyue Xie
...
Tianlong Chen
Xiaolong Ma
Xiaohui Xie
Zhangyang Wang
Yanzhi Wang
ViT
26
22
0
19 Nov 2022
TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer
Zhiyang Dou
Qingxuan Wu
Chu-Hsing Lin
Zeyu Cao
Qiangqiang Wu
Weilin Wan
Taku Komura
Wenping Wang
24
39
0
19 Nov 2022
Token Turing Machines
Michael S. Ryoo
K. Gopalakrishnan
Kumara Kahatapitiya
Ted Xiao
Kanishka Rao
Austin Stone
Yao Lu
Julian Ibarz
Anurag Arnab
27
21
0
16 Nov 2022
HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers
Peiyan Dong
Mengshu Sun
Alec Lu
Yanyue Xie
Li-Yu Daisy Liu
...
Xin Meng
Z. Li
Xue Lin
Zhenman Fang
Yanzhi Wang
ViT
26
58
0
15 Nov 2022
ProContEXT: Exploring Progressive Context Transformer for Tracking
Jinpeng Lan
Zhi-Qi Cheng
Ju He
Chenyang Li
Bin Luo
Xueting Bao
Wangmeng Xiang
Yifeng Geng
Xuansong Xie
38
29
0
27 Oct 2022
Token Merging: Your ViT But Faster
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Christoph Feichtenhofer
Judy Hoffman
MoMe
28
417
0
17 Oct 2022
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers
Hyeong Kyu Choi
Joonmyung Choi
Hyunwoo J. Kim
ViT
28
35
0
14 Oct 2022
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
Ling Li
D. Thorsley
Joseph Hassoun
ViT
25
17
0
11 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng-Wei Zhang
Chao Zhang
Hanhua Hu
25
25
0
03 Oct 2022
Learning Hierarchical Image Segmentation For Recognition and By Recognition
Tsung-Wei Ke
Sangwoo Mo
Stella X. Yu
VLM
29
9
0
01 Oct 2022
Effective Vision Transformer Training: A Data-Centric Perspective
Benjia Zhou
Pichao Wang
Jun Wan
Yan-Ni Liang
Fan Wang
26
5
0
29 Sep 2022
Dilated Neighborhood Attention Transformer
Ali Hassani
Humphrey Shi
ViT
MedIm
28
68
0
29 Sep 2022
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
Fengyuan Shi
Ruopeng Gao
Weilin Huang
Limin Wang
17
23
0
28 Sep 2022
Attacking Compressed Vision Transformers
Swapnil Parekh
Devansh Shah
Pratyush Shukla
AAML
19
1
0
28 Sep 2022
PPT: token-Pruned Pose Transformer for monocular and multi-view human pose estimation
Haoyu Ma
Zhe Wang
Yifei Chen
Deying Kong
Liangjian Chen
Xingwei Liu
Xiangyi Yan
Hao Tang
Xiaohui Xie
ViT
35
47
0
16 Sep 2022
Accelerating Vision Transformer Training via a Patch Sampling Schedule
Bradley McDanel
C. Huynh
ViT
25
1
0
19 Aug 2022
Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation
Jiaming Zhang
Kailun Yang
Haowen Shi
Simon Reiß
Kunyu Peng
Chaoxiang Ma
Haodong Fu
Philip H. S. Torr
Kaiwei Wang
Rainer Stiefelhagen
ViT
MDE
31
35
0
25 Jul 2022
Towards Efficient Adversarial Training on Vision Transformers
Boxi Wu
Jindong Gu
Zhifeng Li
Deng Cai
Xiaofei He
Wei Liu
ViT
AAML
35
37
0
21 Jul 2022
TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval
Yuqi Liu
Pengfei Xiong
Luhui Xu
Shengming Cao
Qin Jin
22
113
0
16 Jul 2022
CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers
Runsheng Xu
Zhengzhong Tu
Hao Xiang
Wei Shao
Bolei Zhou
Jiaqi Ma
44
218
0
05 Jul 2022
SimA: Simple Softmax-free Attention for Vision Transformers
Soroush Abbasi Koohpayegani
Hamed Pirsiavash
16
25
0
17 Jun 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
18
346
0
02 Jun 2022
Dynamic Linear Transformer for 3D Biomedical Image Segmentation
Zheyu Zhang
Ulas Bagci
ViT
MedIm
17
12
0
01 Jun 2022
Previous
1
2
3
Next