Less is More: Pay Less Attention in Vision Transformers
arXiv:2105.14217 · 29 May 2021 · ViT
Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai
Papers citing "Less is More: Pay Less Attention in Vision Transformers" (18 / 18 papers shown)
1. Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding
   Yiming Zhang, Zhuokai Zhao, Zhaorun Chen, Zenghui Ding, Xianjun Yang, Yining Sun · 21 Nov 2024
2. Brain-Inspired Stepwise Patch Merging for Vision Transformers
   Yonghao Yu, Dongcheng Zhao, Guobin Shen, Yiting Dong, Yi Zeng · 11 Sep 2024
3. FViT: A Focal Vision Transformer with Gabor Filter
   Yulong Shi, Mingwei Sun, Yongshuai Wang, Rui Wang · 17 Feb 2024
4. Morphing Tokens Draw Strong Masked Image Models
   Taekyung Kim, Byeongho Heo, Dongyoon Han · 30 Dec 2023
5. EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
   Yulong Shi, Mingwei Sun, Yongshuai Wang, Hui Sun, Zengqiang Chen · 10 Oct 2023
6. Frequency Disentangled Features in Neural Image Compression
   Ali Zafari, Atefeh Khoshkhahtinat, P. Mehta, Mohammad Saeed Ebrahimi Saadabadi, Mohammad Akyash, Nasser M. Nasrabadi · 04 Aug 2023
7. Lightweight Vision Transformer with Bidirectional Interaction
   Qihang Fan, Huaibo Huang, Xiaoqiang Zhou, Ran He · 01 Jun 2023 · ViT
8. FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
   Pavan Kumar Anasosalu Vasu, J. Gabriel, Jeff J. Zhu, Oncel Tuzel, Anurag Ranjan · 24 Mar 2023 · ViT
9. Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
   Paria Mehrani, John K. Tsotsos · 02 Mar 2023
10. Efficiency 360: Efficient Vision Transformers
    Badri N. Patro, Vijay Srinivas Agneeswaran · 16 Feb 2023
11. Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated Operations
    Tan Yu, Ping Li · 25 Nov 2022 · ViT
12. Rega-Net: Retina Gabor Attention for Deep Convolutional Neural Networks
    Chun Bao, Jie Cao, Yaqian Ning, Yang Cheng, Q. Hao · 23 Nov 2022
13. Explicitly Increasing Input Information Density for Vision Transformers on Small Datasets
    Xiangyu Chen, Ying Qin, Wenju Xu, A. Bur, Cuncong Zhong, Guanghui Wang · 25 Oct 2022 · ViT
14. SWAT: Spatial Structure Within and Among Tokens
    Kumara Kahatapitiya, Michael S. Ryoo · 26 Nov 2021
15. Pruning Self-attentions into Convolutional Layers in Single Path
    Haoyu He, Jianfei Cai, Jing Liu, Zizheng Pan, Jing Zhang, Dacheng Tao, Bohan Zhuang · 23 Nov 2021 · ViT
16. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
    Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao · 24 Feb 2021 · ViT
17. Bottleneck Transformers for Visual Recognition
    A. Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani · 27 Jan 2021 · SLR
18. Semantic Understanding of Scenes through the ADE20K Dataset
    Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler, Adela Barriuso, Antonio Torralba · 18 Aug 2016 · SSeg