2107.00641
Cited By
Focal Self-attention for Local-Global Interactions in Vision Transformers
1 July 2021
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
Papers citing "Focal Self-attention for Local-Global Interactions in Vision Transformers"
50 / 262 papers shown
Title
Revisiting Noise Resilience Strategies in Gesture Recognition: Short-Term Enhancement in Surface Electromyographic Signal Analysis
Weiyu Guo
Ziyue Qiao
Ying Sun
Hui Xiong
83
1
0
17 Apr 2024
LUCF-Net: Lightweight U-shaped Cascade Fusion Network for Medical Image Segmentation
Songkai Sun
Qingshan She
Yuliang Ma
Rihui Li
Yingchun Zhang
MedIm
162
4
0
11 Apr 2024
Learning Correlation Structures for Vision Transformers
Manjin Kim
Paul Hongsuck Seo
Cordelia Schmid
Minsu Cho
ViT
237
25
0
05 Apr 2024
SpiralMLP: A Lightweight Vision MLP Architecture
Haojie Mu
Burhan Ul Tayyab
Nicholas Chua
184
1
0
31 Mar 2024
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
Donghyun Kim
Byeongho Heo
Dongyoon Han
224
31
0
28 Mar 2024
Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and Time-Series Analysis
Badri N. Patro
Suhas Ranganath
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
211
4
0
26 Mar 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
264
159
0
26 Mar 2024
ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions
Computer Vision and Pattern Recognition (CVPR), 2024
Chunlong Xia
Xinliang Wang
Feng Lv
Xin Hao
Yifeng Shi
ViT
266
113
0
12 Mar 2024
ACC-ViT : Atrous Convolution's Comeback in Vision Transformers
Nabil Ibtehaz
Ning Yan
Masood S. Mortazavi
Daisuke Kihara
ViT
290
5
0
07 Mar 2024
Interactive Multi-Head Self-Attention with Linear Complexity
Hankyul Kang
Ming-Hsuan Yang
Jongbin Ryu
155
3
0
27 Feb 2024
Multi-Human Mesh Recovery with Transformers
Zeyu Wang
Zhenzhen Weng
Serena Yeung-Levy
3DH
117
1
0
26 Feb 2024
ToDo: Token Downsampling for Efficient Generation of High-Resolution Images
Ethan Smith
Nayan Saxena
Aninda Saha
DiffM
128
7
0
21 Feb 2024
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
378
60
0
05 Feb 2024
Do deep neural networks utilize the weight space efficiently?
Onur Can Koyun
B. U. Toreyin
110
0
0
26 Jan 2024
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
International Conference on Machine Learning (ICML), 2024
Lianghui Zhu
Bencheng Liao
Qian Zhang
Xinlong Wang
Wenyu Liu
Xinggang Wang
Mamba
327
1,282
0
17 Jan 2024
SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting
International Journal of Computer Vision (IJCV), 2024
Mingxin Huang
Dezhi Peng
Hongliang Li
Zhenghao Peng
Chongyu Liu
Dahua Lin
Yuliang Liu
Xiang Bai
Lianwen Jin
311
6
0
15 Jan 2024
Self-Attention and Hybrid Features for Replay and Deep-Fake Audio Detection
Lian Huang
Chi-Man Pun
122
10
0
11 Jan 2024
Deformable Audio Transformer for Audio Event Detection
Wentao Zhu
151
0
0
24 Dec 2023
Agent Attention: On the Integration of Softmax and Linear Attention
European Conference on Computer Vision (ECCV), 2023
Dongchen Han
Tianzhu Ye
Yizeng Han
Zhuofan Xia
Siyuan Pan
Pengfei Wan
Shiji Song
Gao Huang
291
173
0
14 Dec 2023
The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel Size might be All You Need
Tianjin Huang
Tianlong Chen
Zinan Lin
Shiwei Liu
246
1
0
09 Dec 2023
DocBinFormer: A Two-Level Transformer Network for Effective Document Image Binarization
Risab Biswas
Swalpa Kumar Roy
Ning Wang
Umapada Pal
Guang-Bin Huang
ViT
127
3
0
06 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
696
1
0
01 Dec 2023
GeoDeformer: Geometric Deformable Transformer for Action Recognition
Jinhui Ye
Jiaming Zhou
Hui Xiong
Junwei Liang
ViT
87
1
0
29 Nov 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2023
Dai Shi
ViT
224
233
0
28 Nov 2023
Advancing Vision Transformers with Group-Mix Attention
Chongjian Ge
Xiaohan Ding
Zhan Tong
Lichao Sun
Jiangliu Wang
Yibing Song
Ping Luo
302
30
0
26 Nov 2023
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Computer Vision and Pattern Recognition (CVPR), 2023
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
281
347
0
10 Nov 2023
Vision Big Bird: Random Sparsification for Full Attention
Zhemin Zhang
Xun Gong
ViT
126
1
0
10 Nov 2023
Scattering Vision Transformer: Spectral Mixing Matters
Neural Information Processing Systems (NeurIPS), 2023
Badri N. Patro
Vijay Srinivas Agneeswaran
334
27
0
02 Nov 2023
VST++: Efficient and Stronger Visual Saliency Transformer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Nian Liu
Ziyang Luo
Ni Zhang
Junwei Han
ViT
187
40
0
18 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
IEEE Transactions on Cybernetics (IEEE Trans. Cybern.), 2023
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
321
8
0
10 Oct 2023
Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2023
Shiyue Cao
Yueqin Yin
Lianghua Huang
Yu Liu
Xin Zhao
Deli Zhao
Kaiqi Huang
ViT
212
28
0
09 Oct 2023
Single Stage Warped Cloth Learning and Semantic-Contextual Attention Feature Fusion for Virtual TryOn
IEEE International Conference on Multimedia and Expo (ICME), 2023
Sanhita Pathak
V. Kaushik
Brejesh Lall
DiffM
177
3
0
08 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
417
10
0
08 Oct 2023
Speech Audio Synthesis from Tagged MRI and Non-Negative Matrix Factorization via Plastic Transformer
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2023
Xiaofeng Liu
Fangxu Xing
Maureen Stone
Jiachen Zhuo
S. Fels
Jerry L. Prince
Xiaofeng Liu
Jonghye Woo
MedIm
102
3
0
26 Sep 2023
CINFormer: Transformer network with multi-stage CNN feature injection for surface defect segmentation
Xiaoheng Jiang
Kaiyi Guo
Yang Lu
Feng Yan
Hao Liu
Jiale Cao
Mingliang Xu
Dacheng Tao
MedIm
ViT
UQCV
152
2
0
22 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2023
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
497
159
0
20 Sep 2023
Mask-Attention-Free Transformer for 3D Instance Segmentation
IEEE International Conference on Computer Vision (ICCV), 2023
Xin Lai
Yuhui Yuan
Ruihang Chu
Yukang Chen
Han Hu
Jiaya Jia
MedIm
ISeg
3DPC
223
44
0
04 Sep 2023
RevColV2: Exploring Disentangled Representations in Masked Image Modeling
Neural Information Processing Systems (NeurIPS), 2023
Qi Han
Yuxuan Cai
Xiangyu Zhang
251
13
0
02 Sep 2023
Computation-efficient Deep Learning for Computer Vision: A Survey
Yulin Wang
Yizeng Han
Chaofei Wang
Shiji Song
Qi Tian
Gao Huang
VLM
254
30
0
27 Aug 2023
Vision Transformer Adapters for Generalizable Multitask Learning
IEEE International Conference on Computer Vision (ICCV), 2023
Deblina Bhattacharjee
Sabine Süsstrunk
Mathieu Salzmann
ViT
147
15
0
23 Aug 2023
SG-Former: Self-guided Transformer with Evolving Token Reallocation
IEEE International Conference on Computer Vision (ICCV), 2023
Sucheng Ren
Xingyi Yang
Songhua Liu
Xinchao Wang
ViT
224
61
0
23 Aug 2023
SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation
IEEE International Conference on Computer Vision (ICCV), 2023
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Dong Hwan Kim
MoE
166
28
0
22 Aug 2023
Revisiting Vision Transformer from the View of Path Ensemble
IEEE International Conference on Computer Vision (ICCV), 2023
Shuning Chang
Pichao Wang
Haowen Luo
Fan Wang
Mike Zheng Shou
ViT
122
7
0
12 Aug 2023
DiT: Efficient Vision Transformers with Dynamic Token Routing
Yuchen Ma
Zhengcong Fei
Junshi Huang
ViT
173
2
0
07 Aug 2023
PVG: Progressive Vision Graph for Vision Recognition
ACM Multimedia (ACM MM), 2023
Jiafu Wu
Jian Li
Jiangning Zhang
Boshen Zhang
M. Chi
Yabiao Wang
Chengjie Wang
ViT
256
20
0
01 Aug 2023
When Multi-Task Learning Meets Partial Supervision: A Computer Vision Review
Proceedings of the IEEE (Proc. IEEE), 2023
Maxime Fontana
Michael W. Spratling
Miaojing Shi
208
9
0
25 Jul 2023
Scale-Aware Modulation Meet Transformer
IEEE International Conference on Computer Vision (ICCV), 2023
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
242
119
0
17 Jul 2023
Adaptive Window Pruning for Efficient Local Motion Deblurring
International Conference on Learning Representations (ICLR), 2023
Haoying Li
Jixin Zhao
Shangchen Zhou
H. Feng
Chongyi Li
Chen Change Loy
ViT
242
9
0
25 Jun 2023
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Hanrong Ye
Dan Xu
ViT
201
23
0
08 Jun 2023
Collect-and-Distribute Transformer for 3D Point Cloud Analysis
Haibo Qiu
Baosheng Yu
Dacheng Tao
3DPC
ViT
221
8
0
02 Jun 2023