Focal Self-attention for Local-Global Interactions in Vision Transformers

1 July 2021
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
    ViT
arXiv: 2107.00641

Papers citing "Focal Self-attention for Local-Global Interactions in Vision Transformers"

50 / 262 papers shown
Revisiting Noise Resilience Strategies in Gesture Recognition: Short-Term Enhancement in Surface Electromyographic Signal Analysis
Weiyu Guo
Ziyue Qiao
Ying Sun
Hui Xiong
83
1
0
17 Apr 2024
LUCF-Net: Lightweight U-shaped Cascade Fusion Network for Medical Image Segmentation
Songkai Sun
Qingshan She
Yuliang Ma
Rihui Li
Yingchun Zhang
MedIm
162
4
0
11 Apr 2024
Learning Correlation Structures for Vision Transformers
Manjin Kim
Paul Hongsuck Seo
Cordelia Schmid
Minsu Cho
ViT
237
25
0
05 Apr 2024
SpiralMLP: A Lightweight Vision MLP Architecture
Haojie Mu
Burhan Ul Tayyab
Nicholas Chua
184
1
0
31 Mar 2024
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
Donghyun Kim
Byeongho Heo
Dongyoon Han
224
31
0
28 Mar 2024
Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and Time-Series Analysis
Badri N. Patro
Suhas Ranganath
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
211
4
0
26 Mar 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
264
159
0
26 Mar 2024
ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions
Computer Vision and Pattern Recognition (CVPR), 2024
Chunlong Xia
Xinliang Wang
Feng Lv
Xin Hao
Yifeng Shi
ViT
266
113
0
12 Mar 2024
ACC-ViT : Atrous Convolution's Comeback in Vision Transformers
Nabil Ibtehaz
Ning Yan
Masood S. Mortazavi
Daisuke Kihara
ViT
290
5
0
07 Mar 2024
Interactive Multi-Head Self-Attention with Linear Complexity
Hankyul Kang
Ming-Hsuan Yang
Jongbin Ryu
155
3
0
27 Feb 2024
Multi-Human Mesh Recovery with Transformers
Zeyu Wang
Zhenzhen Weng
Serena Yeung-Levy
3DH
117
1
0
26 Feb 2024
ToDo: Token Downsampling for Efficient Generation of High-Resolution Images
Ethan Smith
Nayan Saxena
Aninda Saha
DiffM
128
7
0
21 Feb 2024
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
378
60
0
05 Feb 2024
Do deep neural networks utilize the weight space efficiently?
Onur Can Koyun
B. U. Toreyin
110
0
0
26 Jan 2024
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
International Conference on Machine Learning (ICML), 2024
Lianghui Zhu
Bencheng Liao
Qian Zhang
Xinlong Wang
Wenyu Liu
Xinggang Wang
Mamba
327
1,282
0
17 Jan 2024
SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting
International Journal of Computer Vision (IJCV), 2024
Mingxin Huang
Dezhi Peng
Hongliang Li
Zhenghao Peng
Chongyu Liu
Dahua Lin
Yuliang Liu
Xiang Bai
Lianwen Jin
311
6
0
15 Jan 2024
Self-Attention and Hybrid Features for Replay and Deep-Fake Audio Detection
Lian Huang
Chi-Man Pun
122
10
0
11 Jan 2024
Deformable Audio Transformer for Audio Event Detection
Wentao Zhu
151
0
0
24 Dec 2023
Agent Attention: On the Integration of Softmax and Linear Attention
European Conference on Computer Vision (ECCV), 2023
Dongchen Han
Tianzhu Ye
Yizeng Han
Zhuofan Xia
Siyuan Pan
Pengfei Wan
Shiji Song
Gao Huang
291
173
0
14 Dec 2023
The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel Size might be All You Need
Tianjin Huang
Tianlong Chen
Zinan Lin
Shiwei Liu
246
1
0
09 Dec 2023
DocBinFormer: A Two-Level Transformer Network for Effective Document Image Binarization
Risab Biswas
Swalpa Kumar Roy
Ning Wang
Umapada Pal
Guang-Bin Huang
ViT
127
3
0
06 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
696
1
0
01 Dec 2023
GeoDeformer: Geometric Deformable Transformer for Action Recognition
Jinhui Ye
Jiaming Zhou
Hui Xiong
Junwei Liang
ViT
87
1
0
29 Nov 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2023
Dai Shi
ViT
224
233
0
28 Nov 2023
Advancing Vision Transformers with Group-Mix Attention
Chongjian Ge
Xiaohan Ding
Zhan Tong
Lichao Sun
Jiangliu Wang
Yibing Song
Ping Luo
302
30
0
26 Nov 2023
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Computer Vision and Pattern Recognition (CVPR), 2023
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
281
347
0
10 Nov 2023
Vision Big Bird: Random Sparsification for Full Attention
Zhemin Zhang
Xun Gong
ViT
126
1
0
10 Nov 2023
Scattering Vision Transformer: Spectral Mixing Matters
Neural Information Processing Systems (NeurIPS), 2023
Badri N. Patro
Vijay Srinivas Agneeswaran
334
27
0
02 Nov 2023
VST++: Efficient and Stronger Visual Saliency Transformer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Nian Liu
Ziyang Luo
Ni Zhang
Junwei Han
ViT
187
40
0
18 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
IEEE Transactions on Cybernetics (IEEE Trans. Cybern.), 2023
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
321
8
0
10 Oct 2023
Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2023
Shiyue Cao
Yueqin Yin
Lianghua Huang
Yu Liu
Xin Zhao
Deli Zhao
Kaiqi Huang
ViT
212
28
0
09 Oct 2023
Single Stage Warped Cloth Learning and Semantic-Contextual Attention Feature Fusion for Virtual TryOn
IEEE International Conference on Multimedia and Expo (ICME), 2023
Sanhita Pathak
V. Kaushik
Brejesh Lall
DiffM
177
3
0
08 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
417
10
0
08 Oct 2023
Speech Audio Synthesis from Tagged MRI and Non-Negative Matrix Factorization via Plastic Transformer
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2023
Xiaofeng Liu
Fangxu Xing
Maureen Stone
Jiachen Zhuo
S. Fels
Jerry L. Prince
Xiaofeng Liu
Jonghye Woo
MedIm
102
3
0
26 Sep 2023
CINFormer: Transformer network with multi-stage CNN feature injection for surface defect segmentation
Xiaoheng Jiang
Kaiyi Guo
Yang Lu
Feng Yan
Hao Liu
Jiale Cao
Mingliang Xu
Dacheng Tao
MedIm
ViT
UQCV
152
2
0
22 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2023
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
497
159
0
20 Sep 2023
Mask-Attention-Free Transformer for 3D Instance Segmentation
IEEE International Conference on Computer Vision (ICCV), 2023
Xin Lai
Yuhui Yuan
Ruihang Chu
Yukang Chen
Han Hu
Jiaya Jia
MedIm
ISeg
3DPC
223
44
0
04 Sep 2023
RevColV2: Exploring Disentangled Representations in Masked Image Modeling
Neural Information Processing Systems (NeurIPS), 2023
Qi Han
Yuxuan Cai
Xiangyu Zhang
251
13
0
02 Sep 2023
Computation-efficient Deep Learning for Computer Vision: A Survey
Yulin Wang
Yizeng Han
Chaofei Wang
Shiji Song
Qi Tian
Gao Huang
VLM
254
30
0
27 Aug 2023
Vision Transformer Adapters for Generalizable Multitask Learning
IEEE International Conference on Computer Vision (ICCV), 2023
Deblina Bhattacharjee
Sabine Süsstrunk
Mathieu Salzmann
ViT
147
15
0
23 Aug 2023
SG-Former: Self-guided Transformer with Evolving Token Reallocation
IEEE International Conference on Computer Vision (ICCV), 2023
Sucheng Ren
Xingyi Yang
Songhua Liu
Xinchao Wang
ViT
224
61
0
23 Aug 2023
SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation
IEEE International Conference on Computer Vision (ICCV), 2023
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Dong Hwan Kim
MoE
166
28
0
22 Aug 2023
Revisiting Vision Transformer from the View of Path Ensemble
IEEE International Conference on Computer Vision (ICCV), 2023
Shuning Chang
Pichao Wang
Haowen Luo
Fan Wang
Mike Zheng Shou
ViT
122
7
0
12 Aug 2023
DiT: Efficient Vision Transformers with Dynamic Token Routing
Yuchen Ma
Zhengcong Fei
Junshi Huang
ViT
173
2
0
07 Aug 2023
PVG: Progressive Vision Graph for Vision Recognition
ACM Multimedia (ACM MM), 2023
Jiafu Wu
Jian Li
Jiangning Zhang
Boshen Zhang
M. Chi
Yabiao Wang
Chengjie Wang
ViT
256
20
0
01 Aug 2023
When Multi-Task Learning Meets Partial Supervision: A Computer Vision Review
Proceedings of the IEEE (Proc. IEEE), 2023
Maxime Fontana
Michael W. Spratling
Miaojing Shi
208
9
0
25 Jul 2023
Scale-Aware Modulation Meet Transformer
IEEE International Conference on Computer Vision (ICCV), 2023
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
242
119
0
17 Jul 2023
Adaptive Window Pruning for Efficient Local Motion Deblurring
International Conference on Learning Representations (ICLR), 2023
Haoying Li
Jixin Zhao
Shangchen Zhou
H. Feng
Chongyi Li
Chen Change Loy
ViT
242
9
0
25 Jun 2023
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Hanrong Ye
Dan Xu
ViT
201
23
0
08 Jun 2023
Collect-and-Distribute Transformer for 3D Point Cloud Analysis
Haibo Qiu
Baosheng Yu
Dacheng Tao
3DPC
ViT
221
8
0
02 Jun 2023