Focal Self-attention for Local-Global Interactions in Vision Transformers
arXiv:2107.00641

1 July 2021
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
    ViT

Papers citing "Focal Self-attention for Local-Global Interactions in Vision Transformers"

50 / 263 papers shown
Title
Collect-and-Distribute Transformer for 3D Point Cloud Analysis
Haibo Qiu
Baosheng Yu
Dacheng Tao
3DPC
ViT
241
8
0
02 Jun 2023
Lightweight Vision Transformer with Bidirectional Interaction
Neural Information Processing Systems (NeurIPS), 2023
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Xiao-Yu Zhang
ViT
419
39
0
01 Jun 2023
Dual Path Transformer with Partition Attention
Zhengkai Jiang
Liang Liu
Jiangning Zhang
Yabiao Wang
Mingang Chen
Chengjie Wang
ViT
216
2
0
24 May 2023
OctFormer: Octree-based Transformers for 3D Point Clouds
ACM Transactions on Graphics (TOG), 2023
Peng-Shuai Wang
ViT
3DPC
259
134
0
04 May 2023
AxWin Transformer: A Context-Aware Vision Transformer Backbone with Axial Windows
Fangjian Lin
Yizhe Ma
Sitong Wu
Long Yu
Sheng Tian
ViT
102
6
0
02 May 2023
UniNeXt: Exploring A Unified Architecture for Vision Recognition
ACM Multimedia (ACM MM), 2023
Fangjian Lin
Jianlong Yuan
Sitong Wu
Fan Wang
Zhibin Wang
ViT
297
19
0
26 Apr 2023
MixPro: Data Augmentation with MaskMix and Progressive Attention Labeling for Vision Transformer
International Conference on Learning Representations (ICLR), 2023
QiHao Zhao
Yangyu Huang
Wei Hu
Fan Zhang
Jing Liu
ViT
135
19
0
24 Apr 2023
MMDR: A Result Feature Fusion Object Detection Approach for Autonomous System
Wendong Zhang
107
0
0
19 Apr 2023
Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding
Computational Visual Media (CVM), 2023
Yu-Qi Yang
Yu-Xiao Guo
Jiangfeng Xiong
Yang Liu
Hao Pan
Peng-Shuai Wang
Xin Tong
B. Guo
ViT
276
141
0
14 Apr 2023
SpectFormer: Frequency and Attention is what you need in a Vision Transformer
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Badri N. Patro
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
ViT
164
91
0
13 Apr 2023
RSIR Transformer: Hierarchical Vision Transformer using Random Sampling Windows and Important Region Windows
Zhemin Zhang
Xun Gong
ViT
118
1
0
13 Apr 2023
PlantDet: A benchmark for Plant Detection in the Three-Rivers-Source Region
International Conference on Artificial Neural Networks (ICANN), 2023
Huanhuan Li
Xuechao Zou
Yu-an Zhang
Jiangcai Zhaba
Guomei Li
Lamao Yongga
226
0
0
11 Apr 2023
MC-MLP:Multiple Coordinate Frames in all-MLP Architecture for Vision
Zhimin Zhu
Jianguo Zhao
Tong Mu
Yuliang Yang
Mengyu Zhu
148
0
0
08 Apr 2023
Towards an Effective and Efficient Transformer for Rain-by-snow Weather Removal
Tao Gao
Yuanbo Wen
Kaihao Zhang
Peng Cheng
Ting Chen
ViT
240
6
0
06 Apr 2023
Vision Transformers with Mixed-Resolution Tokenization
Tomer Ronen
Omer Levy
A. Golbert
ViT
258
28
0
01 Apr 2023
Rethinking Local Perception in Lightweight Vision Transformer
Qi Fan
Huaibo Huang
Jiyang Guan
Xiao-Yu Zhang
ViT
318
48
0
31 Mar 2023
InceptionNeXt: When Inception Meets ConvNeXt
Computer Vision and Pattern Recognition (CVPR), 2023
Weihao Yu
Pan Zhou
Shuicheng Yan
Xinchao Wang
491
249
0
29 Mar 2023
Spherical Transformer for LiDAR-based 3D Recognition
Computer Vision and Pattern Recognition (CVPR), 2023
Xin Lai
Yukang Chen
Fanbin Lu
Jianhui Liu
Jiaya Jia
3DPC
210
198
0
22 Mar 2023
Robustifying Token Attention for Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2023
Yong Guo
David Stutz
Bernt Schiele
ViT
334
34
0
20 Mar 2023
Making Vision Transformers Efficient from A Token Sparsification View
Computer Vision and Pattern Recognition (CVPR), 2023
Shuning Chang
Pichao Wang
Ming Lin
Fan Wang
David Junhao Zhang
Rong Jin
Mike Zheng Shou
ViT
210
37
0
15 Mar 2023
Pretrained ViTs Yield Versatile Representations For Medical Images
Christos Matsoukas
Johan Fredin Haslum
Magnus P Soderberg
Kevin Smith
MedIm
ViT
265
16
0
13 Mar 2023
TransMatting: Tri-token Equipped Transformer Model for Image Matting
Huanqia Cai
Fanglei Xue
Lele Xu
Lili Guo
ViT
151
3
0
11 Mar 2023
Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey
Machine Intelligence Research (MIR), 2023
Tianlin Li
Guangyao Chen
Guangwu Qian
Pengcheng Gao
Xiaoyong Wei
Yaowei Wang
Yonghong Tian
Wen Gao
AI4CE
VLM
404
268
0
20 Feb 2023
Stitchable Neural Networks
Computer Vision and Pattern Recognition (CVPR), 2023
Zizheng Pan
Jianfei Cai
Bohan Zhuang
231
39
0
13 Feb 2023
CEDNet: A Cascade Encoder-Decoder Network for Dense Prediction
Pattern Recognition (Pattern Recogn.), 2023
Qiang Chen
Zi-Hua Li
Chufeng Tang
Jianmin Li
Xiaolin Hu
285
31
0
13 Feb 2023
Dual Memory Units with Uncertainty Regulation for Weakly Supervised Video Anomaly Detection
AAAI Conference on Artificial Intelligence (AAAI), 2023
Hang Zhou
Junqing Yu
Wei Yang
138
138
0
10 Feb 2023
DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition
IEEE Transactions on Multimedia (IEEE TMM), 2023
Jiayu Jiao
Yuyao Tang
Kun-Li Channing Lin
Yipeng Gao
Jinhua Ma
Yaowei Wang
Wei-Shi Zheng
MedIm
ViT
181
239
0
03 Feb 2023
Fairness-aware Vision Transformer via Debiased Self-Attention
European Conference on Computer Vision (ECCV), 2023
Yao Qiang
Chengyin Li
Prashant Khanduri
D. Zhu
ViT
240
10
0
31 Jan 2023
A Survey of Advanced Computer Vision Techniques for Sports
Tiago Mendes-Neves
Luís Meireles
João Mendes-Moreira
246
5
0
18 Jan 2023
FGAHOI: Fine-Grained Anchors for Human-Object Interaction Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Shuailei Ma
Yuefeng Wang
Shanze Wang
Ying-yu Wei
211
48
0
08 Jan 2023
Unsupervised 4D LiDAR Moving Object Segmentation in Stationary Settings with Multivariate Occupancy Time Series
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
T. Kreutz
M. Mühlhäuser
Alejandro Sánchez Guinea
245
15
0
30 Dec 2022
A Close Look at Spatial Modeling: From Attention to Convolution
Xu Ma
Huan Wang
Can Qin
Kunpeng Li
Xing Zhao
Jie Fu
Yun Fu
ViT
3DPC
142
13
0
23 Dec 2022
What Makes for Good Tokenizers in Vision Transformer?
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Shengju Qian
Yi Zhu
Wenbo Li
Mu Li
Jiaya Jia
ViT
187
17
0
21 Dec 2022
Focal-UNet: UNet-like Focal Modulation for Medical Image Segmentation
Mohammadreza Naderi
Mohammad H. Givkashi
F. Piri
N. Karimi
S. Samavi
ViT
MedIm
203
19
0
19 Dec 2022
Most Important Person-guided Dual-branch Cross-Patch Attention for Group Affect Recognition
IEEE International Conference on Computer Vision (ICCV), 2022
Hongxia Xie
Ming-Xian Lee
Tzu-Jui Chen
Hung-Jen Chen
Hou-I Liu
Hong-Han Shuai
Wen-Huang Cheng
CVBM
184
11
0
14 Dec 2022
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
International Conference on Learning Representations (ICLR), 2022
Chenhongyi Yang
Jiarui Xu
Shalini De Mello
Elliot J. Crowley
Xinyu Wang
ViT
242
31
0
13 Dec 2022
Mitigation of Spatial Nonstationarity with Vision Transformers
Computational Geosciences (Comput. Geosci.), 2022
Lei Liu
Javier E. Santos
Maša Prodanović
Michael J. Pyrcz
104
7
0
09 Dec 2022
Asymmetric Cross-Scale Alignment for Text-Based Person Search
IEEE Transactions on Multimedia (IEEE TMM), 2022
Zhong Ji
Junhua Hu
Deyin Liu
Yuan Wu
Ye Zhao
206
64
0
26 Nov 2022
Cross Aggregation Transformer for Image Restoration
Neural Information Processing Systems (NeurIPS), 2022
Zheng Chen
Yulun Zhang
Jinjin Gu
Yongbing Zhang
Lingyu Kong
X. Yuan
ViT
193
201
0
24 Nov 2022
Fast-iTPN: Integrally Pre-Trained Transformer Pyramid Network with Token Migration
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Yunjie Tian
Lingxi Xie
Jihao Qiu
Jianbin Jiao
Yaowei Wang
Qi Tian
Qixiang Ye
ViT
166
19
0
23 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Qibin Hou
Cheng Lu
Ming-Ming Cheng
Jiashi Feng
ViT
215
209
0
22 Nov 2022
N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution
Computer Vision and Pattern Recognition (CVPR), 2022
Haram Choi
Jeong-Sik Lee
Jihoon Yang
ViT
191
128
0
21 Nov 2022
Vision Transformer with Super Token Sampling
Huaibo Huang
Xiaoqiang Zhou
Jie Cao
Xiao-Yu Zhang
Tieniu Tan
ViT
254
96
0
21 Nov 2022
Token Transformer: Can class token help window-based transformer build better long-range interactions?
Jia-ju Mao
Yuan Chang
Xuesong Yin
165
0
0
11 Nov 2022
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Computer Vision and Pattern Recognition (CVPR), 2022
Wenhai Wang
Jifeng Dai
Zhe Chen
Zhenhang Huang
Zhiqi Li
...
Tong Lu
Lewei Lu
Jiaming Song
Xiaogang Wang
Yu Qiao
VLM
501
940
0
10 Nov 2022
ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention
International Symposium on High-Performance Computer Architecture (HPCA), 2022
Jyotikrishna Dass
Shang Wu
Huihong Shi
Chaojian Li
Zhifan Ye
Zhongfeng Wang
Yingyan Lin
168
75
0
09 Nov 2022
State-of-the-art Models for Object Detection in Various Fields of Application
S. A. G. Naqvi
Syed Shahnawaz Ali
ObjD
OOD
220
0
0
01 Nov 2022
ViT-LSLA: Vision Transformer with Light Self-Limited-Attention
Zhenzhe Hechen
Wei Huang
Yixin Zhao
ViT
111
9
0
31 Oct 2022
Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Yan Zhang
Xiyuan Gao
Qingyan Duan
Jiaxu Leng
Xiao Pu
Xinbo Gao
ViT
138
1
0
28 Oct 2022
Explicitly Increasing Input Information Density for Vision Transformers on Small Datasets
Xiangyu Chen
Ying Qin
Wenju Xu
A. Bur
Cuncong Zhong
Guanghui Wang
ViT
140
3
0
25 Oct 2022