ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2107.00641
  4. Cited By
Focal Self-attention for Local-Global Interactions in Vision
  Transformers

Focal Self-attention for Local-Global Interactions in Vision Transformers

1 July 2021
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
    ViT
ArXivPDFHTML

Papers citing "Focal Self-attention for Local-Global Interactions in Vision Transformers"

50 / 259 papers shown
Title
CenterFormer: Center-based Transformer for 3D Object Detection
CenterFormer: Center-based Transformer for 3D Object Detection
Zixiang Zhou
Xian Zhao
Yu Wang
Panqu Wang
H. Foroosh
3DPC
ViT
8
134
0
12 Sep 2022
MAFormer: A Transformer Network with Multi-scale Attention Fusion for
  Visual Recognition
MAFormer: A Transformer Network with Multi-scale Attention Fusion for Visual Recognition
Y. Wang
H. Sun
Xiaodi Wang
Bin Zhang
Chaonan Li
Ying Xin
Baochang Zhang
Errui Ding
Shumin Han
ViT
23
9
0
31 Aug 2022
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image
  Pretraining
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong
Jianmin Bao
Yinglin Zheng
Ting Zhang
Dongdong Chen
...
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
CLIP
VLM
32
157
0
25 Aug 2022
Efficient Attention-free Video Shift Transformers
Efficient Attention-free Video Shift Transformers
Adrian Bulat
Brais Martínez
Georgios Tzimiropoulos
ViT
27
1
0
23 Aug 2022
In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze
  Estimation
In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation
Bolin Lai
Miao Liu
Fiona Ryan
James M. Rehg
ViT
30
32
0
08 Aug 2022
TransMatting: Enhancing Transparent Objects Matting with Transformers
TransMatting: Enhancing Transparent Objects Matting with Transformers
Huanqia Cai
Fanglei Xue
Lele Xu
Lili Guo
ViT
11
20
0
05 Aug 2022
TransPillars: Coarse-to-Fine Aggregation for Multi-Frame 3D Object
  Detection
TransPillars: Coarse-to-Fine Aggregation for Multi-Frame 3D Object Detection
Zhipeng Luo
Gongjie Zhang
Changqing Zhou
Ti Liu
Shijian Lu
Liang Pan
3DPC
ViT
48
9
0
04 Aug 2022
giMLPs: Gate with Inhibition Mechanism in MLPs
Cheng Kang
Jindich Prokop
Lei Tong
Huiyu Zhou
Yong Hu
Daneil Novak
14
0
0
01 Aug 2022
Global-Local Self-Distillation for Visual Representation Learning
Global-Local Self-Distillation for Visual Representation Learning
Tim Lebailly
Tinne Tuytelaars
SSL
30
6
0
29 Jul 2022
COVID-19 Detection from Respiratory Sounds with Hierarchical Spectrogram
  Transformers
COVID-19 Detection from Respiratory Sounds with Hierarchical Spectrogram Transformers
Idil Aytekin
Onat Dalmaz
Kaan Gonc
H. Ankishan
E. Saritas
Ulas Bagci
H. Celik
Tolga Çukur
16
12
0
19 Jul 2022
Earthformer: Exploring Space-Time Transformers for Earth System
  Forecasting
Earthformer: Exploring Space-Time Transformers for Earth System Forecasting
Zhihan Gao
Xingjian Shi
Hao Wang
Yi Zhu
Yuyang Wang
Mu Li
Dit-Yan Yeung
AI4TS
31
145
0
12 Jul 2022
LightViT: Towards Light-Weight Convolution-Free Vision Transformers
LightViT: Towards Light-Weight Convolution-Free Vision Transformers
Tao Huang
Lang Huang
Shan You
Fei Wang
Chao Qian
Chang Xu
ViT
17
55
0
12 Jul 2022
Compound Prototype Matching for Few-shot Action Recognition
Compound Prototype Matching for Few-shot Action Recognition
Yifei Huang
Lijin Yang
Yoichi Sato
14
43
0
12 Jul 2022
Wave-ViT: Unifying Wavelet and Transformers for Visual Representation
  Learning
Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning
Ting Yao
Yingwei Pan
Yehao Li
Chong-Wah Ngo
Tao Mei
ViT
146
136
0
11 Jul 2022
Dual Vision Transformer
Dual Vision Transformer
Ting Yao
Yehao Li
Yingwei Pan
Yu Wang
Xiaoping Zhang
Tao Mei
ViT
141
75
0
11 Jul 2022
Self-attention on Multi-Shifted Windows for Scene Segmentation
Self-attention on Multi-Shifted Windows for Scene Segmentation
Litao Yu
Zhibin Li
Jian Andrew Zhang
Qiang Wu
SSeg
11
1
0
10 Jul 2022
CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse
  Transformers
CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers
Runsheng Xu
Zhengzhong Tu
Hao Xiang
Wei Shao
Bolei Zhou
Jiaqi Ma
28
216
0
05 Jul 2022
Improving Semantic Segmentation in Transformers using Hierarchical
  Inter-Level Attention
Improving Semantic Segmentation in Transformers using Hierarchical Inter-Level Attention
Gary Leung
Jun Gao
Xiaohui Zeng
Sanja Fidler
13
3
0
05 Jul 2022
Polarized Color Image Denoising using Pocoformer
Zhuoxiao Li
Hai-bo Jiang
Yinqiang Zheng
24
3
0
01 Jul 2022
Rethinking Query-Key Pairwise Interactions in Vision Transformers
Rethinking Query-Key Pairwise Interactions in Vision Transformers
Cheng-rong Li
Yangxin Liu
21
0
0
01 Jul 2022
Deformable Graph Transformer
Deformable Graph Transformer
Jinyoung Park
Seongjun Yun
Hyeon-ju Park
Jaewoo Kang
Jisu Jeong
KyungHyun Kim
Jung-Woo Ha
Hyunwoo J. Kim
82
7
0
29 Jun 2022
LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs
LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs
Yukang Chen
Jianhui Liu
X. Zhang
Xiaojuan Qi
Jiaya Jia
39
85
0
21 Jun 2022
Vicinity Vision Transformer
Vicinity Vision Transformer
Weixuan Sun
Zhen Qin
Huiyuan Deng
Jianyuan Wang
Yi Zhang
Kaihao Zhang
Nick Barnes
Stan Birchfield
Lingpeng Kong
Yiran Zhong
ViT
34
31
0
21 Jun 2022
Learning Multiscale Transformer Models for Sequence Generation
Learning Multiscale Transformer Models for Sequence Generation
Bei Li
Tong Zheng
Yi Jing
Chengbo Jiao
Tong Xiao
Jingbo Zhu
24
9
0
19 Jun 2022
Efficient Decoder-free Object Detection with Transformers
Efficient Decoder-free Object Detection with Transformers
Peixian Chen
Mengdan Zhang
Yunhang Shen
Kekai Sheng
Yuting Gao
Xing Sun
Ke Li
Chunhua Shen
ViT
34
16
0
14 Jun 2022
Peripheral Vision Transformer
Peripheral Vision Transformer
Juhong Min
Yucheng Zhao
Chong Luo
Minsu Cho
ViT
MDE
22
30
0
14 Jun 2022
Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction
Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction
Jun Chen
Ming Hu
Boyang Albert Li
Mohamed Elhoseiny
32
36
0
01 Jun 2022
Self-Supervised Pre-training of Vision Transformers for Dense Prediction
  Tasks
Self-Supervised Pre-training of Vision Transformers for Dense Prediction Tasks
Jaonary Rabarisoa
Velentin Belissen
Florian Chabot
Q. C. Pham
VLM
ViT
SSL
MDE
8
2
0
30 May 2022
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang
Yunjie Tian
Wei Huang
QiXiang Ye
Qi Dai
Lingxi Xie
Qi Tian
52
26
0
30 May 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing
  Mechanisms in Sequence Learning
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh B. Gundavarapu
Alex Lamb
Nan Rosemary Ke
Yoshua Bengio
AI4CE
110
17
0
30 May 2022
Fast Vision Transformers with HiLo Attention
Fast Vision Transformers with HiLo Attention
Zizheng Pan
Jianfei Cai
Bohan Zhuang
28
151
0
26 May 2022
Inception Transformer
Inception Transformer
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
22
187
0
25 May 2022
ASSET: Autoregressive Semantic Scene Editing with Transformers at High
  Resolutions
ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions
Difan Liu
Sandesh Shetty
Tobias Hinz
Matthew Fisher
Richard Y. Zhang
Taesung Park
E. Kalogerakis
ViT
21
29
0
24 May 2022
BolT: Fused Window Transformers for fMRI Time Series Analysis
BolT: Fused Window Transformers for fMRI Time Series Analysis
H. Bedel
Irmak Sivgin
Onat Dalmaz
S. Dar
Tolga Çukur
51
54
0
23 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision
  Transformers with Locality
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
95
73
0
20 May 2022
Vision Transformer Adapter for Dense Predictions
Vision Transformer Adapter for Dense Predictions
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
36
540
0
17 May 2022
MulT: An End-to-End Multitask Learning Transformer
MulT: An End-to-End Multitask Learning Transformer
Deblina Bhattacharjee
Tong Zhang
Sabine Süsstrunk
Mathieu Salzmann
ViT
29
62
0
17 May 2022
Transformers in 3D Point Clouds: A Survey
Transformers in 3D Point Clouds: A Survey
Dening Lu
Qian Xie
Mingqiang Wei
Kyle Gao
Linlin Xu
Jonathan Li
3DPC
ViT
30
49
0
16 May 2022
Transformer Scale Gate for Semantic Segmentation
Transformer Scale Gate for Semantic Segmentation
Hengcan Shi
Munawar Hayat
Jianfei Cai
ViT
25
22
0
14 May 2022
Adaptive Split-Fusion Transformer
Adaptive Split-Fusion Transformer
Zixuan Su
Hao Zhang
Jingjing Chen
Lei Pang
Chong-Wah Ngo
Yu-Gang Jiang
ViT
11
7
0
26 Apr 2022
OutfitTransformer: Learning Outfit Representations for Fashion
  Recommendation
OutfitTransformer: Learning Outfit Representations for Fashion Recommendation
Rohan Sarkar
Navaneeth Bodla
Mariya I. Vasileva
Yen-Liang Lin
Anu Beniwal
Alan Lu
Gérard Medioni
8
35
0
11 Apr 2022
DaViT: Dual Attention Vision Transformers
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
25
239
0
07 Apr 2022
Learning Local and Global Temporal Contexts for Video Semantic
  Segmentation
Learning Local and Global Temporal Contexts for Video Semantic Segmentation
Guolei Sun
Yun Liu
Henghui Ding
Min Wu
Luc Van Gool
25
32
0
07 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for
  Object Detection
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
11
55
0
06 Apr 2022
MixFormer: Mixing Features across Windows and Dimensions
MixFormer: Mixing Features across Windows and Dimensions
Qiang Chen
Qiman Wu
Jian Wang
Qinghao Hu
T. Hu
Errui Ding
Jian Cheng
Jingdong Wang
MDE
ViT
8
101
0
06 Apr 2022
MaxViT: Multi-Axis Vision Transformer
MaxViT: Multi-Axis Vision Transformer
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
43
632
0
04 Apr 2022
Dynamic Focus-aware Positional Queries for Semantic Segmentation
Dynamic Focus-aware Positional Queries for Semantic Segmentation
Haoyu He
Jianfei Cai
Zizheng Pan
Jing Liu
Jing Zhang
Dacheng Tao
Bohan Zhuang
29
16
0
04 Apr 2022
COOL, a Context Outlooker, and its Application to Question Answering and
  other Natural Language Processing Tasks
COOL, a Context Outlooker, and its Application to Question Answering and other Natural Language Processing Tasks
Fangyi Zhu
See-Kiong Ng
S. Bressan
LRM
14
1
0
01 Apr 2022
MatteFormer: Transformer-Based Image Matting via Prior-Tokens
MatteFormer: Transformer-Based Image Matting via Prior-Tokens
Gyutae Park
S. Son
Jaeyoung Yoo
Seho Kim
Nojun Kwak
ViT
17
65
0
29 Mar 2022
Stratified Transformer for 3D Point Cloud Segmentation
Stratified Transformer for 3D Point Cloud Segmentation
Xin Lai
Jianhui Liu
Li Jiang
Liwei Wang
Hengshuang Zhao
Shu-Lin Liu
Xiaojuan Qi
Jiaya Jia
3DPC
ViT
24
257
0
28 Mar 2022
Previous
123456
Next