Focal Self-attention for Local-Global Interactions in Vision Transformers

1 July 2021

Jianwei Yang

Lu Yuan

Papers citing "Focal Self-attention for Local-Global Interactions in Vision Transformers"

50 / 263 papers shown

ViT$^3$: Unlocking Test-Time Training in Vision

01 Dec 2025

Hilbert-Guided Block-Sparse Local Attention

Yunge Li

Lanyu Xu

102

08 Nov 2025

Attentive Convolution: Unifying the Expressivity of Self-Attention with Convolutional Efficiency

143

23 Oct 2025

Region-Aware Deformable Convolutions

Abolfazl Saheban Maleki

Maryam Imani

146

18 Sep 2025

Vision encoders should be image size agnostic and task driven

22 Aug 2025

Learning Spatial Decay for Vision Transformers

136

13 Aug 2025

RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization

147

13 Aug 2025

UniConvNet: Expanding Effective Receptive Field while Maintaining Asymptotically Gaussian Distribution for ConvNets of Any Scale

Yuhao Wang

Wei Xi

214

12 Aug 2025

Detection Transformers Under the Knife: A Neuroscience-Inspired Approach to Ablations

172

29 Jul 2025

Resolving Token-Space Gradient Conflicts: Token Space Manipulation for Transformer-Based Multi-Task Learning

Wooseong Jeong

Kuk-Jin Yoon

335

10 Jul 2025

AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer

Pattern Recognition Letters (Pattern Recogn. Lett.), 2025

747

22 May 2025

Image Recognition with Online Lightweight Vision Transformer: A Survey

...

1.2K

06 May 2025

Crafting Query-Aware Selective Attention for Single Image Super-Resolution

309

09 Apr 2025

DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation

Computer Vision and Pattern Recognition (CVPR), 2025

282

07 Apr 2025

Atlas: Multi-Scale Attention Improves Long Context Image Modeling

Kumar Krishna Agrawal

207

16 Mar 2025

DCAT: Dual Cross-Attention Fusion for Disease Classification in Radiological Images with Uncertainty Estimation

Jutika Borah

H. Singh

MedIm

362

14 Mar 2025

MaskAttn-UNet: A Mask Attention-Driven Framework for Universal Low-Resolution Image Segmentation

616

11 Mar 2025

Adjoint sharding for very long context training of state space models

220

03 Jan 2025

STARFormer: A Novel Spatio-Temporal Aggregation Reorganization Transformer of FMRI for Brain Disorder Diagnosis

Neural Networks (NN), 2024

277

31 Dec 2024

VMamba: Visual State Space Model

Neural Information Processing Systems (NeurIPS), 2024

1.1K

1,522

31 Dec 2024

Real Classification by Description: Extending CLIP's Limits of Part Attributes Recognition

365

18 Dec 2024

Bridging the Divide: Reconsidering Softmax and Linear Attention

Neural Information Processing Systems (NeurIPS), 2024

286

09 Dec 2024

Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

370

25 Nov 2024

Breaking the Low-Rank Dilemma of Linear Attention

Computer Vision and Pattern Recognition (CVPR), 2024

Qihang Fan

Huaibo Huang

Ran He

508

12 Nov 2024

Event-guided Low-light Video Semantic Segmentation

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

Zhen Yao

Mooi Choo Chuah

234

01 Nov 2024

COSNet: A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

248

31 Oct 2024

PixelGaussian: Generalizable 3D Gaussian Reconstruction from Arbitrary Views

271

24 Oct 2024

MoH: Multi-Head Attention as Mixture-of-Head Attention

International Conference on Machine Learning (ICML), 2024

413

15 Oct 2024

DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention

Asian Conference on Computer Vision (ACCV), 2024

221

11 Oct 2024

Improving Image Clustering with Artifacts Attenuation via Inference-Time Attention Engineering

Asian Conference on Computer Vision (ACCV), 2024

Yu-Chieh Lin

168

07 Oct 2024

CBAM-SwinT-BL: Small Rail Surface Defect Detection Method Based on Swin Transformer with Block Level CBAM Enhancement

IEEE Access, 2024

229

30 Sep 2024

Insight Any Instance: Promptable Instance Segmentation for Remote Sensing Images

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2024

Xuexue Li

VLM ISeg

267

11 Sep 2024

A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships

Gracile Astlin Pereira

Muhammad Hussain

ViT

256

27 Aug 2024

Efficient Visual Representation Learning with Heat Conduction Equation

International Joint Conference on Artificial Intelligence (IJCAI), 2024

Zhemin Zhang

Xun Gong

DiffM 3DV

287

12 Aug 2024

Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation

245

24 Jul 2024

SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds

267

16 Jul 2024

iiANET: Inception Inspired Attention Hybrid Network for efficient Long-Range Dependency

Haruna Yunusa

Qin Shiyin

Abdulrahman Hamman Adama Chukkol

Isah Bello

A. Lawan

285

10 Jul 2024

Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes

Qi Ma

Danda Pani Paudel

E. Konukoglu

Luc Van Gool

268

25 Jun 2024

Fusion of regional and sparse attention in Vision Transformers

224

13 Jun 2024

You Only Need Less Attention at Each Stage in Vision Transformers

289

01 Jun 2024

FocSAM: Delving Deeply into Focused Objects in Segmenting Anything

Liujuan Cao

Rongrong Ji

217

29 May 2024

ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention

308

28 May 2024

Demystify Mamba in Vision: A Linear Attention Perspective

Gao Huang

355

158

26 May 2024

Building Vision Models upon Heat Conduction

Yaowei Wang

274

26 May 2024

LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate

336

22 May 2024

Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens

370

22 May 2024

Vision Transformer with Sparse Scan Prior

386

22 May 2024

Towards Gradient-based Time-Series Explanations through a SpatioTemporal Attention Network

Min Hun Lee

AI4TS ViT FAtt

220

18 May 2024

Sparse Reconstruction of Optical Doppler Tomography with Alternative State Space Model and Attention

245

26 Apr 2024

Multi-Scale Representations by Varying Window Attention for Semantic Segmentation

Haotian Yan

Ming Wu

Chuang Zhang

296

25 Apr 2024