Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

IEEE International Conference on Computer Vision (ICCV), 2021

25 March 2021

Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

50 / 8,530 papers shown

Understanding What Is Not Said: Referring Remote Sensing Image Segmentation with Scarce Expressions

26 Oct 2025

Simplifying Knowledge Transfer in Pretrained Models

Siddharth Jain

Shyamgopal Karthik

Vineet Gandhi

163

25 Oct 2025

Diffusion-Driven Two-Stage Active Learning for Low-Budget Semantic Segmentation

150

25 Oct 2025

Efficient Large-Deformation Medical Image Registration via Recurrent Dynamic Correlation

137

25 Oct 2025

Enpowering Your Pansharpening Models with Generalizability: Unified Distribution is All You Need

140

25 Oct 2025

FrameShield: Adversarially Robust Video Anomaly Detection

Mojtaba Nafez

Mobina Poulaei

Nikan Vasei

Bardia Soltani Moakhar

Mohammad Sabokrou

M. Rohban

AAML

176

24 Oct 2025

S3OD: Towards Generalizable Salient Object Detection with Synthetic Data

Orest Kupyn

Hirokatsu Kataoka

Christian Rupprecht

128

24 Oct 2025

Relieving the Over-Aggregating Effect in Graph Transformers

146

24 Oct 2025

MAGIC-Flow: Multiscale Adaptive Conditional Flows for Generation and Interpretable Classification

188

24 Oct 2025

Controllable-LPMoE: Adapting to Challenging Object Segmentation via Dynamic Local Priors from Mixture-of-Experts

123

24 Oct 2025

AutoOpt: A Dataset and a Unified Framework for Automating Optimization Problem Solving

Ankur Sinha

Shobhit Arora

Dhaval Pujara

129

24 Oct 2025

WaveSeg: Enhancing Segmentation Precision via High-Frequency Prior and Mamba-Driven Spectrum Decomposition

222

24 Oct 2025

LLMComp: A Language Modeling Paradigm for Error-Bounded Scientific Data Compression (Technical Report)

152

24 Oct 2025

Dynamic Semantic-Aware Correlation Modeling for UAV Tracking

24 Oct 2025

Efficient Multi-bit Quantization Network Training via Weight Bias Correction and Bit-wise Coreset Sampling

202

23 Oct 2025

Memory Constrained Dynamic Subnetwork Update for Transfer Learning

23 Oct 2025

Attentive Convolution: Unifying the Expressivity of Self-Attention with Convolutional Efficiency

156

23 Oct 2025

GranViT: A Fine-Grained Vision Model With Autoregressive Perception For MLLMs

...

163

23 Oct 2025

SutureBot: A Precision Framework & Benchmark For Autonomous End-to-End Suturing

182

23 Oct 2025

DARE: A Deformable Adaptive Regularization Estimator for Learning-Based Medical Image Registration

187

22 Oct 2025

SFGFusion: Surface Fitting Guided 3D Object Detection with 4D Radar and Camera Fusion

198

22 Oct 2025

Seabed-Net: A multi-task network for joint bathymetry estimation and seabed classification from remote sensing imagery in shallow waters

P. Agrafiotis

Tim Siebert

120

22 Oct 2025

Study of Training Dynamics for Memory-Constrained Fine-Tuning

106

22 Oct 2025

Guiding diffusion models to reconstruct flow fields from sparse data

213

22 Oct 2025

Matrix-Free Least Squares Solvers: Values, Gradients, and What to Do With Them

Hrittik Roy

Søren Hauberg

Nicholas Krämer

155

22 Oct 2025

FutrTrack: A Camera-LiDAR Fusion Transformer for 3D Multiple Object Tracking

Martha Teiko Teye

Ori Maoz

Matthias Rottmann

206

22 Oct 2025

AegisRF: Adversarial Perturbations Guided with Sensitivity for Protecting Intellectual Property of Neural Radiance Fields

166

22 Oct 2025

ProLAP: Probabilistic Language-Audio Pre-Training

139

21 Oct 2025

Integrated representational signatures strengthen specificity in brains and models

21 Oct 2025

Detection and Simulation of Urban Heat Islands Using a Fine-Tuned Geospatial Foundation Model for Microclimate Impact Prediction

Jannis Fleckenstein

David Kreismann

Tamara Rosemary Govindasamy

21 Oct 2025

A Renaissance of Explicit Motion Information Mining from Transformers for Action Recognition

213

21 Oct 2025

Learning Task-Agnostic Representations through Multi-Teacher Distillation

Philippe Formont

Maxime Darrin

Banafsheh Karimian

Jackie Chi Kit Cheung

165

21 Oct 2025

Δt-Mamba3D: A Time-Aware Spatio-Temporal State-Space Model for Breast Cancer Risk Prediction

198

21 Oct 2025

Binary Quadratic Quantization: Beyond First-Order Quantization for Real-Valued Matrix Compression

225

21 Oct 2025

ScaleNet: Scaling up Pretrained Neural Networks with Incremental Parameters

234

21 Oct 2025

UltraGen: High-Resolution Video Generation with Hierarchical Attention

210

21 Oct 2025

SOLE: Hardware-Software Co-design of Softmax and LayerNorm for Efficient Transformer Inference

138

20 Oct 2025

Accelerating Vision Transformers with Adaptive Patch Sizes

123

20 Oct 2025

Rethinking PCA Through Duality

Jan Quan

Johan A. K. Suykens

Panagiotis Patrinos

100

20 Oct 2025

M2H: Multi-Task Learning with Efficient Window-Based Cross-Task Attention for Monocular Spatial Perception

U.V.B.L Udugama

G. Vosselman

F. Nex

137

20 Oct 2025

ZACH-ViT: A Zero-Token Vision Transformer with ShuffleStrides Data Augmentation for Robust Lung Ultrasound Classification

Athanasios Angelakis

Amne Mousa

Micah L. A. Heldeweg

Laurens A. Biesheuvel

20 Oct 2025

Facial Expression-based Parkinson's Disease Severity Diagnosis via Feature Fusion and Adaptive Class Balancing

222

20 Oct 2025

ArmFormer: Lightweight Transformer Architecture for Real-Time Multi-Class Weapon Segmentation and Classification

161

19 Oct 2025

UKANFormer: Noise-Robust Semantic Segmentation for Coral Reef Mapping via a Kolmogorov-Arnold Network-Transformer Hybrid

191

19 Oct 2025

Beyond RGB: Leveraging Vision Transformers for Thermal Weapon Segmentation

Akhila Kambhatla

Ahmed R Khaled

ViT

19 Oct 2025

ReefNet: A Large scale, Taxonomically Enriched Dataset and Benchmark for Hard Coral Classification

148

19 Oct 2025

BARL: Bilateral Alignment in Representation and Label Spaces for Semi-Supervised Volumetric Medical Image Segmentation

Shujian Gao

Y Samuel Wang

Zekuan Yu

117

19 Oct 2025

Efficient High-Accuracy PDEs Solver with the Linear Attention Neural Operator

Ming Zhong

Zhenya Yan

AI4CE

120

19 Oct 2025

Res-Bench: Benchmarking the Robustness of Multimodal Large Language Models to Dynamic Resolution Input

205

19 Oct 2025

EDVD-LLaMA: Explainable Deepfake Video Detection via Multimodal Large Language Model Reasoning

123

18 Oct 2025