Swin Transformer V2: Scaling Up Capacity and Resolution

18 November 2021

Papers citing "Swin Transformer V2: Scaling Up Capacity and Resolution"

50 / 932 papers shown

Mahalanobis++: Improving OOD Detection via Feature Normalization

Maximilian Mueller

Matthias Hein

OODD

342

23 May 2025

RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

437

20 May 2025

EGFormer: Towards Efficient and Generalizable Multimodal Semantic Segmentation

195

20 May 2025

Mamba-Adaptor: State Space Model Adaptor for Visual Recognition

Computer Vision and Pattern Recognition (CVPR), 2025

362

19 May 2025

TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series

377

13 May 2025

SynID: Passport Synthetic Dataset for Presentation Attack Detection

Juan E. Tapia

Fabian Stockhardt

Lázaro J. González Soler

Christoph Busch

298

12 May 2025

Adapting a Segmentation Foundation Model for Medical Image Classification

179

09 May 2025

ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling

...

266

07 May 2025

Stow: Robotic Packing of Items into Fabric Pods

...

228

07 May 2025

Rethinking Boundary Detection in Deep Learning-Based Medical Image Segmentation

226

06 May 2025

SCOPE-MRI: Bankart Lesion Detection as a Case Study in Data Curation and Deep Learning for Challenging Diagnoses

215

29 Apr 2025

Prompt Guiding Multi-Scale Adaptive Sparse Representation-driven Network for Low-Dose CT MAR

275

28 Apr 2025

Examining the Impact of Optical Aberrations to Image Classification and Object Detection Models

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025

Patrick Müller

Alexander Braun

Margret Keuper

278

25 Apr 2025

High-Quality Cloud-Free Optical Image Synthesis Using Multi-Temporal SAR and Contaminated Optical Data

Chenxi Duan

237

23 Apr 2025

LOOPE: Learnable Optimal Patch Order in Positional Embeddings for Vision Transformers

M. Chowdhury

Md Rifat Ur Rahman

Akil Ahmad Taki

225

19 Apr 2025

BeetleVerse: A Study on Taxonomic Classification of Ground Beetles

186

18 Apr 2025

Learning from Noisy Pseudo-labels for All-Weather Land Cover Mapping

239

18 Apr 2025

Towards Scale-Aware Low-Light Enhancement via Structure-Guided Transformer Design

234

18 Apr 2025

Perception Encoder: The best visual embeddings are not at the output of the network

Daniel Bolya

Po-Yao (Bernie) Huang

...

Christoph Feichtenhofer

ObjD VOS

666

107

17 Apr 2025

NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results

...

319

17 Apr 2025

SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling

287

17 Apr 2025

Plain Transformers Can be Powerful Graph Learners

343

17 Apr 2025

Metric-Solver: Sliding Anchored Metric Depth Estimation from a Single Image

338

16 Apr 2025

Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images

Computer Vision and Pattern Recognition (CVPR), 2025

225

13 Apr 2025

SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification

328

13 Apr 2025

Hyperlocal disaster damage assessment using bi-temporal street-view imagery and pre-trained vision models

Computers, Environment and Urban Systems (CEUS), 2025

162

12 Apr 2025

Mixture of Group Experts for Learning Invariant Representations

359

12 Apr 2025

Heart Failure Prediction using Modal Decomposition and Masked Autoencoders for Scarce Echocardiography Databases

402

10 Apr 2025

Audio-visual Event Localization on Portrait Mode Short Videos

306

09 Apr 2025

A Robust Real-Time Lane Detection Method with Fog-Enhanced Feature Fusion for Foggy Conditions

617

08 Apr 2025

EMF: Event Meta Formers for Event-based Real-time Traffic Object Detection

Muhammad Ahmed Ullah Khan

Abdul Hannan Khan

Andreas Dengel

236

05 Apr 2025

Spline-based Transformers

European Conference on Computer Vision (ECCV), 2025

362

03 Apr 2025

Rip Current Segmentation: A Novel Benchmark and YOLOv8 Baseline Results

386

03 Apr 2025

FLAMES: A Hybrid Spiking-State Space Model for Adaptive Memory Retention in Event-Based Learning

Biswadeep Chakraborty

Saibal Mukhopadhyay

453

02 Apr 2025

rPPG-SysDiaGAN: Systolic-Diastolic Feature Localization in rPPG Using Generative Adversarial Network with Multi-Domain Discriminator

Banafsheh Adami

Nima Karimian

231

01 Apr 2025

GRU-AUNet: A Domain Adaptation Framework for Contactless Fingerprint Presentation Attack Detection

Silicon Valley Cybersecurity Conference (SVCC), 2025

Banafsheh Adami

Nima Karimian

195

01 Apr 2025

LATex: Leveraging Attribute-based Text Knowledge for Aerial-Ground Person Re-Identification

430

31 Mar 2025

Efficient Token Compression for Vision Transformer with Spatial Information Preserved

359

30 Mar 2025

FuXi-RTM: A Physics-Guided Prediction Framework with Radiative Transfer Modeling

304

25 Mar 2025

Data-driven Mesoscale Weather Forecasting Combining Swin-Unet and Diffusion Models

Yuta Hirabayashi

Daisuke Matsuoka

DiffM

188

25 Mar 2025

CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation

Computer Vision and Pattern Recognition (CVPR), 2025

273

23 Mar 2025

Fractal-IR: A Unified Framework for Efficient and Scalable Image Restoration

348

22 Mar 2025

Beyond Accuracy: What Matters in Designing Well-Behaved Image Classification Models?

449

21 Mar 2025

From Head to Tail: Efficient Black-box Model Inversion Attack via Long-tailed Learning

Computer Vision and Pattern Recognition (CVPR), 2025

391

20 Mar 2025

LIFT: Latent Implicit Functions for Task- and Data-Agnostic Encoding

Amirhossein Kazerouni

Soroush Mehraban

Michael Brudno

Babak Taati

280

19 Mar 2025

Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition

331

17 Mar 2025

CLIP-Free, Label-Free, Zero-Shot Concept Bottleneck Models

221

14 Mar 2025

MEET: A Million-Scale Dataset for Fine-Grained Geospatial Scene Classification with Zoom-Free Remote Sensing Imagery

IEEE/CAA Journal of Automatica Sinica (IEEE/CAA J. Autom. Sin.), 2025

...

216

14 Mar 2025

HeightFormer: Learning Height Prediction in Voxel Features for Roadside Vision Centric 3D Object Detection via Transformer

405

13 Mar 2025

Rethinking Two-Stage Referring-by-Tracking in Referring Multi-Object Tracking: Make it Strong Again

382

10 Mar 2025