Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

IEEE International Conference on Computer Vision (ICCV), 2021

25 March 2021

Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

50 / 8,530 papers shown

Particle Trajectory Representation Learning with Masked Point Modeling

346

04 Feb 2025

MATCNN: Infrared and Visible Image Fusion Method Based on Multi-scale CNN with Attention Transformer

IEEE Transactions on Instrumentation and Measurement (IEEE Trans. Instrum. Meas.), 2025

287

04 Feb 2025

SimBEV: A Synthetic Multi-Task Multi-Sensor Driving Data Generation Tool and Dataset

Goodarz Mehr

A. Eskandarian

650

04 Feb 2025

A Framework for Double-Blind Federated Adaptation of Foundation Models

Nurbek Tastan

Karthik Nandakumar

FedML

322

03 Feb 2025

LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation

International Conference on Learning Representations (ICLR), 2025

1.1K

02 Feb 2025

BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts

International Conference on Learning Representations (ICLR), 2025

Divya J. Bajpai

M. Hanawal

426

02 Feb 2025

Contrastive Forward-Forward: A Training Algorithm of Vision Transformer

Neural Networks (NN), 2025

Hossein Aghagolzadeh

Mehdi Ezoji

ViT

464

01 Feb 2025

Leveraging Stable Diffusion for Monocular Depth Estimation via Image Semantic Encoding

Towards Autonomous Robotic Systems (TAROS), 2025

313

01 Feb 2025

Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion

667

01 Feb 2025

LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models

Computer Vision and Pattern Recognition (CVPR), 2025

453

31 Jan 2025

Ground Awareness in Deep Learning for Large Outdoor Point Cloud Segmentation

VISIGRAPP (VISIGRAPP), 2025

269

30 Jan 2025

VICCA: Visual Interpretation and Comprehension of Chest X-ray Anomalies in Generated Report Without Human Feedback

Machine Learning with Applications (MLWA), 2025

Sayeh Gholipour Picha

D. Chanti

A. Caplier

MedIm

338

29 Jan 2025

B-RIGHT: Benchmark Re-evaluation for Integrity in Generalized Human-Object Interaction Testing

187

28 Jan 2025

State-space models are accurate and efficient neural operators for dynamical systems

Zheyuan Hu

Nazanin Ahmadi Daryakenari

496

28 Jan 2025

MultiPDENet: PDE-embedded Learning with Multi-time-stepping for Accelerated Flow Simulation

302

28 Jan 2025

SpatioTemporal Learning for Human Pose Estimation in Sparsely-Labeled Videos

AAAI Conference on Artificial Intelligence (AAAI), 2025

400

28 Jan 2025

V2X-DGPE: Addressing Domain Gaps and Pose Errors for Robust Collaborative 3D Object Detection

358

28 Jan 2025

Collective Intelligence for 2D Push Manipulations with Mobile Robots

IEEE Robotics and Automation Letters (RA-L), 2022

447

28 Jan 2025

Radiologist-in-the-Loop Self-Training for Generalizable CT Metal Artifact Reduction

IEEE Transactions on Medical Imaging (IEEE TMI), 2025

220

28 Jan 2025

MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis

422

28 Jan 2025

Prion-ViT: Prions-Inspired Vision Transformers for Temperature prediction with Specklegrams

310

28 Jan 2025

Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection

Heqian Qiu

Hongliang Li

ObjD VLM

1.0K

28 Jan 2025

iFormer: Integrating ConvNet and Transformer for Mobile Application

International Conference on Learning Representations (ICLR), 2025

Chuanyang Zheng

ViT

397

26 Jan 2025

Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data

International Conference on Learning Representations (ICLR), 2025

435

25 Jan 2025

PolaFormer: Polarity-aware Linear Attention for Vision Transformers

International Conference on Learning Representations (ICLR), 2025

1.1K

25 Jan 2025

Rethinking Encoder-Decoder Flow Through Shared Structures

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

251

24 Jan 2025

Revisiting CLIP: Efficient Alignment of 3D MRI and Tabular Data using Domain-Specific Foundation Models

IEEE International Symposium on Biomedical Imaging (ISBI), 2025

190

23 Jan 2025

FreEformer: Frequency Enhanced Transformer for Multivariate Time Series Forecasting

International Joint Conference on Artificial Intelligence (IJCAI), 2025

209

23 Jan 2025

Unified CNNs and transformers underlying learning mechanism reveals multi-head attention modus vivendi

383

22 Jan 2025

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Computer Vision and Pattern Recognition (CVPR), 2025

847

21 Jan 2025

Vision-Language Models for Automated Chest X-ray Interpretation: Leveraging ViT and GPT-2

Engineering Reports (ER), 2025

Md. Rakibul Islam

Md. Zahid Hossain

Mustofa Ahmed

Most. Sharmin Sultana Samu

LM&MA MedIm

239

21 Jan 2025

Towards Accurate Unified Anomaly Segmentation

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025

367

21 Jan 2025

A margin-based replacement for cross-entropy loss

Michael W. Spratling

Heiko H. Schütt

319

21 Jan 2025

Scalable Whole Slide Image Representation Using K-Mean Clustering and Fisher Vector Aggregation

IEEE International Symposium on Biomedical Imaging (ISBI), 2025

163

21 Jan 2025

Comparative Analysis of Pre-trained Deep Learning Models and DINOv2 for Cushing's Syndrome Diagnosis in Facial Analysis

Annual International Computer Software and Applications Conference (COMPSAC), 2025

21 Jan 2025

TFLOP: Table Structure Recognition Framework with Layout Pointer Mechanism

International Joint Conference on Artificial Intelligence (IJCAI), 2024

Minsoo Khang

Teakgyu Hong

LMTD

314

21 Jan 2025

UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model

391

21 Jan 2025

A Survey on Memory-Efficient Transformer-Based Model Training in AI for Science

383

21 Jan 2025

A generalizable 3D framework and model for self-supervised learning in medical imaging

343

20 Jan 2025

Subjective and Objective Quality Assessment of Non-Uniformly Distorted Omnidirectional Images

IEEE Transactions on Multimedia (TMM), 2025

181

20 Jan 2025

ACE: Anatomically Consistent Embeddings in Composition and Decomposition

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025

358

20 Jan 2025

3rd Workshop on Maritime Computer Vision (MaCVi) 2025: Challenge Results

...

199

20 Jan 2025

Elucidating the Design Space of Dataset Condensation

Neural Information Processing Systems (NeurIPS), 2024

755

20 Jan 2025

Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025

Michael Schwingshackl

Fabio Francisco Oberweger

Markus Murschitz

266

20 Jan 2025

MRI2Speech: Speech Synthesis from Articulatory Movements Recorded by Real-time MRI

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

240

20 Jan 2025

A Comprehensive Survey of Foundation Models in Medicine

IEEE Reviews in Biomedical Engineering (RBME), 2024

780

17 Jan 2025

FutureDepth: Learning to Predict the Future Improves Video Depth Estimation

European Conference on Computer Vision (ECCV), 2024

516

17 Jan 2025

Geometric Distortion Guided Transformer for Omnidirectional Image Super-Resolution

457

17 Jan 2025

MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation

IEEE International Conference on Computer Vision (ICCV), 2023

601

17 Jan 2025

Unified Face Matching and Physical-Digital Spoofing Attack Detection

Arun Kunwar

Ajita Rattani

CVBM AAML

289

17 Jan 2025