Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
IEEE International Conference on Computer Vision (ICCV), 2021
25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
    ViT

Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

50 / 8,588 papers shown
3rd Workshop on Maritime Computer Vision (MaCVi) 2025: Challenge Results
Benjamin Kiefer
Lojze Žust
Jon Muhovič
Matej Kristan
J. Pers
...
Ashraf Saleem
Ching-Heng Cheng
Yu-Fan Lin
Tzu-Yu Lin
Chih-Chung Hsu
204
7
0
20 Jan 2025
MRI2Speech: Speech Synthesis from Articulatory Movements Recorded by Real-time MRI
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
N. Shah
Ayan Kashyap
Shirish S. Karande
Vineet Gandhi
246
1
0
20 Jan 2025
CSHNet: A Novel Information Asymmetric Image Translation Method
Xi Yang
Haoyuan Shi
Zihan Wang
N. Wang
Xinbo Gao
113
1
0
20 Jan 2025
Elucidating the Design Space of Dataset Condensation
Neural Information Processing Systems (NeurIPS), 2024
Shitong Shao
Zikai Zhou
Huanran Chen
Zhiqiang Shen
DD
773
26
0
20 Jan 2025
Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025
Michael Schwingshackl
Fabio Francisco Oberweger
Markus Murschitz
271
1
0
20 Jan 2025
Geometric Distortion Guided Transformer for Omnidirectional Image Super-Resolution
Cuixin Yang
Rongkang Dong
Jun Xiao
Cong Zhang
Kin-Man Lam
Fei Zhou
Guoping Qiu
490
8
0
17 Jan 2025
MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation
IEEE International Conference on Computer Vision (ICCV), 2023
R. Yasarla
H. Cai
Jisoo Jeong
Y. Shi
Risheek Garrepalli
Fatih Porikli
MDE
666
28
0
17 Jan 2025
FutureDepth: Learning to Predict the Future Improves Video Depth Estimation
European Conference on Computer Vision (ECCV), 2024
R. Yasarla
Manish Kumar Singh
Hong Cai
Yunxiao Shi
Jisoo Jeong
Yinhao Zhu
Shizhong Han
Risheek Garrepalli
Fatih Porikli
MDE
527
12
0
17 Jan 2025
A Comprehensive Survey of Foundation Models in Medicine
IEEE Reviews in Biomedical Engineering (RBME), 2024
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE LM&MA VLM
797
85
0
17 Jan 2025
Unified Face Matching and Physical-Digital Spoofing Attack Detection
Arun Kunwar
Ajita Rattani
CVBM AAML
308
0
0
17 Jan 2025
WMamba: Wavelet-based Mamba for Face Forgery Detection
Siran Peng
Tianshuo Zhang
Li Gao
Xiangyu Zhu
Huatian Zhang
Kai Pang
Zhen Lei
Mamba
392
10
0
16 Jan 2025
NeurOp-Diff: Continuous Remote Sensing Image Super-Resolution via Neural Operator Diffusion
Zihao Xu
Yuzhi Tang
Bowen Xu
Qingquan Li
DiffM
332
6
0
15 Jan 2025
Self Pre-training with Adaptive Mask Autoencoders for Variable-Contrast 3D Medical Imaging
IEEE International Symposium on Biomedical Imaging (ISBI), 2025
Badhan Kumar Das
Gengyan Zhao
Han Liu
Thomas J. Re
Dorin Comaniciu
Eli Gibson
Andreas Maier
ViT MedIm
233
3
0
15 Jan 2025
Towards Lightweight Time Series Forecasting: a Patch-wise Transformer with Weak Data Enriching
IEEE International Conference on Data Engineering (ICDE), 2025
Meng Wang
Jintao Yang
Bin Yang
Hui Li
Tongxin Gong
Bo Yang
Jiangtao Cui
AI4TS
170
6
0
14 Jan 2025
AVS-Mamba: Exploring Temporal and Multi-modal Mamba for Audio-Visual Segmentation
IEEE Transactions on Multimedia (TMM), 2025
Sitong Gong
Yunzhi Zhuge
Lu Zhang
Yifan Wang
Pingping Zhang
Lijun Wang
Huchuan Lu
Mamba VOS
150
15
0
14 Jan 2025
Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024
Yunzhi Zhuge
Hongyu Gu
Lu Zhang
Jinqing Qi
Huchuan Lu
VOS
488
9
0
14 Jan 2025
Kolmogorov-Arnold Network for Remote Sensing Image Semantic Segmentation
Xianping Ma
Ziyao Wang
Yin Hu
Xiaokang Zhang
Man-On Pun
258
6
0
13 Jan 2025
Toward Realistic Camouflaged Object Detection: Benchmarks and Method
Zhimeng Xin
Tianxu Wu
Shiming Chen
Shuo Ye
Zijing Xie
Yixiong Zou
Xinge You
Yufei Guo
205
0
0
13 Jan 2025
EdgeTAM: On-Device Track Anything Model
Computer Vision and Pattern Recognition (CVPR), 2025
Chong Zhou
Chenchen Zhu
Yunyang Xiong
Saksham Suri
Fanyi Xiao
...
Raghuraman Krishnamoorthi
Bo Dai
Chen Change Loy
Vikas Chandra
Bilge Soran
VLM
327
12
0
13 Jan 2025
MathReader: Text-to-Speech for Mathematical Documents
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Sieun Hyeon
Kyudan Jung
N. Kim
Hyun Gon Ryu
Jaeyoung Do
322
5
0
13 Jan 2025
Protego: Detecting Adversarial Examples for Vision Transformers via Intrinsic Capabilities
Jialin Wu
Kaikai Pan
Yanjiao Chen
Jiangyi Deng
Shengyuan Pang
Wei Dong
ViT AAML
293
1
0
13 Jan 2025
Rethinking Knowledge in Distillation: An In-context Sample Retrieval Perspective
Jinjing Zhu
Songze Li
Lin Wang
332
0
0
13 Jan 2025
Wavelet Meets Adam: Compressing Gradients for Memory-Efficient Training
Ziqing Wen
Ping Luo
Jun Wang
Xiaoge Deng
Jinping Zou
Kun Yuan
Tao Sun
Dongsheng Li
CLL
361
0
0
13 Jan 2025
CoreNet: Conflict Resolution Network for Point-Pixel Misalignment and Sub-Task Suppression of 3D LiDAR-Camera Object Detection
Information Fusion (Inf. Fusion), 2025
Yongqian Li
Yang Yang
Zhen Lei
3DPC
278
6
0
11 Jan 2025
YO-CSA-T: A Real-time Badminton Tracking System Utilizing YOLO Based on Contextual and Spatial Attention
Yuan Lai
Zhiwei Shi
Chengxi Zhu
85
3
0
11 Jan 2025
Mix-QViT: Mixed-Precision Vision Transformer Quantization Driven by Layer Importance and Quantization Sensitivity
Navin Ranjan
Andreas E. Savakis
MQ
220
6
0
10 Jan 2025
HFMF: Hierarchical Fusion Meets Multi-Stream Models for Deepfake Detection
Anant Mehta
Bryant McArthur
Nagarjuna Kolloju
Zhengzhong Tu
300
6
0
10 Jan 2025
Hyper-3DG: Text-to-3D Gaussian Generation via Hypergraph
International Journal of Computer Vision (IJCV), 2024
Donglin Di
Jiahui Yang
Chaofan Luo
Zhou Xue
Wei Chen
Xun Yang
Yue Gao
3DGS
357
20
0
10 Jan 2025
BRIGHT: A globally distributed multimodal building damage assessment dataset with very-high-resolution for all-weather disaster response
Hongruixuan Chen
Jian Song
Olivier Dietrich
Clifford Broni-bediako
Weihao Xuan
...
Yimin Wei
J. Xia
Cuiling Lan
Konrad Schindler
Xiangwei Zhu
956
40
0
10 Jan 2025
BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs
Sheng Zhang
Yanbo Xu
Naoto Usuyama
Hanwen Xu
J. Bagga
...
Carlo Bifulco
M. Lungren
Tristan Naumann
Sheng Wang
Hoifung Poon
LM&MA MedIm
864
487
0
10 Jan 2025
MHAFF: Multi-Head Attention Feature Fusion of CNN and Transformer for Cattle Identification
IEEE Transactions on AgriFood Electronics (TAE), 2025
Rabin Dulal
Lihong Zheng
M. A. Kabir
ViT
121
6
0
10 Jan 2025
MS-Temba: Multi-Scale Temporal Mamba for Understanding Long Untrimmed Videos
Arkaprava Sinha
Monish Soundar Raj
Pu Wang
Ahmed Helmy
Srijan Das
Mamba
619
5
0
10 Jan 2025
CAMs as Shapley Value-based Explainers
The Visual Computer (Vis. Comput.), 2025
Huaiguang Cai
FAtt
239
4
0
09 Jan 2025
GUPNet++: Geometry Uncertainty Propagation Network for Monocular 3D Object Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Yan Lu
Cheng Wang
Lei Yang
Tianzhu Zhang
Yating Liu
Qi Chu
Tong He
Yonghui Li
W. Ouyang
534
17
0
08 Jan 2025
AutoFish: Dataset and Benchmark for Fine-grained Analysis of Fish
Nejc Novak
Daniel Lehotský
Vasiliki Ismiroglou
Niels Madsen
T. Moeslund
Malte Pedersen
176
2
0
08 Jan 2025
Learning Informative Latent Representation for Quantum State Tomography
IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI), 2023
Hailan Ma
Zhenhong Sun
Daoyi Dong
Dong Gong
318
4
0
08 Jan 2025
Flemme: A Flexible and Modular Learning Platform for Medical Images
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2024
Guoqing Zhang
Jingyun Yang
Yang Li
MedIm
289
2
0
08 Jan 2025
Clinical Insights: A Comprehensive Review of Language Models in Medicine
PLOS Digital Health (PDH), 2024
Nikita Neveditsin
Pawan Lingras
V. Mago
LM&MA
597
21
0
08 Jan 2025
Siamese-DETR for Generic Multi-Object Tracking
IEEE Transactions on Image Processing (IEEE TIP), 2023
Qiankun Liu
Yichen Li
Yuqi Jiang
Ying Fu
VOT
317
16
0
08 Jan 2025
BEN: Using Confidence-Guided Matting for Dichotomous Image Segmentation
Maxwell Meyer
Jack Spruyt
329
6
0
08 Jan 2025
Vision Transformer Neural Architecture Search for Out-of-Distribution Generalization: Benchmark and Insights
Neural Information Processing Systems (NeurIPS), 2025
Sy-Tuyen Ho
Tuan Van Vo
Somayeh Ebrahimkhani
Ngai-Man Cheung
324
1
0
08 Jan 2025
MedFocusCLIP: Improving few shot classification in medical datasets using pixel wise attention
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Aadya Arora
Vinay Namboodiri
VLM
63
3
0
08 Jan 2025
NBBOX: Noisy Bounding Box Improves Remote Sensing Object Detection
IEEE Geoscience and Remote Sensing Letters (GRSL), 2024
Yechan Kim
SooYeon Kim
Moongu Jeon
ViT
394
6
0
08 Jan 2025
Activating Associative Disease-Aware Vision Token Memory for LLM-Based X-ray Report Generation
IEEE Transactions on Medical Imaging (IEEE TMI), 2025
Xinyu Wang
Fuling Wang
Haowen Wang
Bo Jiang
Chuanfu Li
Longji Xu
Yonghong Tian
Jin Tang
MedIm
219
6
0
08 Jan 2025
Tighnari: Multi-modal Plant Species Prediction Based on Hierarchical Cross-Attention Using Graph-Based and Vision Backbone-Extracted Features
Conference and Labs of the Evaluation Forum (CLEF), 2025
Haixu Liu
Penghao Jiang
Zerui Tao
Muyan Wan
Qiuzhuang Sun
91
3
0
07 Jan 2025
PARF-Net: integrating pixel-wise adaptive receptive fields into hybrid Transformer-CNN network for medical image segmentation
Xu Ma
Mengsheng Chen
Junhui Zhang
Lijuan Song
Fang Du
Zhenhua Yu
ViT MedIm
336
0
0
06 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Jiayi Zhang
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
475
34
0
06 Jan 2025
Multilevel Semantic-Aware Model for AI-Generated Video Quality Assessment
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Jiaze Li
Haoran Xu
Shiding Zhu
Junwei He
Haozhao Wang
VGen EGVM DiffM
154
2
0
06 Jan 2025
MObI: Multimodal Object Inpainting Using Diffusion Models
Alexandru Buburuzan
Anuj Sharma
John Redford
P. Dokania
Romain Mueller
DiffM
451
4
0
06 Jan 2025
Facial Attractiveness Prediction in Live Streaming: A New Benchmark and Multi-modal Method
Haoyang Li
Xiaoyu Ren
Hongjiu Yu
Huiyu Duan
Kai Li
Ying Chen
Libo Wang
Xiongkuo Min
Guoquan Zheng
Xu Liu
CVBM
425
1
0
05 Jan 2025