Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

IEEE International Conference on Computer Vision (ICCV), 2021
25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
    ViT
ArXiv (abs) | PDF | HTML | HuggingFace (5 upvotes) | GitHub (14835★)

Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

50 / 8,523 papers shown
Rethinking Decoupled Knowledge Distillation: A Predictive Distribution Perspective
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2025
Bowen Zheng
Ran Cheng
106
1
0
04 Dec 2025
Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2025
Xiangyi Gao
Danpei Zhao
Bo Yuan
Wentao Li
75
0
0
04 Dec 2025
Self-Supervised Learning for Transparent Object Depth Completion Using Depth from Non-Transparent Objects
IEEE International Conference on Multimedia and Expo (ICME), 2025
Xianghui Fan
Zhaoyu Chen
Mengyang Pan
Anping Deng
Hang Yang
109
0
0
04 Dec 2025
GeoPE: A Unified Geometric Positional Embedding for Structured Tensors
Yupu Yao
Bowen Yang
MDE
294
0
0
04 Dec 2025
Shift-Window Meets Dual Attention: A Multi-Model Architecture for Specular Highlight Removal
Tianci Huo
Lingfeng Qi
Yuhan Chen
Qihong Xue
Jinyuan Shao
Hai Yu
Jie Li
Zhanhua Zhang
Guofa Li
89
0
0
04 Dec 2025
Stable Single-Pixel Contrastive Learning for Semantic and Geometric Tasks
Leonid Pogorelyuk
Niels Bracher
Aaron Verkleeren
Lars Kühmichel
Stefan T. Radev
SSL
281
0
0
04 Dec 2025
HBFormer: A Hybrid-Bridge Transformer for Microtumor and Miniature Organ Segmentation
Fuchen Zheng
Xinyi Chen
Weixuan Li
Quanjun Li
J. Zhou
Xiaojiao Guo
Xuhang Chen
Chi-Man Pun
Shoujun Zhou
MedIm
264
0
0
03 Dec 2025
Dynamic Content Moderation in Livestreams: Combining Supervised Classification with MLLM-Boosted Similarity Matching
Wei Chee Yew
Hailun Xu
Sanjay Saha
Xiaotian Fan
Hiok Hian Ong
David Yuchen Wang
Kanchan Sarkar
Zhenheng Yang
Danhui Guan
80
0
0
03 Dec 2025
MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms
Conference on Multimedia Modeling (MMM), 2025
Jiahao Zhang
Xiao Zhao
Guangyu Gao
82
0
0
03 Dec 2025
Dual Cross-Attention Siamese Transformer for Rectal Tumor Regrowth Assessment in Watch-and-Wait Endoscopy
J. T. Gomez
Despoina Kanata
Aneesh Rangnekar
Christina Lee
J. Garcia-Aguilar
Joshua Jesse Smith
Harini Veeraraghavan
35
0
0
03 Dec 2025
DisentangleFormer: Spatial-Channel Decoupling for Multi-Channel Vision
Jiashu Liao
Pietro Liò
Marc de Kamps
Duygu Sarikaya
104
0
0
03 Dec 2025
ESACT: An End-to-End Sparse Accelerator for Compute-Intensive Transformers via Local Similarity
Hongxiang Liu
Zhifang Deng
Tong Pu
Shengli Lu
162
0
0
02 Dec 2025
BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection
Guowen Zhang
Chenhang He
Liyi Chen
Lei Zhang
69
0
0
02 Dec 2025
Layout Anything: One Transformer for Universal Room Layout Estimation
Md Sohag Mia
Muhammad Abdullah Adnan
ViT, 3DV
132
0
0
02 Dec 2025
Unrolled Networks are Conditional Probability Flows in MRI Reconstruction
Kehan Qi
Saumya Gupta
Qingqiao Hu
Weimin Lyu
Chao Chen
MedIm
269
0
0
02 Dec 2025
DF-Mamba: Deformable State Space Modeling for 3D Hand Pose Estimation in Interactions
Yifan Zhou
Takehiko Ohkawa
Guwenxiao Zhou
Kanoko Goto
Takumi Hirose
Yusuke Sekikawa
Nakamasa Inoue
3DH, Mamba
444
0
0
02 Dec 2025
GraphFusion3D: Dynamic Graph Attention Convolution with Adaptive Cross-Modal Transformer for 3D Object Detection
Md Sohag Mia
Md Nahid Hasan
Tawhid Ahmed
Muhammad Abdullah Adnan
3DPC, ViT
216
0
0
02 Dec 2025
Boosting Medical Vision-Language Pretraining via Momentum Self-Distillation under Limited Computing Resources
Phuc Pham
Nhu Pham
Ngoc Quoc Ly
VLM
170
0
0
02 Dec 2025
Data-Centric Visual Development for Self-Driving Labs
Anbang Liu
Guanzhong Hu
Jiayi Wang
Ping Guo
Han Liu
139
0
0
01 Dec 2025
ResDiT: Evoking the Intrinsic Resolution Scalability in Diffusion Transformers
Yiyang Ma
Feng Zhou
Xuedan Yin
Pu Cao
Yonghao Dang
Jianqin Yin
100
0
0
01 Dec 2025
OpenREAD: Reinforced Open-Ended Reasoning for End-to-End Autonomous Driving with LLM-as-Critic
Songyan Zhang
Wenhui Huang
Zhan Chen
Chua Jiahao Collister
Qihang Huang
Chen Lv
OffRL, LRM
211
2
0
01 Dec 2025
ELVIS: Enhance Low-Light for Video Instance Segmentation in the Dark
Joanne Lin
Ruirui Lin
Yini Li
David Bull
Nantheera Anantrasirichai
166
0
0
01 Dec 2025
Robust Rigid and Non-Rigid Medical Image Registration Using Learnable Edge Kernels
Ahsan Raza Siyal
Markus Haltmeier
R. Steiger
Malik Galijasevic
E. Gizewski
A. E. Grams
129
0
0
01 Dec 2025
Toward Content-based Indexing and Retrieval of Head and Neck CT with Abscess Segmentation
Thao Thi Phuong Dao
Tan-Cong Nguyen
Trong-Le Do
Truong Hoang Viet
Nguyen Chi Thanh
...
T. Le
Vo Thanh Toan
T. Nguyen
Minh-Triet Tran
Thanh Dinh Le
112
0
0
01 Dec 2025
PointNet4D: A Lightweight 4D Point Cloud Video Backbone for Online and Offline Perception in Robotic Applications
Yunze Liu
Zifan Wang
Peiran Wu
Jiayang Ao
3DPC
153
0
0
01 Dec 2025
SGDiff: Scene Graph Guided Diffusion Model for Image Collaborative SegCaptioning
AAAI Conference on Artificial Intelligence (AAAI), 2025
Xu Zhang
Jin Yuan
Hanwang Zhang
Guojin Zhong
Yongsheng Zang
Jiacheng Lin
Zhiyong Li
DiffM, VLM
136
1
0
01 Dec 2025
nnMobileNet++: Towards Efficient Hybrid Networks for Retinal Image Analysis
Xin Li
Wenhui Zhu
Xuanzhao Dong
Hao Wang
Yujian Xiong
Oana Dumitrascu
Yalin Wang
MedIm
157
0
0
01 Dec 2025
ViT$^3$: Unlocking Test-Time Training in Vision
Dongchen Han
Y. Li
Tianyu Li
Z. Cao
Ziming Wang
Jun Song
Yu Cheng
Bo Zheng
Gao Huang
ViT
85
0
0
01 Dec 2025
Disentangling Progress in Medical Image Registration: Beyond Trend-Driven Architectures towards Domain-Specific Strategies
Bailiang Jian
J. Pan
Rohit Jena
Morteza Ghahremani
Hongwei Bran Li
Daniel Rueckert
Christian Wachinger
Benedikt Wiestler
OOD
196
1
0
01 Dec 2025
FlashVGGT: Efficient and Scalable Visual Geometry Transformers with Compressed Descriptor Attention
Zipeng Wang
Dan Xu
ViT
121
1
0
01 Dec 2025
Joint Multi-scale Gated Transformer and Prior-guided Convolutional Network for Learned Image Compression
Zhengxin Chen
Xiaohai He
Tingrong Zhang
Shuhua Xiong
Chao Ren
ViT
65
0
0
30 Nov 2025
LAHNet: Local Attentive Hashing Network for Point Cloud Registration
Wentao Qu
Xiaoshui Huang
Liang Xiao
3DPC
132
0
0
30 Nov 2025
OmniFD: A Unified Model for Versatile Face Forgery Detection
Haotian Liu
Haoyu Chen
Chenhui Pan
You Hu
Guoying Zhao
Xiaobai Li
CVBM
322
0
0
30 Nov 2025
Parameter Reduction Improves Vision Transformers: A Comparative Study of Sharing and Width Reduction
Anantha Padmanaban Krishna Kumar
ViT
70
0
0
30 Nov 2025
SceneProp: Combining Neural Network and Markov Random Field for Scene-Graph Grounding
Keita Otani
Tatsuya Harada
80
0
0
30 Nov 2025
Cross-Domain Federated Semantic Communication with Global Representation Alignment and Domain-Aware Aggregation
Loc X. Nguyen
J. Yoon
Huy Q. Le
Yu Qiao
Avi Deb Raha
Eui-nam Huh
Walid Saad
Dusit Niyato
Zhu Han
Choong Seon Hong
FedML
166
0
0
30 Nov 2025
Silhouette-based Gait Foundation Model
Dingqiang Ye
Chao Fan
Kartik Narayan
Bingzhe Wu
Chengwen Luo
Jianqiang Li
Vishal M. Patel
68
0
0
30 Nov 2025
Structured Context Learning for Generic Event Boundary Detection
Xin Gu
Congcong Li
Xinyao Wang
Dexiang Hong
Libo Zhang
Tiejian Luo
Longyin Wen
Heng Fan
86
0
0
29 Nov 2025
HIMOSA: Efficient Remote Sensing Image Super-Resolution with Hierarchical Mixture of Sparse Attention
Y. Liu
Yi Wan
Xinyi Liu
Qiong Wu
Panwang Xia
Xuejun Huang
Y. Zhang
91
0
0
29 Nov 2025
Optimizing Distributional Geometry Alignment with Optimal Transport for Generative Dataset Distillation
Xiao Cui
Yulei Qin
Wengang Zhou
Hongsheng Li
Houqiang Li
DD, OT
237
1
0
29 Nov 2025
UniGeoSeg: Towards Unified Open-World Segmentation for Geospatial Scenes
Shuo Ni
Di Wang
He Chen
Haonan Guo
Ning Zhang
Jing Zhang
AI4TS, VLM
224
1
0
28 Nov 2025
Learning to Predict Aboveground Biomass from RGB Images with 3D Synthetic Scenes
Silvia Zuffi
123
0
0
28 Nov 2025
TWEO: Transformers Without Extreme Outliers Enables FP8 Training And Quantization For Dummies
Guang Liang
Jie Shao
Ningyuan Tang
Xinyao Liu
Jianxin Wu
MQ
192
0
0
28 Nov 2025
Transformer-Driven Triple Fusion Framework for Enhanced Multimodal Author Intent Classification in Low-Resource Bangla
Ariful Islam
Tanvir Mahmud
Md Rifat Hossen
ViT
181
0
0
28 Nov 2025
Stable-Drift: A Patient-Aware Latent Drift Replay Method for Stabilizing Representations in Continual Learning
P. Theofilou
Anuhya Thota
Stefanos D. Kollias
Mamatha Thota
MedIm
314
0
0
27 Nov 2025
UMind-VL: A Generalist Ultrasound Vision-Language Model for Unified Grounded Perception and Comprehensive Interpretation
Dengbo Chen
Ziwei Zhao
Kexin Zhang
Shishuang Zhao
J. Hou
...
AnLan Sun
Fei Gao
Jia Ding
Y. Liu
Dong Wang
VLM
125
0
0
27 Nov 2025
Small Object Detection for Birds with Swin Transformer
Da Huo
Marc A. Kastner
Tingwei Liu
Yasutomo Kawanishi
Takatsugu Hirayama
Takahiro Komamizu
Ichiro Ide
ObjD, ViT
154
10
0
27 Nov 2025
IMTalker: Efficient Audio-driven Talking Face Generation with Implicit Motion Transfer
Bo Chen
Tao Liu
Qi Chen
Xie Chen
Zilong Zheng
VGen
100
0
0
27 Nov 2025
Hard Spatial Gating for Precision-Driven Brain Metastasis Segmentation: Addressing the Over-Segmentation Paradox in Deep Attention Networks
Rowzatul Zannath Prerona
101
0
0
27 Nov 2025
Rethinking Cross-Generator Image Forgery Detection through DINOv3
Zhenglin Huang
Jason Li
Haiquan Wen
Tianxiao Li
Xi Yang
Lu Qi
Bei Peng
Xiaowei Huang
Ming-Hsuan Yang
Guangliang Cheng
89
0
0
27 Nov 2025