Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

IEEE International Conference on Computer Vision (ICCV), 2021
25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
    ViT
ArXiv (abs) | PDF | HTML | HuggingFace (5 upvotes) | GitHub (14835★)

Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

50 / 8,530 papers shown
StyTr$^2$: Image Style Transfer with Transformers
Computer Vision and Pattern Recognition (CVPR), 2021
Yingying Deng
Fan Tang
Weiming Dong
Chongyang Ma
Xingjia Pan
Lei Wang
Changsheng Xu
ViT
398
369
0
30 May 2021
TransMatcher: Deep Image Matching Through Transformers for Generalizable Person Re-identification
Neural Information Processing Systems (NeurIPS), 2021
Tianran Ouyang
Ling Shao
ViT
260
72
0
30 May 2021
Gaze Estimation using Transformer
International Conference on Pattern Recognition (ICPR), 2021
Yihua Cheng
Feng Lu
ViT
226
137
0
30 May 2021
Deep Learning on Monocular Object Pose Detection and Tracking: A Comprehensive Overview
ACM Computing Surveys (CSUR), 2021
Zhaoxin Fan
Yazhi Zhu
Yulin He
Qi Sun
Hongyan Liu
Jun He
376
101
0
29 May 2021
Less is More: Pay Less Attention in Vision Transformers
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zizheng Pan
Bohan Zhuang
Haoyu He
Jing Liu
Jianfei Cai
ViT
349
104
0
29 May 2021
Augmenting Anchors by the Detector Itself
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Xiaopei Wan
Shengjie Chen
Yujiu Yang
Zhenhua Guo
ObjD
108
0
0
28 May 2021
PTNet: A High-Resolution Infant MRI Synthesizer Based on Transformer
Xuzhe Zhang
Xinzi He
Jia Guo
Nabil Ettehadi
Natalie Aw
David P. Semanek
J. Posner
Andrew F. Laine
Yun Wang
ViT
MedIm
126
33
0
28 May 2021
ResT: An Efficient Transformer for Visual Recognition
Neural Information Processing Systems (NeurIPS), 2021
Qing-Long Zhang
Yubin Yang
ViT
410
282
0
28 May 2021
KVT: k-NN Attention for Boosting Vision Transformers
European Conference on Computer Vision (ECCV), 2021
Pichao Wang
Qingsong Wen
F. Wang
Ming Lin
Shuning Chang
Hao Li
Rong Jin
ViT
263
130
0
28 May 2021
Recent advances and clinical applications of deep learning in medical image analysis
Xuxin Chen
Ximing Wang
Kecheng Zhang
K. Fung
T. Thai
K. Moore
Robert S. Mannel
Hong Liu
B. Zheng
Y. Qiu
OOD
439
821
0
27 May 2021
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zizhao Zhang
Han Zhang
Long Zhao
Ting Chen
Sercan O. Arik
Tomas Pfister
ViT
357
208
0
26 May 2021
Oriented RepPoints for Aerial Object Detection
Computer Vision and Pattern Recognition (CVPR), 2021
Wentong Li
Yijie Chen
Kaixuan Hu
Jianke Zhu
3DPC
547
429
0
24 May 2021
FineAction: A Fine-Grained Video Dataset for Temporal Action Localization
IEEE Transactions on Image Processing (TIP), 2021
Lu Dong
Limin Wang
Yali Wang
Xiao Ma
Yu Qiao
295
78
0
24 May 2021
Intriguing Properties of Vision Transformers
Neural Information Processing Systems (NeurIPS), 2021
Muzammal Naseer
Kanchana Ranasinghe
Salman Khan
Munawar Hayat
Fahad Shahbaz Khan
Ming-Hsuan Yang
ViT
687
755
0
21 May 2021
Content-Augmented Feature Pyramid Network with Light Linear Spatial Transformers for Object Detection
IET Image Processing (IET Image Process.), 2021
Yongxiang Gu
Xiaolin Qin
Yuncong Peng
Lu Li
ViT
233
8
0
20 May 2021
I2C2W: Image-to-Character-to-Word Transformers for Accurate Scene Text Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Chuhui Xue
Jiaxing Huang
Wenqing Zhang
Shijian Lu
Changhu Wang
S. Bai
336
21
0
18 May 2021
Pay Attention to MLPs
Neural Information Processing Systems (NeurIPS), 2021
Hanxiao Liu
Zihang Dai
David R. So
Quoc V. Le
AI4CE
624
807
0
17 May 2021
Towards Robust Vision Transformer
Computer Vision and Pattern Recognition (CVPR), 2021
Xiaofeng Mao
Gege Qi
YueFeng Chen
Xiaodan Li
Ranjie Duan
Shaokai Ye
Yuan He
Hui Xue
ViT
466
234
0
17 May 2021
Vision Transformers are Robust Learners
AAAI Conference on Artificial Intelligence (AAAI), 2021
Sayak Paul
Pin-Yu Chen
ViT
380
356
0
17 May 2021
Unsupervised MRI Reconstruction via Zero-Shot Learned Adversarial Transformers
IEEE Transactions on Medical Imaging (IEEE TMI), 2021
Yilmaz Korkmaz
S. Dar
Mahmut Yurt
Muzaffer Özbey
Tolga Çukur
ViT
MedIm
356
228
0
15 May 2021
Segmenter: Transformer for Semantic Segmentation
IEEE International Conference on Computer Vision (ICCV), 2021
Robin Strudel
Ricardo Garcia Pinel
Ivan Laptev
Cordelia Schmid
ViT
776
1,803
0
12 May 2021
Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation
Hu Cao
Yueyue Wang
Jieneng Chen
Dongsheng Jiang
Xiaopeng Zhang
Qi Tian
Manning Wang
ViT
MedIm
411
4,342
0
12 May 2021
A Large-Scale Benchmark for Food Image Segmentation
ACM Multimedia (ACM MM), 2021
Xiongwei Wu
Xin Fu
Ying Liu
Ee-Peng Lim
Guosheng Lin
Qianru Sun
VLM
177
102
0
12 May 2021
Hierarchical RNNs-Based Transformers MADDPG for Mixed Cooperative-Competitive Environments
Journal of Intelligent & Fuzzy Systems (JIFS), 2021
Xiaolong Wei
Lifang Yang
Xianglin Huang
Gang Cao
Zhulin Tao
Zhengyang Du
Jing An
201
7
0
11 May 2021
Self-Supervised Learning with Swin Transformers
Zhenda Xie
Yutong Lin
Zhuliang Yao
Zheng Zhang
Jingdong Sun
Yue Cao
Han Hu
ViT
304
205
0
10 May 2021
You Only Learn One Representation: Unified Network for Multiple Tasks
Journal of information science and engineering (JISE), 2021
Chien-Yao Wang
I-Hau Yeh
Hongpeng Liao
SSL
FedML
367
565
0
10 May 2021
MOTR: End-to-End Multiple-Object Tracking with Transformer
European Conference on Computer Vision (ECCV), 2021
Fangao Zeng
Bin Dong
Cheng Chen
Tiancai Wang
Xinming Zhang
Yichen Wei
VOT
589
698
0
07 May 2021
A State-of-the-art Survey of Object Detection Techniques in Microorganism Image Analysis: From Classical Methods to Deep Learning Approaches
Artificial Intelligence Review (AIR), 2021
Pingli Ma
Chen Li
M. Rahaman
Yudong Yao
Jiawei Zhang
Shuojia Zou
Xin Zhao
M. Grzegorzek
198
81
0
07 May 2021
Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet
Luke Melas-Kyriazi
ViT
122
116
0
06 May 2021
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Meng-Hao Guo
Zheng-Ning Liu
Tai-Jiang Mu
Shimin Hu
232
640
0
05 May 2021
Attention for Image Registration (AiR): an unsupervised Transformer approach
Zihao Wang
H. Delingette
ViT
MedIm
118
8
0
05 May 2021
Instances as Queries
IEEE International Conference on Computer Vision (ICCV), 2021
Yuxin Fang
Shusheng Yang
Xinggang Wang
Yu Li
Chen Fang
Ying Shan
Bin Feng
Wenyu Liu
ISeg
367
311
0
05 May 2021
TransHash: Transformer-based Hamming Hashing for Efficient Image Retrieval
International Conference on Multimedia Retrieval (ICMR), 2021
Yongbiao Chen
Shenmin Zhang
Fangxin Liu
Zhigang Chang
Mang Ye
Zhengwei Qi (Shanghai Jiao Tong University)
ViT
173
71
0
05 May 2021
AGMB-Transformer: Anatomy-Guided Multi-Branch Transformer Network for Automated Evaluation of Root Canal Therapy
IEEE journal of biomedical and health informatics (JBHI), 2021
Yunxiang Li
G. Zeng
Yifan Zhang
Jun Wang
Qianni Zhang
...
Neng Xia
Ruizi Peng
Kai Tang
Yaqi Wang
Shuai Wang
MedIm
AI4CE
302
33
0
02 May 2021
SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zhaoxin Fan
Zhenbo Song
Hongyan Liu
Zhiwu Lu
Jun He
Xiaoyong Du
3DPC
ViT
445
90
0
01 May 2021
Twins: Revisiting the Design of Spatial Attention in Vision Transformers
Neural Information Processing Systems (NeurIPS), 2021
Xiangxiang Chu
Zhi Tian
Yuqing Wang
Bo Zhang
Haibing Ren
Xiaolin K. Wei
Huaxia Xia
Chunhua Shen
ViT
666
1,232
0
28 Apr 2021
Self-distillation with Batch Knowledge Ensembling Improves ImageNet Classification
Yixiao Ge
Xiao Zhang
Ching Lam Choi
Ka Chun Cheung
Peipei Zhao
Feng Zhu
Xiaogang Wang
Rui Zhao
Jiaming Song
FedML
UQCV
331
36
0
27 Apr 2021
Vision Transformers with Patch Diversification
Chengyue Gong
Dilin Wang
Meng Li
Vikas Chandra
Qiang Liu
ViT
257
68
0
26 Apr 2021
Visformer: The Vision-friendly Transformer
IEEE International Conference on Computer Vision (ICCV), 2021
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
536
275
0
26 Apr 2021
A Novel Transformer Based Semantic Segmentation Scheme for Fine-Resolution Remote Sensing Images
IEEE Geoscience and Remote Sensing Letters (GRSL), 2021
Libo Wang
Rui Li
Chenxi Duan
Ce Zhang
Xiaoliang Meng
Shenghui Fang
ViT
436
350
0
25 Apr 2021
A Survey of Modern Deep Learning based Object Detection Models
Syed Sahil Abbas Zaidi
M. S. Ansari
Asra Aslam
N. Kanwal
M. Asghar
Brian Lee
VLM
ObjD
357
846
0
24 Apr 2021
VidTr: Video Transformer Without Convolutions
IEEE International Conference on Computer Vision (ICCV), 2021
Yanyi Zhang
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Biagio Brattoli
Hao Chen
I. Marsic
Joseph Tighe
ViT
429
220
0
23 Apr 2021
Multiscale Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
483
1,527
0
22 Apr 2021
All Tokens Matter: Token Labeling for Training Better Vision Transformers
Neural Information Processing Systems (NeurIPS), 2021
Zihang Jiang
Qibin Hou
Li-xin Yuan
Daquan Zhou
Yujun Shi
Xiaojie Jin
Anran Wang
Jiashi Feng
ViT
403
237
0
22 Apr 2021
Generative Transformer for Accurate and Reliable Salient Object Detection
Yuxin Mao
Jing Zhang
Zhexiong Wan
Yuchao Dai
Aixuan Li
Yun-Qiu Lv
Xinyu Tian
Deng-Ping Fan
Nick Barnes
ViT
461
50
0
20 Apr 2021
CTNet: Context-based Tandem Network for Semantic Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Zechao Li
Yanpeng Sun
Jinhui Tang
168
225
0
20 Apr 2021
Vision Transformer Pruning
Mingjian Zhu
Yehui Tang
Kai Han
ViT
494
113
0
17 Apr 2021
Co-Scale Conv-Attentional Image Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Weijian Xu
Yifan Xu
Tyler A. Chang
Zhuowen Tu
ViT
287
437
0
13 Apr 2021
Deepfake Detection Scheme Based on Vision Transformer and Distillation
Young-Jin Heo
Y. Choi
Young-Woon Lee
Byung-Gyu Kim
ViT
188
72
0
03 Apr 2021
Bridging Global Context Interactions for High-Fidelity Image Completion
Computer Vision and Pattern Recognition (CVPR), 2021
Chuanxia Zheng
Tat-Jen Cham
Jianfei Cai
Dinh Q. Phung
ViT
175
102
0
02 Apr 2021