Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
IEEE International Conference on Computer Vision (ICCV), 2021
25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
    ViT

Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

50 / 8,525 papers shown
An Information Theory-inspired Strategy for Automatic Network Pruning
Xiawu Zheng
Yuexiao Ma
Teng Xi
Gang Zhang
Errui Ding
Yuchao Li
Jie Chen
Yonghong Tian
Rongrong Ji
628
19
0
19 Aug 2021
Boosting Salient Object Detection with Transformer-based Asymmetric Bilateral U-Net
Yu Qiu
Yun-Hai Liu
Le Zhang
Jing Xu
ViT
271
44
0
17 Aug 2021
Fully Convolutional Networks for Panoptic Segmentation with Point-based Supervision
Yanwei Li
Hengshuang Zhao
Xiaojuan Qi
Yukang Chen
Lu Qi
Liwei Wang
Zeming Li
Jian Sun
Jiaya Jia
208
59
0
17 Aug 2021
Escaping the Gradient Vanishing: Periodic Alternatives of Softmax in Attention Mechanism
Shulun Wang
Yinan Han
Yifan Zhang
265
18
0
16 Aug 2021
FaPN: Feature-aligned Pyramid Network for Dense Image Prediction
Shihua Huang
Zhichao Lu
Ran Cheng
Cheng He
183
246
0
16 Aug 2021
Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers
B. Dong
Wenhai Wang
Deng-Ping Fan
Jinpeng Li
Huazhu Fu
Ling Shao
ViT, MedIm
766
466
0
16 Aug 2021
PVT: Point-Voxel Transformer for Point Cloud Learning
International Journal of Intelligent Systems (IJIS), 2021
Cheng Zhang
Haocheng Wan
Xinyi Shen
Zizhao Wu
3DPC, ViT
350
111
0
13 Aug 2021
TVT: Transferable Vision Transformer for Unsupervised Domain Adaptation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021
Jinyu Yang
Jingjing Liu
N. Xu
Junzhou Huang
285
162
0
12 Aug 2021
Mobile-Former: Bridging MobileNet and Transformer
Computer Vision and Pattern Recognition (CVPR), 2021
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Xiaoyi Dong
Lu Yuan
Zicheng Liu
ViT
864
627
0
12 Aug 2021
ConvNets vs. Transformers: Whose Visual Representations are More Transferable?
Hong-Yu Zhou
Chi-Ken Lu
Sibei Yang
Yizhou Yu
ViT
187
68
0
11 Aug 2021
RaftMLP: How Much Can Be Done Without Attention and with Less Spatial Locality?
Asian Conference on Computer Vision (ACCV), 2021
Yuki Tatsunami
Masato Taki
216
12
0
09 Aug 2021
TriTransNet: RGB-D Salient Object Detection with a Triplet Transformer Embedding Network
ACM Multimedia (ACM MM), 2021
Zhengyi Liu
Yuan Wang
Zhengzheng Tu
Yun Xiao
Bin Tang
ViT
311
166
0
09 Aug 2021
One-Shot Object Affordance Detection in the Wild
International Journal of Computer Vision (IJCV), 2021
Wei Zhai
Hongcheng Luo
Jing Zhang
Yang Cao
Dacheng Tao
256
52
0
08 Aug 2021
Unifying Nonlocal Blocks for Neural Networks
IEEE International Conference on Computer Vision (ICCV), 2021
Lei Zhu
Qi She
Duo Li
Yanye Lu
Xuejing Kang
Jie Hu
Changhu Wang
168
25
0
05 Aug 2021
Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer
Yifan Xu
Zhijie Zhang
Mengdan Zhang
Kekai Sheng
Ke Li
Weiming Dong
Liqing Zhang
Changsheng Xu
Xing Sun
ViT
374
263
0
03 Aug 2021
I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection
Bo Du
Jian Ye
Jing Zhang
Juhua Liu
Dacheng Tao
VLM
261
45
0
03 Aug 2021
Dynamic Feature Regularized Loss for Weakly Supervised Semantic Segmentation
Bingfeng Zhang
Jimin Xiao
Yao Zhao
310
17
0
03 Aug 2021
S$^2$-MLPv2: Improved Spatial-Shift MLP Architecture for Vision
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
131
64
0
02 Aug 2021
Congested Crowd Instance Localization with Dilated Convolutional Swin Transformer
Junyuan Gao
Maoguo Gong
Xuelong Li
ViT
205
54
0
02 Aug 2021
CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention
International Conference on Learning Representations (ICLR), 2021
Wenxiao Wang
Lulian Yao
Long Chen
Binbin Lin
Deng Cai
Xiaofei He
Wei Liu
508
344
0
31 Jul 2021
Multi-Head Self-Attention via Vision Transformer for Zero-Shot Learning
Faisal Alamri
Anjan Dutta
ViT
120
36
0
30 Jul 2021
DPT: Deformable Patch-based Transformer for Visual Recognition
ACM Multimedia (ACM MM), 2021
Zhiyang Chen
Yousong Zhu
Honghui Dong
Guosheng Hu
Wei Zeng
Jinqiao Wang
Ming Tang
ViT
194
121
0
30 Jul 2021
Real-time Streaming Perception System for Autonomous Driving
ACM Cloud and Autonomic Computing Conference (CAC), 2021
Yongxiang Gu
Qianlei Wang
Xiaolin Qin
121
10
0
30 Jul 2021
Open-World Entity Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Lu Qi
Jason Kuen
Yi Wang
Jiuxiang Gu
Hengshuang Zhao
Zhe Lin
Juil Sock
Jiaya Jia
OCL, SSeg, VLM
335
99
0
29 Jul 2021
Rethinking and Improving Relative Position Encoding for Vision Transformer
IEEE International Conference on Computer Vision (ICCV), 2021
Kan Wu
Houwen Peng
Minghao Chen
Jianlong Fu
Hongyang Chao
ViT
316
404
0
29 Jul 2021
A Unified Efficient Pyramid Transformer for Semantic Segmentation
Fangrui Zhu
Yi Zhu
Li Zhang
Chongruo Wu
Yanwei Fu
Mu Li
ViT
188
33
0
29 Jul 2021
Bridging Gap between Image Pixels and Semantics via Supervision: A Survey
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jiali Duan
C.-C. Jay Kuo
328
9
0
29 Jul 2021
Improving Video Instance Segmentation via Temporal Pyramid Routing
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Xiangtai Li
Hao He
Yibo Yang
Henghui Ding
Kuiyuan Yang
Guangliang Cheng
Yunhai Tong
Jianping Shi
VOS
212
14
0
28 Jul 2021
Contextual Transformer Networks for Visual Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Yehao Li
Ting Yao
Yingwei Pan
Tao Mei
ViT
242
613
0
26 Jul 2021
Query2Label: A Simple Transformer Way to Multi-Label Classification
Shilong Liu
Lei Zhang
Xiao Yang
Hang Su
Jun Zhu
200
238
0
22 Jul 2021
A Deep Learning-based Quality Assessment and Segmentation System with a Large-scale Benchmark Dataset for Optical Coherence Tomographic Angiography Image
Yu-Fang Wang
Yiqing Shen
Meng Yuan
Jing Xu
B. Yang
Chicheng Liu
Wenjia Cai
Weijing Cheng
Wei Wang
173
22
0
22 Jul 2021
CycleMLP: A MLP-like Architecture for Dense Prediction
International Conference on Learning Representations (ICLR), 2021
Shoufa Chen
Enze Xie
Chongjian Ge
Runjian Chen
Ding Liang
Ping Luo
403
253
0
21 Jul 2021
OODformer: Out-Of-Distribution Detection Transformer
British Machine Vision Conference (BMVC), 2021
Rajat Koner
Poulami Sinhamahapatra
Karsten Roscher
Stephan Günnemann
Volker Tresp
ViT
124
42
0
19 Jul 2021
LeViT-UNet: Make Faster Encoders with Transformer for Medical Image Segmentation
Chinese Conference on Pattern Recognition and Computer Vision (CPRCV), 2021
Guoping Xu
Xingrong Wu
Xuan Zhang
Xinwei He
ViT, MedIm
262
270
0
19 Jul 2021
YOLOX: Exceeding YOLO Series in 2021
Zheng Ge
Songtao Liu
Feng Wang
Zeming Li
Jian Sun
ObjD
670
5,361
0
18 Jul 2021
AS-MLP: An Axial Shifted MLP Architecture for Vision
International Conference on Learning Representations (ICLR), 2021
Dongze Lian
Zehao Yu
Xing Sun
Shenghua Gao
274
213
0
18 Jul 2021
Woodscape Fisheye Semantic Segmentation for Autonomous Driving -- CVPR 2021 OmniCV Workshop Challenge
Saravanabalagi Ramachandran
Ganesh Sistu
J. McDonald
S. Yogamani
180
13
0
17 Jul 2021
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Bowen Cheng
Alex Schwing
Alexander Kirillov
VLM, ViT
601
1,876
0
13 Jul 2021
Visual Parser: Representing Part-whole Hierarchies with Transformers
Shuyang Sun
Xiaoyu Yue
S. Bai
Juil Sock
235
28
0
13 Jul 2021
Locally Enhanced Self-Attention: Combining Self-Attention and Convolution as Local and Context Terms
Chenglin Yang
Siyuan Qiao
Adam Kortylewski
Alan Yuille
273
4
0
12 Jul 2021
Visual Transformer with Statistical Test for COVID-19 Classification
Chih-Chung Hsu
Guan-Lin Chen
Mei-Hsuan Wu
ViT, MedIm
165
16
0
12 Jul 2021
TransClaw U-Net: Claw U-Net with Transformers for Medical Image Segmentation
Yao Chang
Menghan Hu
Zhai Guangtao
Xiao-Ping Zhang
MedIm, ViT
209
120
0
12 Jul 2021
Local-to-Global Self-Attention in Vision Transformers
Jinpeng Li
Manwen Liao
Tianran Ouyang
Xiaokang Yang
Ling Shao
ViT
125
35
0
10 Jul 2021
ViTGAN: Training GANs with Vision Transformers
International Conference on Learning Representations (ICLR), 2021
Kwonjoon Lee
Huiwen Chang
Lu Jiang
Han Zhang
Zhuowen Tu
Ce Liu
ViT
353
220
0
09 Jul 2021
Modality specific U-Net variants for biomedical image segmentation: A survey
Artificial Intelligence Review (AIR), 2021
Narinder Singh Punn
Sonali Agarwal
SSeg
379
198
0
09 Jul 2021
Trans4Trans: Efficient Transformer for Transparent Object Segmentation to Help Visually Impaired People Navigate in the Real World
Kailai Li
Kailun Yang
Angela Constantinescu
Kunyu Peng
Karin Muller
Rainer Stiefelhagen
ViT
283
72
0
07 Jul 2021
Integrating Large Circular Kernels into CNNs through Neural Architecture Search
Kun He
Chao Li
Yixiao Yang
Gao Huang
John E. Hopcroft
309
12
0
06 Jul 2021
Long-Short Transformer: Efficient Transformers for Language and Vision
Chen Zhu
Ming-Yu Liu
Chaowei Xiao
Mohammad Shoeybi
Tom Goldstein
Anima Anandkumar
Bryan Catanzaro
ViT, VLM
442
161
0
05 Jul 2021
What Makes for Hierarchical Vision Transformer?
Yuxin Fang
Xinggang Wang
Rui Wu
Wenyu Liu
ViT
153
11
0
05 Jul 2021
SSPNet: Scale Selection Pyramid Network for Tiny Person Detection from UAV Images
Ming Hong
Shuiwang Li
Yuchao Yang
Feiyu Zhu
Qijun Zhao
Li Lu
ObjD
152
124
0
04 Jul 2021