ResearchTrend.AI
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows


25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng-Wei Zhang
Stephen Lin
B. Guo
    ViT

Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

50 / 2,048 papers shown
Title
Siamese Transition Masked Autoencoders as Uniform Unsupervised Visual Anomaly Detector
Haiming Yao
Xue Wang
Wenyong Yu
15
9
0
01 Nov 2022
Training Vision-Language Models with Less Bimodal Supervision
Elad Segal
Ben Bogin
Jonathan Berant
VLM
19
2
0
01 Nov 2022
FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition
Xingcheng Song
Di Wu
Binbin Zhang
Zhiyong Wu
Wenpeng Li
...
Peng Zhang
Zhendong Peng
Fuping Pan
Changbao Zhu
Zhongqin Wu
19
2
0
31 Oct 2022
Attention Swin U-Net: Cross-Contextual Attention Mechanism for Skin Lesion Segmentation
Ehsan Khodapanah Aghdam
Reza Azad
Maral Zarvani
Dorit Merhof
ViT
SSeg
MedIm
26
47
0
30 Oct 2022
Time-rEversed diffusioN tEnsor Transformer: A new TENET of Few-Shot Object Detection
Shan Zhang
Naila Murray
Lei Wang
Piotr Koniusz
ViT
27
16
0
30 Oct 2022
Interpretable CNN-Multilevel Attention Transformer for Rapid Recognition of Pneumonia from Chest X-Ray Images
Shengchao Chen
Sufen Ren
Guanjun Wang
Mengxing Huang
Chenyang Xue
ViT
MedIm
47
16
0
29 Oct 2022
A Survey on Causal Representation Learning and Future Work for Medical Image Analysis
Chang-Tien Lu
OOD
BDL
CML
MedIm
24
0
0
28 Oct 2022
Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images
Yan Zhang
Xiyuan Gao
Qingyan Duan
Jiaxu Leng
Xiao Pu
Xinbo Gao
ViT
16
1
0
28 Oct 2022
Grafting Vision Transformers
Jong Sung Park
Kumara Kahatapitiya
Donghyun Kim
Shivchander Sudalairaj
Quanfu Fan
Michael S. Ryoo
ViT
21
2
0
28 Oct 2022
Spatio-Temporal Hybrid Fusion of CAE and SWIn Transformers for Lung Cancer Malignancy Prediction
Sadaf Khademi
Shahin Heidarian
Parnian Afshar
F. Naderkhani
A. Oikonomou
Konstantinos Plataniotis
Arash Mohammadi
ViT
MedIm
17
7
0
27 Oct 2022
Deep Learning Object Detection Approaches to Signal Identification
Luke Wood
K. Anderson
Peter Gerstoft
Richard Bell
Raghab Subbaraman
Dinesh Bharadia
11
2
0
27 Oct 2022
Masked Vision-Language Transformer in Fashion
Ge-Peng Ji
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Christos Sakaridis
Luc Van Gool
17
25
0
27 Oct 2022
Synthetic Tumors Make AI Segment Tumors Better
Qixing Hu
Junfei Xiao
Yixiong Chen
Shuwen Sun
Jieneng Chen
Alan Yuille
Zongwei Zhou
MedIm
25
11
0
26 Oct 2022
M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design
Hanxue Liang
Zhiwen Fan
Rishov Sarkar
Ziyu Jiang
Tianlong Chen
Kai Zou
Yu Cheng
Cong Hao
Zhangyang Wang
MoE
24
79
0
26 Oct 2022
SemFormer: Semantic Guided Activation Transformer for Weakly Supervised Semantic Segmentation
Junliang Chen
Xiaodong Zhao
Cheng Luo
Linlin Shen
ViT
12
3
0
26 Oct 2022
Automatic Diagnosis of Myocarditis Disease in Cardiac MRI Modality using Deep Transformers and Explainable Artificial Intelligence
M. Jafari
A. Shoeibi
Navid Ghassemi
Jónathan Heras
Saiguang Ling
...
Shuihua Wang
R. Alizadehsani
Juan M Gorriz
U. Acharya
Hamid Alinejad-Rokny
MedIm
11
10
0
26 Oct 2022
TPFNet: A Novel Text In-painting Transformer for Text Removal
Onkar Susladkar
Dhruv Makwana
Gayatri S Deshmukh
Sparsh Mittal
R. S. Teja
Rekha Singhal
ViT
6
3
0
26 Oct 2022
Adversarially Robust Medical Classification via Attentive Convolutional Neural Networks
I. Wasserman
OOD
MedIm
AAML
16
0
0
26 Oct 2022
Explicitly Increasing Input Information Density for Vision Transformers on Small Datasets
Xiangyu Chen
Ying Qin
Wenju Xu
A. Bur
Cuncong Zhong
Guanghui Wang
ViT
33
3
0
25 Oct 2022
Pointly-Supervised Panoptic Segmentation
Junsong Fan
Zhaoxiang Zhang
T. Tan
22
23
0
25 Oct 2022
End-to-end Transformer for Compressed Video Quality Enhancement
Li Yu
Wenshuai Chang
Shiyu Wu
M. Gabbouj
ViT
16
8
0
25 Oct 2022
MetaFormer Baselines for Vision
Weihao Yu
Chenyang Si
Pan Zhou
Mi Luo
Yichen Zhou
Jiashi Feng
Shuicheng Yan
Xinchao Wang
MoE
23
156
0
24 Oct 2022
Deep Model Reassembly
Xingyi Yang
Zhou Daquan
Songhua Liu
Jingwen Ye
Xinchao Wang
MoMe
20
120
0
24 Oct 2022
mm-Wave Radar Hand Shape Classification Using Deformable Transformers
Athma Narayanan
Asma Beevi K. T.
Haoyang Wu
Jingyi Ma
W. Huang
8
2
0
24 Oct 2022
Gallery Filter Network for Person Search
Lucas Jaffe
A. Zakhor
8
12
0
24 Oct 2022
Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification
Junfei Xiao
Yutong Bai
Alan Yuille
Zongwei Zhou
MedIm
ViT
30
59
0
23 Oct 2022
LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context Propagation in Transformers
Zhuo Huang
Zhiyou Zhao
Banghuai Li
Jungong Han
3DPC
ViT
23
55
0
23 Oct 2022
S2WAT: Image Style Transfer via Hierarchical Vision Transformer using Strips Window Attention
Chi Zhang
Xiaogang Xu
Lei Wang
Zaiyan Dai
Jun Yang
ViT
22
23
0
22 Oct 2022
High-Fidelity Visual Structural Inspections through Transformers and Learnable Resizers
Kareem A. Eltouny
S. Sajedi
Xiao Liang
6
2
0
21 Oct 2022
Unsupervised Multi-object Segmentation by Predicting Probable Motion Patterns
Laurynas Karazija
Subhabrata Choudhury
Iro Laina
Christian Rupprecht
Andrea Vedaldi
OCL
98
20
0
21 Oct 2022
Face Pyramid Vision Transformer
Khawar Islam
M. Zaheer
Arif Mahmood
ViT
CVBM
19
4
0
21 Oct 2022
LiteVL: Efficient Video-Language Learning with Enhanced Spatial-Temporal Modeling
Dongsheng Chen
Chaofan Tao
Lu Hou
Lifeng Shang
Xin Jiang
Qun Liu
VLM
25
18
0
21 Oct 2022
Boosting vision transformers for image retrieval
Chull Hwan Song
Jooyoung Yoon
Shunghyun Choi
Yannis Avrithis
ViT
22
31
0
21 Oct 2022
Play It Back: Iterative Attention for Audio Recognition
Alexandros Stergiou
Dima Damen
19
4
0
20 Oct 2022
Single Image Super-Resolution Using Lightweight Networks Based on Swin Transformer
Bolong Zhang
Juan Chen
Q. Wen
ViT
25
1
0
20 Oct 2022
Emerging Threats in Deep Learning-Based Autonomous Driving: A Comprehensive Survey
Huiyun Cao
Wenlong Zou
Yinkun Wang
Ting Song
Mengjun Liu
AAML
35
4
0
19 Oct 2022
Using deep convolutional neural networks to classify poisonous and edible mushrooms found in China
Baiming Zhang
Ying Zhao
Zhixiang Li
17
5
0
19 Oct 2022
MedCLIP: Contrastive Learning from Unpaired Medical Images and Text
Zifeng Wang
Zhenbang Wu
Dinesh Agarwal
Jimeng Sun
CLIP
VLM
MedIm
26
394
0
18 Oct 2022
Sequence and Circle: Exploring the Relationship Between Patches
Zhengyang Yu
Jochen Triesch
ViT
17
0
0
18 Oct 2022
1st Place Solutions for the UVO Challenge 2022
Jiajun Zhang
Boyu Chen
Zhilong Ji
Jinfeng Bai
Zonghai Hu
12
1
0
18 Oct 2022
Transfer-learning for video classification: Video Swin Transformer on multiple domains
Daniel de Oliveira
D. Matos
ViT
24
0
0
18 Oct 2022
ITSRN++: Stronger and Better Implicit Transformer Network for Continuous Screen Content Image Super-Resolution
Sheng Shen
Huanjing Yue
Jingyu Yang
Kun Li
SupR
18
3
0
17 Oct 2022
Forecasting Human Trajectory from Scene History
Mancheng Meng
Ziyan Wu
Terrence Chen
Xiran Cai
X. Zhou
Fan Yang
Dinggang Shen
20
22
0
17 Oct 2022
Learning Self-Regularized Adversarial Views for Self-Supervised Vision Transformers
Tao Tang
Changlin Li
Guangrun Wang
Kaicheng Yu
Xiaojun Chang
Xiaodan Liang
ViT
16
1
0
16 Oct 2022
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang
Shuyang Jiang
Jiangtao Feng
Lin Zheng
Lingpeng Kong
3DV
39
9
0
14 Oct 2022
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers
Hyeong Kyu Choi
Joonmyung Choi
Hyunwoo J. Kim
ViT
21
35
0
14 Oct 2022
When Adversarial Training Meets Vision Transformers: Recipes from Training to Architecture
Yi Mo
Dongxian Wu
Yifei Wang
Yiwen Guo
Yisen Wang
ViT
27
52
0
14 Oct 2022
SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds
Pei Sun
Mingxing Tan
Weiyue Wang
Chenxi Liu
Fei Xia
Zhaoqi Leng
Drago Anguelov
ViT
16
114
0
13 Oct 2022
CROWDLAB: Supervised learning to infer consensus labels and quality scores for data with multiple annotators
Hui Wen Goh
Ulyana Tkachenko
Jonas W. Mueller
13
10
0
13 Oct 2022
S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces
Eric N. D. Nguyen
Karan Goel
Albert Gu
Gordon W. Downs
Preey Shah
Tri Dao
S. Baccus
Christopher Ré
VLM
22
37
0
12 Oct 2022