2103.14030
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
IEEE International Conference on Computer Vision (ICCV), 2021
25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
Papers citing
"Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"
50 / 8,525 papers shown
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation
Zhiwei Hao
Jianyuan Guo
Ding Jia
Kai Han
Yehui Tang
Chao Zhang
Dacheng Tao
Yunhe Wang
ViT
442
89
0
03 Jul 2021
1st Place Solutions for UG2+ Challenge 2021 -- (Semi-)supervised Face detection in the low light condition
Pengcheng Wang
Ling Ji
Zhilong Ji
Yuan Gao
Xiao-Chang Liu
CVBM
106
0
0
02 Jul 2021
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
ViT
803
1,244
0
01 Jul 2021
Global Filter Networks for Image Classification
Yongming Rao
Wenliang Zhao
Zheng Zhu
Jiwen Lu
Jie Zhou
ViT
304
611
0
01 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
353
502
0
01 Jul 2021
CBNet: A Composite Backbone Network Architecture for Object Detection
Tingting Liang
Xiao Chu
Yudong Liu
Yongtao Wang
Zhi Tang
Wei Chu
Jingdong Chen
Haibin Ling
ObjD
555
206
0
01 Jul 2021
Simple Training Strategies and Model Scaling for Object Detection
Xianzhi Du
Barret Zoph
Wei-Chih Hung
Nayeon Lee
ObjD
239
50
0
30 Jun 2021
Looking Outside the Window: Wide-Context Transformer for the Semantic Segmentation of High-Resolution Remote Sensing Images
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2021
L. Ding
Dong Lin
Shaofu Lin
Jing Zhang
Xiaojie Cui
Yuebin Wang
Hao Tang
Lorenzo Bruzzone
ViT
547
130
0
29 Jun 2021
Rethinking Token-Mixing MLP for MLP-based Vision Backbone
British Machine Vision Conference (BMVC), 2021
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
197
27
0
28 Jun 2021
Early Convolutions Help Transformers See Better
Neural Information Processing Systems (NeurIPS), 2021
Tete Xiao
Mannat Singh
Eric Mintun
Trevor Darrell
Piotr Dollár
Ross B. Girshick
377
887
0
28 Jun 2021
K-Net: Towards Unified Image Segmentation
Neural Information Processing Systems (NeurIPS), 2021
Wenwei Zhang
Jiangmiao Pang
Kai-xiang Chen
Chen Change Loy
ISeg
334
442
0
28 Jun 2021
R-Drop: Regularized Dropout for Neural Networks
Neural Information Processing Systems (NeurIPS), 2021
Xiaobo Liang
Lijun Wu
Juntao Li
Yue Wang
Qi Meng
Tao Qin
Wei Chen
Hao Fei
Tie-Yan Liu
303
518
0
28 Jun 2021
Can An Image Classifier Suffice For Action Recognition?
International Conference on Learning Representations (ICLR), 2021
Quanfu Fan
Chun-Fu (Richard) Chen
Yikang Shen
ViT
291
38
0
26 Jun 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
Computational Visual Media (CVM), 2021
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
791
2,143
0
25 Jun 2021
ViTAS: Vision Transformer Architecture Search
European Conference on Computer Vision (ECCV), 2021
Xiu Su
Shan You
Jiyang Xie
Mingkai Zheng
Haiwei Yang
Chao Qian
Changshui Zhang
Xiaogang Wang
Chang Xu
ViT
459
56
0
25 Jun 2021
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training
Hongwei Xue
Yupan Huang
Bei Liu
Houwen Peng
Jianlong Fu
Houqiang Li
Jiebo Luo
414
94
0
25 Jun 2021
Video Swin Transformer
Ze Liu
Jia Ning
Yue Cao
Yixuan Wei
Zheng Zhang
Stephen Lin
Han Hu
ViT
495
1,884
0
24 Jun 2021
Exploring Corruption Robustness: Inductive Biases in Vision Transformers and MLP-Mixers
Katelyn Morrison
B. Gilby
Colton Lipchak
Adam Mattioli
Adriana Kovashka
ViT
176
17
0
24 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
424
378
0
24 Jun 2021
Advancing biological super-resolution microscopy through deep learning: a brief review
Tianjie Yang
Yaoru Luo
Wei Ji
Ge Yang
SupR
175
25
0
24 Jun 2021
Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting
Haixu Wu
Jiehui Xu
Jianmin Wang
Mingsheng Long
AI4TS
518
3,779
0
24 Jun 2021
IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers
Bowen Pan
Yikang Shen
Lezhi Li
Zinan Lin
Rogerio Feris
A. Oliva
VLM
ViT
329
191
0
23 Jun 2021
Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding
Shengjie Luo
Shanda Li
Tianle Cai
Di He
Dinglan Peng
Shuxin Zheng
Guolin Ke
Liwei Wang
Tie-Yan Liu
211
56
0
23 Jun 2021
Transformer Meets Convolution: A Bilateral Awareness Network for Semantic Segmentation of Very Fine Resolution Urban Scene Images
Libo Wang
Rui Li
Dongzhi Wang
Chenxi Duan
Teng Wang
Xiaoliang Meng
ViT
259
208
0
23 Jun 2021
Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition
Qibin Hou
Zihang Jiang
Li-xin Yuan
Ming-Ming Cheng
Shuicheng Yan
Jiashi Feng
ViT
MLLM
306
236
0
23 Jun 2021
P2T: Pyramid Pooling Transformer for Scene Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Yu-Huan Wu
Yun-Hai Liu
Xin Zhan
Ming-Ming Cheng
ViT
613
289
0
22 Jun 2021
Tracking Instances as Queries
Shusheng Yang
Yuxin Fang
Xinggang Wang
Yu Li
Ying Shan
Bin Feng
Wenyu Liu
175
11
0
22 Jun 2021
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
Michael S. Ryoo
A. Piergiovanni
Anurag Arnab
Mostafa Dehghani
A. Angelova
ViT
649
155
0
21 Jun 2021
SODA10M: A Large-Scale 2D Self/Semi-Supervised Object Detection Dataset for Autonomous Driving
Jianhua Han
Xiwen Liang
Hang Xu
Kai Chen
Lanqing Hong
...
Chao Ye
Wei Zhang
Zhenguo Li
Xi Liang
Chunjing Xu
224
103
0
21 Jun 2021
More than Encoder: Introducing Transformer Decoder to Upsample
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2021
Yijiang Li
Wentian Cai
Ying Gao
Chengming Li
Xiping Hu
ViT
MedIm
254
75
0
20 Jun 2021
MSN: Efficient Online Mask Selection Network for Video Instance Segmentation
Vidit Goel
Jiachen Li
Shubhika Garg
Harsh Maheshwari
Humphrey Shi
231
9
0
19 Jun 2021
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
345
776
0
18 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
International Conference on Learning Representations (ICLR), 2021
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
306
224
0
17 Jun 2021
XCiT: Cross-Covariance Image Transformers
Neural Information Processing Systems (NeurIPS), 2021
Alaaeldin El-Nouby
Hugo Touvron
Mathilde Caron
Piotr Bojanowski
Matthijs Douze
...
Ivan Laptev
Natalia Neverova
Gabriel Synnaeve
Jakob Verbeek
Edouard Grave
ViT
446
614
0
17 Jun 2021
Long-Short Temporal Contrastive Learning of Video Transformers
Jue Wang
Gedas Bertasius
Du Tran
Lorenzo Torresani
VLM
ViT
348
56
0
17 Jun 2021
End-to-End Semi-Supervised Object Detection with Soft Teacher
Mengde Xu
Zheng Zhang
Han Hu
Jianfeng Wang
Lijuan Wang
Fangyun Wei
X. Bai
Zicheng Liu
350
586
0
16 Jun 2021
Shuffle Transformer with Feature Alignment for Video Face Parsing
Rui Zhang
Yang Han
Zilong Huang
Pei Cheng
Guozhong Luo
Gang Yu
Bin-Bin Fu
CVBM
ViT
181
1
0
16 Jun 2021
Temporal Convolution Networks with Positional Encoding for Evoked Expression Estimation
V. Huynh
Gueesang Lee
Hyung-Jeong Yang
Soohyung Kim
146
4
0
16 Jun 2021
ICDAR 2021 Competition on Components Segmentation Task of Document Photos
C. A. M. L. Junior
R. B. D. N. Junior
B. Bezerra
Alejandro H. Toselli
D. Impedovo
155
2
0
16 Jun 2021
Dynamic Head: Unifying Object Detection Heads with Attentions
Xiyang Dai
Yinpeng Chen
Bin Xiao
Dongdong Chen
Mengchen Liu
Lu Yuan
Lei Zhang
232
803
0
15 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
856
3,424
0
15 Jun 2021
Improved Transformer for High-Resolution GANs
Neural Information Processing Systems (NeurIPS), 2021
Long Zhao
Zizhao Zhang
Ting Chen
Dimitris N. Metaxas
Han Zhang
ViT
352
109
0
14 Jun 2021
S$^2$-MLP: Spatial-Shift MLP Architecture for Vision
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
261
219
0
14 Jun 2021
3rd Place Solution for Short-video Face Parsing Challenge
Xiao Liu
Xiaofei Si
Jiangtao Xie
CVBM
135
0
0
14 Jun 2021
Pre-Trained Models: Past, Present and Future
AI Open (AO), 2021
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
390
995
0
14 Jun 2021
Styleformer: Transformer based Generative Adversarial Networks with Style Vector
Computer Vision and Pattern Recognition (CVPR), 2021
Jeeseung Park
Younggeun Kim
ViT
314
59
0
13 Jun 2021
DS-TransUNet: Dual Swin Transformer U-Net for Medical Image Segmentation
IEEE Transactions on Instrumentation and Measurement (IEEE Trans. Instrum. Meas.), 2021
Ai-Jun Lin
Bingzhi Chen
Jiayu Xu
Zheng Zhang
Guangming Lu
ViT
MedIm
287
820
0
12 Jun 2021
1st Place Solution for YouTubeVOS Challenge 2021: Video Instance Segmentation
Thuy C. Nguyen
Tuan N. Tang
N. Phan
Chuong H. Nguyen
Masayuki Yamazaki
Masao Yamanaka
156
6
0
12 Jun 2021
MlTr: Multi-label Classification with Transformer
IEEE International Conference on Multimedia and Expo (ICME), 2021
Xingyi Cheng
Hezheng Lin
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Nian Shi
Honglin Liu
ViT
176
58
0
11 Jun 2021
Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning
Computer Vision and Pattern Recognition (CVPR), 2021
Liangqiong Qu
Yuyin Zhou
Paul Pu Liang
Yingda Xia
Feifei Wang
Ehsan Adeli
L. Fei-Fei
D. Rubin
FedML
AI4CE
414
216
0
10 Jun 2021