2103.14030
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
Papers citing
"Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"
50 / 1,659 papers shown
Title
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation
Jiaming Zhang
Kailun Yang
Chaoxiang Ma
Simon Reiß
Kunyu Peng
Rainer Stiefelhagen
ViT
16
72
0
02 Mar 2022
A Unified Query-based Paradigm for Point Cloud Understanding
Zetong Yang
Li Jiang
Yanan Sun
Bernt Schiele
Jiaya Jia
3DPC
11
38
0
02 Mar 2022
What Makes Transfer Learning Work For Medical Images: Feature Reuse & Other Factors
Christos Matsoukas
Johan Fredin Haslum
Moein Sorkhei
Magnus P Soderberg
Kevin Smith
VLM
OOD
MedIm
22
84
0
02 Mar 2022
Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation
Zhaozheng Chen
Tan Wang
Xiongwei Wu
Xiansheng Hua
Hanwang Zhang
Qianru Sun
WSOL
VLM
16
141
0
02 Mar 2022
TransDARC: Transformer-based Driver Activity Recognition with Latent Space Feature Calibration
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
ViT
27
32
0
02 Mar 2022
Recent, rapid advancement in visual question answering architecture: a review
V. Kodali
Daniel Berleant
27
9
0
02 Mar 2022
3DCTN: 3D Convolution-Transformer Network for Point Cloud Classification
Dening Lu
Qian Xie
Linlin Xu
Jonathan Li
3DV
16
65
0
02 Mar 2022
Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection
Jing Tan
Yuhong Wang
Gangshan Wu
Limin Wang
39
14
0
01 Mar 2022
Spatio-temporal Vision Transformer for Super-resolution Microscopy
Charles N Christensen
M. Lu
Edward N. Ward
Pietro Liò
C. Kaminski
11
8
0
28 Feb 2022
SUNet: Swin Transformer UNet for Image Denoising
Chi-Mao Fan
Tsung-Jung Liu
Kuan-Hsien Liu
ViT
27
111
0
28 Feb 2022
Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance
Zhuoning Yuan
Yuexin Wu
Zi-Hao Qiu
Xianzhi Du
Lijun Zhang
Denny Zhou
Tianbao Yang
22
26
0
24 Feb 2022
Factorizer: A Scalable Interpretable Approach to Context Modeling for Medical Image Segmentation
Pooya Ashtari
Diana Sima
L. De Lathauwer
D. Sappey-Marinier
F. Maes
Sabine Van Huffel
ViT
MedIm
15
35
0
24 Feb 2022
Delving Deep into One-Shot Skeleton-based Action Recognition with Diverse Occlusions
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
ViT
19
27
0
23 Feb 2022
cosFormer: Rethinking Softmax in Attention
Zhen Qin
Weixuan Sun
Hui Deng
Dongxu Li
Yunshen Wei
Baohong Lv
Junjie Yan
Lingpeng Kong
Yiran Zhong
21
210
0
17 Feb 2022
ScoreNet: Learning Non-Uniform Attention and Augmentation for Transformer-Based Histopathological Image Classification
Thomas Stegmüller
Behzad Bozorgtabar
A. Spahr
Jean-Philippe Thiran
ViT
MedIm
19
42
0
15 Feb 2022
CATs++: Boosting Cost Aggregation with Convolutions and Transformers
Seokju Cho
Sunghwan Hong
Seung Wook Kim
ViT
19
34
0
14 Feb 2022
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
25
462
0
14 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
8
88
0
31 Jan 2022
You Only Cut Once: Boosting Data Augmentation with a Single Cut
Junlin Han
Pengfei Fang
Weihong Li
Jie Hong
M. Armin
Ian Reid
L. Petersson
Hongdong Li
25
27
0
28 Jan 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
142
360
0
24 Jan 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
20
101
0
16 Jan 2022
Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet?
Nenad Tomašev
Ioana Bica
Brian McWilliams
Lars Buesing
Razvan Pascanu
Charles Blundell
Jovana Mitrović
SSL
58
80
0
13 Jan 2022
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
Kunchang Li
Yali Wang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
31
235
0
12 Jan 2022
Knee Cartilage Defect Assessment by Graph Representation and Surface Convolution
Zixu Zhuang
Liping Si
Sheng Wang
Kai Xuan
Xi Ouyang
...
Zhong Xue
Lichi Zhang
D. Shen
Weiwu Yao
Qian Wang
27
5
0
12 Jan 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chao-Yuan Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
40
4,920
0
10 Jan 2022
QuadTree Attention for Vision Transformers
Shitao Tang
Jiahui Zhang
Siyu Zhu
Ping Tan
ViT
157
154
0
08 Jan 2022
Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images
Ali Hatamizadeh
V. Nath
Yucheng Tang
Dong Yang
H. Roth
Daguang Xu
ViT
MedIm
17
1,015
0
04 Jan 2022
PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture
Kai Han
Jianyuan Guo
Yehui Tang
Yunhe Wang
ViT
16
22
0
04 Jan 2022
RFormer: Transformer-based Generative Adversarial Network for Real Fundus Image Restoration on A New Clinical Benchmark
Zhuo Deng
Yuanhao Cai
Lu Chen
Zheng Gong
Qiqi Bao
Xue Yao
D. Fang
Shaochong Zhang
Lan Ma
ViT
MedIm
18
52
0
03 Jan 2022
Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention
Sitong Wu
Tianyi Wu
Haoru Tan
G. Guo
ViT
23
70
0
28 Dec 2021
Learning Generative Vision Transformer with Energy-Based Latent Space for Saliency Prediction
Jing Zhang
Jianwen Xie
Nick Barnes
Ping Li
ViT
35
90
0
27 Dec 2021
Learning Cross-Scale Weighted Prediction for Efficient Neural Video Compression
Zongyu Guo
Runsen Feng
Zhizheng Zhang
Xin Jin
Zhibo Chen
19
14
0
26 Dec 2021
Raw Produce Quality Detection with Shifted Window Self-Attention
Oh Joon Kwon
Byungsoo Kim
Youngduck Choi
ViT
11
0
0
24 Dec 2021
SeMask: Semantically Masked Transformers for Semantic Segmentation
Jitesh Jain
Anukriti Singh
Nikita Orlov
Zilong Huang
Jiachen Li
Steven Walton
Humphrey Shi
ViT
24
92
0
23 Dec 2021
iSegFormer: Interactive Segmentation via Transformers with Application to 3D Knee MR Images
Qin Liu
Zhenlin Xu
Yining Jiao
Marc Niethammer
ViT
MedIm
34
35
0
21 Dec 2021
RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality
Xiaohan Ding
Honghao Chen
X. Zhang
Jungong Han
Guiguang Ding
14
68
0
21 Dec 2021
3D Instance Segmentation of MVS Buildings
Jiazhou Chen
Yanghui Xu
Shufang Lu
Ronghua Liang
Liangliang Nan
ISeg
3DV
10
23
0
18 Dec 2021
A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation
Wuyang Chen
Xianzhi Du
Fan Yang
Lucas Beyer
Xiaohua Zhai
...
Huizhong Chen
Jing Li
Xiaodan Song
Zhangyang Wang
Denny Zhou
ViT
21
20
0
17 Dec 2021
Efficient Visual Tracking with Exemplar Transformers
Philippe Blatter
Menelaos Kanakis
Martin Danelljan
Luc Van Gool
ViT
8
79
0
17 Dec 2021
Towards End-to-End Image Compression and Analysis with Transformers
Yuanchao Bai
Xu Yang
Xianming Liu
Junjun Jiang
Yaowei Wang
Xiangyang Ji
Wen Gao
ViT
18
51
0
17 Dec 2021
HODOR: High-level Object Descriptors for Object Re-segmentation in Video Learned from Static Images
A. Athar
Jonathon Luiten
Alexander Hermans
Deva Ramanan
Bastian Leibe
VOS
22
25
0
16 Dec 2021
Co-training Transformer with Videos and Images Improves Action Recognition
Bowen Zhang
Jiahui Yu
Christopher Fifty
Wei Han
Andrew M. Dai
Ruoming Pang
Fei Sha
ViT
20
54
0
14 Dec 2021
DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
Yuxuan Liang
Pan Zhou
Roger Zimmermann
Shuicheng Yan
ViT
21
21
0
09 Dec 2021
Recurrent Glimpse-based Decoder for Detection with Transformer
Zhe Chen
Jing Zhang
Dacheng Tao
ViT
11
27
0
09 Dec 2021
Improving Image Restoration by Revisiting Global Information Aggregation
Xiaojie Chu
Liangyu Chen
Chengpeng Chen
Xin Lu
17
86
0
08 Dec 2021
Visual Object Tracking with Discriminative Filters and Siamese Networks: A Survey and Outlook
S. Javed
Martin Danelljan
F. Khan
Muhammad Haris Khan
M. Felsberg
Jiří Matas
32
126
0
06 Dec 2021
Masked-attention Mask Transformer for Universal Image Segmentation
Bowen Cheng
Ishan Misra
A. Schwing
Alexander Kirillov
Rohit Girdhar
ISeg
17
2,245
0
02 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chao-Yuan Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
46
671
0
02 Dec 2021
Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks
Xizhou Zhu
Jinguo Zhu
Hao Li
Xiaoshi Wu
Xiaogang Wang
Hongsheng Li
Xiaohua Wang
Jifeng Dai
36
126
0
02 Dec 2021
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Yongming Rao
Wenliang Zhao
Guangyi Chen
Yansong Tang
Zheng Zhu
Guan Huang
Jie Zhou
Jiwen Lu
VLM
CLIP
34
546
0
02 Dec 2021