Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2103.14030
Cited By
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng-Wei Zhang
Stephen Lin
B. Guo
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"
50 / 2,048 papers shown
Title
Representation Compensation Networks for Continual Semantic Segmentation
Chang-Bin Zhang
Jianqiang Xiao
Xialei Liu
Ying-Cong Chen
Mingg-Ming Cheng
SSeg
CLL
16
93
0
10 Mar 2022
Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking
Boyu Chen
Peixia Li
Lei Bai
Leixian Qiao
Qiuhong Shen
Bo-wen Li
Weihao Gan
Wei Wu
Wanli Ouyang
ViT
VOT
20
182
0
10 Mar 2022
GrainSpace: A Large-scale Dataset for Fine-grained and Domain-adaptive Recognition of Cereal Grains
Lei Fan
Yiwen Ding
Dongdong Fan
Donglin Di
M. Pagnucco
Yang Song
AI4TS
14
19
0
10 Mar 2022
Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability
Ruifei He
Shuyang Sun
Jihan Yang
Song Bai
Xiaojuan Qi
19
35
0
10 Mar 2022
Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice
Peihao Wang
Wenqing Zheng
Tianlong Chen
Zhangyang Wang
ViT
9
127
0
09 Mar 2022
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers
Jiaming Zhang
Huayao Liu
Kailun Yang
Xinxin Hu
Ruiping Liu
Rainer Stiefelhagen
ViT
21
295
0
09 Mar 2022
Evaluation of YOLO Models with Sliced Inference for Small Object Detection
Muhammed Can Keles
Batuhan Salmanoglu
M. Güzel
Baran Gursoy
Gazi Erkan Bostancı
ObjD
15
11
0
09 Mar 2022
Region-Aware Face Swapping
Chao Xu
Jiangning Zhang
Miao Hua
Qian He
Zili Yi
Yong Liu
CVBM
17
48
0
09 Mar 2022
A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation
Yutong Chen
Fangyun Wei
Xiao Sun
Zhirong Wu
Stephen Lin
SLR
20
97
0
08 Mar 2022
RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation
Hao He
Yuhui Yuan
Xiangyu Yue
Han Hu
VOS
VLM
19
13
0
08 Mar 2022
DuMLP-Pin: A Dual-MLP-dot-product Permutation-invariant Network for Set Feature Extraction
Jiajun Fei
Ziyu Zhu
Wenlei Liu
Zhidong Deng
Mingyang Li
Huanjun Deng
Shuo Zhang
3DPC
8
6
0
08 Mar 2022
Generating 3D Bio-Printable Patches Using Wound Segmentation and Reconstruction to Treat Diabetic Foot Ulcers
H. Chae
Seunghwan Lee
H. Son
Seungjae Han
T. Lim
MedIm
20
3
0
08 Mar 2022
SpeechFormer: A Hierarchical Efficient Framework Incorporating the Characteristics of Speech
Weidong Chen
Xiaofen Xing
Xiangmin Xu
Jianxin Pang
Lan Du
17
34
0
08 Mar 2022
CrowdFormer: Weakly-supervised Crowd counting with Improved Generalizability
Siddharth Singh Savner
Vivek Kanhangad
ViT
19
31
0
07 Mar 2022
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
Hao Zhang
Feng Li
Shilong Liu
Lei Zhang
Hang Su
Jun Zhu
L. Ni
H. Shum
ViT
34
1,367
0
07 Mar 2022
CNN self-attention voice activity detector
Amit Sofer
Shlomo E. Chazan
10
8
0
06 Mar 2022
PanFormer: a Transformer Based Model for Pan-sharpening
Huanyu Zhou
Qingjie Liu
Yunhong Wang
ViT
20
42
0
06 Mar 2022
MetaFormer: A Unified Meta Framework for Fine-Grained Recognition
Qishuai Diao
Yi-Xin Jiang
Bin Wen
Jianxiang Sun
Zehuan Yuan
22
60
0
05 Mar 2022
Boosting Crowd Counting via Multifaceted Attention
Hui Lin
Zhiheng Ma
Rongrong Ji
Yaowei Wang
Xiaopeng Hong
23
145
0
05 Mar 2022
DiT: Self-supervised Pre-training for Document Image Transformer
Junlong Li
Yiheng Xu
Tengchao Lv
Lei Cui
Chaoxi Zhang
Furu Wei
ViT
VLM
22
159
0
04 Mar 2022
F2DNet: Fast Focal Detection Network for Pedestrian Detection
Abdul Hannan Khan
Mohsin Munir
L. V. Elst
Andreas Dengel
ObjD
14
24
0
04 Mar 2022
LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network
Zhigang Jiang
Zhongzheng Xiang
Jinhua Xu
Mingbi Zhao
ViT
3DV
11
34
0
03 Mar 2022
Correlation-Aware Deep Tracking
Fei Xie
Chunyu Wang
Guangting Wang
Yue Cao
Wankou Yang
Wenjun Zeng
VOT
19
118
0
03 Mar 2022
Recent Advances in Vision Transformer: A Survey and Outlook of Recent Work
Khawar Islam
ViT
24
44
0
03 Mar 2022
NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation
Weihao Yuan
Xiaodong Gu
Zuozhuo Dai
Siyu Zhu
Ping Tan
23
172
0
03 Mar 2022
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation
Jiaming Zhang
Kailun Yang
Chaoxiang Ma
Simon Reiß
Kunyu Peng
Rainer Stiefelhagen
ViT
22
72
0
02 Mar 2022
A Unified Query-based Paradigm for Point Cloud Understanding
Zetong Yang
Li Jiang
Yanan Sun
Bernt Schiele
Jiaya Jia
3DPC
16
38
0
02 Mar 2022
What Makes Transfer Learning Work For Medical Images: Feature Reuse & Other Factors
Christos Matsoukas
Johan Fredin Haslum
Moein Sorkhei
Magnus P Soderberg
Kevin Smith
VLM
OOD
MedIm
22
84
0
02 Mar 2022
Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation
Zhaozheng Chen
Tan Wang
Xiongwei Wu
Xiansheng Hua
Hanwang Zhang
Qianru Sun
WSOL
VLM
18
142
0
02 Mar 2022
TransDARC: Transformer-based Driver Activity Recognition with Latent Space Feature Calibration
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
ViT
27
32
0
02 Mar 2022
Recent, rapid advancement in visual question answering architecture: a review
V. Kodali
Daniel Berleant
27
9
0
02 Mar 2022
3DCTN: 3D Convolution-Transformer Network for Point Cloud Classification
Dening Lu
Qian Xie
Linlin Xu
Jonathan Li
3DV
16
67
0
02 Mar 2022
Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection
Jing Tan
Yuhong Wang
Gangshan Wu
Limin Wang
39
14
0
01 Mar 2022
Spatio-temporal Vision Transformer for Super-resolution Microscopy
Charles N Christensen
M. Lu
Edward N. Ward
Pietro Lio'
C. Kaminski
19
8
0
28 Feb 2022
SUNet: Swin Transformer UNet for Image Denoising
Chi-Mao Fan
Tsung-Jung Liu
Kuan-Hsien Liu
ViT
27
111
0
28 Feb 2022
Real-World Blind Super-Resolution via Feature Matching with Implicit High-Resolution Priors
Chaofeng Chen
Xinyu Shi
Yipeng Qin
Xiaoming Li
Xiaoguang Han
Taojiannan Yang
Shihui Guo
17
113
0
26 Feb 2022
Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance
Zhuoning Yuan
Yuexin Wu
Zi-qi Qiu
Xianzhi Du
Lijun Zhang
Denny Zhou
Tianbao Yang
25
26
0
24 Feb 2022
Factorizer: A Scalable Interpretable Approach to Context Modeling for Medical Image Segmentation
Pooya Ashtari
Diana Sima
L. De Lathauwer
D. Sappey-Marinier
F. Maes
Sabine Van Huffel
ViT
MedIm
17
35
0
24 Feb 2022
Delving Deep into One-Shot Skeleton-based Action Recognition with Diverse Occlusions
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
ViT
19
28
0
23 Feb 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
X. Wang
ViT
VLM
180
499
0
22 Feb 2022
cosFormer: Rethinking Softmax in Attention
Zhen Qin
Weixuan Sun
Huicai Deng
Dongxu Li
Yunshen Wei
Baohong Lv
Junjie Yan
Lingpeng Kong
Yiran Zhong
21
211
0
17 Feb 2022
ScoreNet: Learning Non-Uniform Attention and Augmentation for Transformer-Based Histopathological Image Classification
Thomas Stegmüller
Behzad Bozorgtabar
A. Spahr
Jean-Philippe Thiran
ViT
MedIm
19
42
0
15 Feb 2022
CATs++: Boosting Cost Aggregation with Convolutions and Transformers
Seokju Cho
Sunghwan Hong
Seung Wook Kim
ViT
19
34
0
14 Feb 2022
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Jiaxi Gu
Xiaojun Meng
Guansong Lu
Lu Hou
Minzhe Niu
...
Runhu Huang
Wei Zhang
Xingda Jiang
Chunjing Xu
Hang Xu
VLM
32
86
0
14 Feb 2022
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
25
463
0
14 Feb 2022
Particle Transformer for Jet Tagging
H. Qu
Congqiao Li
Sitian Qian
ViT
MedIm
17
96
0
08 Feb 2022
GLPanoDepth: Global-to-Local Panoramic Depth Estimation
Jia-Chi Bai
Shuichang Lai
Haoyu Qin
Jie Guo
Yanwen Guo
ViT
MDE
56
21
0
06 Feb 2022
Webly Supervised Concept Expansion for General Purpose Vision Models
Amita Kamath
Christopher Clark
Tanmay Gupta
Eric Kolve
Derek Hoiem
Aniruddha Kembhavi
VLM
19
54
0
04 Feb 2022
Image-to-Image MLP-mixer for Image Reconstruction
Youssef Mansour
Kang Lin
Reinhard Heckel
SupR
20
14
0
04 Feb 2022
Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations
Amin Ghiasi
Hamid Kazemi
Steven Reich
Chen Zhu
Micah Goldblum
Tom Goldstein
29
15
0
31 Jan 2022
Previous
1
2
3
...
36
37
38
39
40
41
Next