arXiv:2103.14030
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"
50 / 2,186 papers shown
Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?
Boris Knyazev
Doha Hwang
Simon Lacoste-Julien
AI4CE
24
17
0
07 Mar 2023
iBall: Augmenting Basketball Videos with Gaze-moderated Embedded Visualizations
Zhutian Chen
Qisen Yang
Jiarui Shan
Tica Lin
Johanna Beyer
Haijun Xia
Hanspeter Pfister
19
28
0
06 Mar 2023
DwinFormer: Dual Window Transformers for End-to-End Monocular Depth Estimation
Md Awsafur Rahman
S. Fattah
ViT
MDE
30
4
0
06 Mar 2023
CTG-Net: An Efficient Cascaded Framework Driven by Terminal Guidance Mechanism for Dilated Pancreatic Duct Segmentation
Liwen Zou
Zhen-zhai Cai
Y. Qiu
Luying Gui
L. Mao
Xiaoping Yang
MedIm
19
6
0
06 Mar 2023
Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent
Xiaonan Nie
Yi Liu
Fangcheng Fu
J. Xue
Dian Jiao
Xupeng Miao
Yangyu Tao
Bin Cui
MoE
19
16
0
06 Mar 2023
Training-Free Acceleration of ViTs with Delayed Spatial Merging
J. Heo
Seyedarmin Azizi
A. Fayyazi
Massoud Pedram
36
3
0
04 Mar 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
ObjD
VLM
MDE
158
214
0
03 Mar 2023
Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
Renrui Zhang
Xiangfei Hu
Bohao Li
Siyuan Huang
Hanqiu Deng
Hongsheng Li
Yu Qiao
Peng Gao
VLM
MLLM
30
170
0
03 Mar 2023
Depth-based 6DoF Object Pose Estimation using Swin Transformer
Zhujun Li
I. Stamos
ViT
22
11
0
03 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
Paria Mehrani
John K. Tsotsos
23
24
0
02 Mar 2023
Capturing the motion of every joint: 3D human pose and shape estimation with independent tokens
Sen Yang
Wen Heng
Gang Liu
Guozhong Luo
Wankou Yang
Gang Yu
3DH
ViT
18
11
0
01 Mar 2023
Applying Plain Transformers to Real-World Point Clouds
Lanxiao Li
M. Heizmann
3DPC
ViT
20
3
0
28 Feb 2023
A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking
Chang-Shu Liu
Yinpeng Dong
Wenzhao Xiang
X. Yang
Hang Su
Junyi Zhu
YueFeng Chen
Yuan He
H. Xue
Shibao Zheng
OOD
VLM
AAML
17
72
0
28 Feb 2023
Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Label-Efficient Representations
Ziyu Jiang
Yinpeng Chen
Mengchen Liu
Dongdong Chen
Xiyang Dai
Lu Yuan
Zicheng Liu
Zhangyang Wang
SSL
VLM
CLIP
30
16
0
27 Feb 2023
SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing
Weidong Chen
Xiaofen Xing
Xiangmin Xu
Jianxin Pang
Lan Du
30
38
0
27 Feb 2023
Can we avoid Double Descent in Deep Neural Networks?
Victor Quétu
Enzo Tartaglione
AI4CE
20
3
0
26 Feb 2023
ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth
S. Bhat
R. Birkl
Diana Wofk
Peter Wonka
Matthias Müller
VLM
MDE
48
483
0
23 Feb 2023
Patch Network for medical image Segmentation
Weihu Song
Heng Yu
Jianhua Wu
MedIm
SSeg
11
0
0
23 Feb 2023
Human MotionFormer: Transferring Human Motions with Vision Transformers
Hongyu Liu
Xintong Han
Chengbin Jin
Lihui Qian
Huawei Wei
...
Faqiang Wang
Haoye Dong
Yibing Song
Jia Xu
Qifeng Chen
11
10
0
22 Feb 2023
Connecting Vision and Language with Video Localized Narratives
P. Voigtlaender
Soravit Changpinyo
Jordi Pont-Tuset
Radu Soricut
V. Ferrari
VGen
31
21
0
22 Feb 2023
KS-DETR: Knowledge Sharing in Attention Learning for Detection Transformer
Kaikai Zhao
Norimichi Ukita
MU
29
1
0
22 Feb 2023
A residual dense vision transformer for medical image super-resolution with segmentation-based perceptual loss fine-tuning
Jin Zhu
Guang Yang
Pietro Liò
ViT
MedIm
24
5
0
22 Feb 2023
Device Tuning for Multi-Task Large Model
Penghao Jiang
Xuanchen Hou
Y. Zhou
11
0
0
21 Feb 2023
LIT-Former: Linking In-plane and Through-plane Transformers for Simultaneous CT Image Denoising and Deblurring
Zhihao Chen
Chuang Niu
Qi Gao
Ge Wang
Hongming Shan
MedIm
ViT
3DV
25
20
0
21 Feb 2023
Soft Error Reliability Analysis of Vision Transformers
Xing-xiong Xue
Cheng Liu
Ying Wang
Bing Yang
Tao Luo
L. Zhang
Huawei Li
Xiaowei Li
34
14
0
21 Feb 2023
Oriented Object Detection in Optical Remote Sensing Images using Deep Learning: A Survey
Kunlin Wang
Zi Wang
Zhang Li
Ang Su
Xichao Teng
Minhao Liu
Qifeng Yu
ObjD
81
8
0
21 Feb 2023
Unsupervised Learning on a DIET: Datum IndEx as Target Free of Self-Supervision, Reconstruction, Projector Head
Randall Balestriero
38
3
0
20 Feb 2023
STB-VMM: Swin Transformer Based Video Motion Magnification
Ricard Lado-Roigé
M. A. Pérez
16
13
0
20 Feb 2023
StreamingFlow: Streaming Occupancy Forecasting with Asynchronous Multi-modal Data Streams via Neural Ordinary Differential Equation
Yining Shi
Kun Jiang
Ke Wang
Jiusi Li
Yunlong Wang
Mengmeng Yang
Diange Yang
AI4TS
30
2
0
19 Feb 2023
MedViT: A Robust Vision Transformer for Generalized Medical Image Classification
Omid Nejati Manzari
Hamid Ahmadabadi
Hossein Kashiani
S. B. Shokouhi
Ahmad Ayatollahi
ViT
MedIm
21
176
0
19 Feb 2023
Hyneter: Hybrid Network Transformer for Object Detection
Dong Chen
Duoqian Miao
Xuepeng Zhao
ViT
27
3
0
18 Feb 2023
Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer
N. H. Phong
B. Ribeiro
27
15
0
17 Feb 2023
CovidExpert: A Triplet Siamese Neural Network framework for the detection of COVID-19
Tareque Rahman Ornob
G. Roy
Enamul Hassan
19
12
0
17 Feb 2023
Less is More: The Influence of Pruning on the Explainability of CNNs
David Weber
F. Merkle
Pascal Schöttle
Stephan Schlögl
Martin Nocker
FAtt
29
1
0
17 Feb 2023
Efficiency 360: Efficient Vision Transformers
Badri N. Patro
Vijay Srinivas Agneeswaran
21
6
0
16 Feb 2023
3M3D: Multi-view, Multi-path, Multi-representation for 3D Object Detection
Jong Sung Park
Apoorv Singh
Varun Bankiti
3DPC
23
7
0
16 Feb 2023
Hierarchical Cross-modal Transformer for RGB-D Salient Object Detection
Hao Chen
Feihong Shen
ViT
29
0
0
16 Feb 2023
Offline-to-Online Knowledge Distillation for Video Instance Segmentation
H. Kim
Seunghun Lee
Sunghoon Im
OffRL
36
3
0
15 Feb 2023
From paintbrush to pixel: A review of deep neural networks in AI-generated art
Anne-Sofie Maerten
Derya Soydaner
30
22
0
14 Feb 2023
Multi-Source Contrastive Learning from Musical Audio
C. Garoufis
Athanasia Zlatintsi
Petros Maragos
19
6
0
14 Feb 2023
Semantic Image Segmentation: Two Decades of Research
G. Csurka
Riccardo Volpi
Boris Chidlovskii
3DV
24
49
0
13 Feb 2023
Fixing Overconfidence in Dynamic Neural Networks
Lassi Meronen
Martin Trapp
Andrea Pilzer
Le Yang
Arno Solin
BDL
21
16
0
13 Feb 2023
Semantic Feature Integration network for Fine-grained Visual Classification
Haibo Wang
Yueyang Li
Haichi Luo
30
0
0
13 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li
M. Wang
Sijia Liu
Pin-Yu Chen
ViT
MLT
35
56
0
12 Feb 2023
Flexible-modal Deception Detection with Audio-Visual Adapter
Zhaoxu Li
Zitong Yu
Nithish Muthuchamy Selvaraj
Xiaobao Guo
Bingquan Shen
A. Kong
Alex C. Kot
22
2
0
11 Feb 2023
Key Design Choices for Double-Transfer in Source-Free Unsupervised Domain Adaptation
Andrea Maracani
Raffaello Camoriano
Elisa Maiettini
Davide Talon
Lorenzo Rosasco
Lorenzo Natale
21
2
0
10 Feb 2023
GCNet: Probing Self-Similarity Learning for Generalized Counting Network
Mingjie Wang
Yande Li
Jun Zhou
Graham W. Taylor
Minglun Gong
21
11
0
10 Feb 2023
Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples
Qizhang Li
Yiwen Guo
W. Zuo
Hao Chen
AAML
19
35
0
10 Feb 2023
Efficient Attention via Control Variates
Lin Zheng
Jianbo Yuan
Chong-Jun Wang
Lingpeng Kong
24
18
0
09 Feb 2023
Towards Geospatial Foundation Models via Continual Pretraining
Matías Mendieta
Boran Han
Xingjian Shi
Yi Zhu
Chen Chen
VLM
AI4CE
38
63
0
09 Feb 2023