Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

25 March 2021

Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

50 / 2,186 papers shown

Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models? Boris Knyazev Doha Hwang Simon Lacoste-Julien AI4CE 24 17 0 07 Mar 2023
iBall: Augmenting Basketball Videos with Gaze-moderated Embedded Visualizations Zhutian Chen Qisen Yang Jiarui Shan Tica Lin Johanna Beyer Haijun Xia Hanspeter Pfister 19 28 0 06 Mar 2023
DwinFormer: Dual Window Transformers for End-to-End Monocular Depth Estimation Md Awsafur Rahman S. Fattah ViT MDE 30 4 0 06 Mar 2023
CTG-Net: An Efficient Cascaded Framework Driven by Terminal Guidance Mechanism for Dilated Pancreatic Duct Segmentation Liwen Zou Zhen-zhai Cai Y. Qiu Luying Gui L. Mao Xiaoping Yang MedIm 19 6 0 06 Mar 2023
Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent Xiaonan Nie Yi Liu Fangcheng Fu J. Xue Dian Jiao Xupeng Miao Yangyu Tao Bin Cui MoE 19 16 0 06 Mar 2023
Training-Free Acceleration of ViTs with Delayed Spatial Merging J. Heo Seyedarmin Azizi A. Fayyazi Massoud Pedram 36 3 0 04 Mar 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception Wenliang Zhao Yongming Rao Zuyan Liu Benlin Liu Jie Zhou Jiwen Lu ObjD VLM MDE 158 214 0 03 Mar 2023
Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners Renrui Zhang Xiangfei Hu Bohao Li Siyuan Huang Hanqiu Deng Hongsheng Li Yu Qiao Peng Gao VLM MLLM 30 170 0 03 Mar 2023
Depth-based 6DoF Object Pose Estimation using Swin Transformer Zhujun Li I. Stamos ViT 22 11 0 03 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention Paria Mehrani John K. Tsotsos 23 24 0 02 Mar 2023
Capturing the motion of every joint: 3D human pose and shape estimation with independent tokens Sen Yang Wen Heng Gang Liu Guozhong Luo Wankou Yang Gang Yu 3DH ViT 18 11 0 01 Mar 2023
Applying Plain Transformers to Real-World Point Clouds Lanxiao Li M. Heizmann 3DPC ViT 20 3 0 28 Feb 2023
A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking Chang-Shu Liu Yinpeng Dong Wenzhao Xiang X. Yang Hang Su Junyi Zhu YueFeng Chen Yuan He H. Xue Shibao Zheng OOD VLM AAML 17 72 0 28 Feb 2023
Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Label-Efficient Representations Ziyu Jiang Yinpeng Chen Mengchen Liu Dongdong Chen Xiyang Dai Lu Yuan Zicheng Liu Zhangyang Wang SSL VLM CLIP 30 16 0 27 Feb 2023
SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing Weidong Chen Xiaofen Xing Xiangmin Xu Jianxin Pang Lan Du 30 38 0 27 Feb 2023
Can we avoid Double Descent in Deep Neural Networks? Victor Quétu Enzo Tartaglione AI4CE 20 3 0 26 Feb 2023
ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth S. Bhat R. Birkl Diana Wofk Peter Wonka Matthias Müller VLM MDE 48 483 0 23 Feb 2023
Patch Network for medical image Segmentation Weihu Song Heng Yu Jianhua Wu MedIm SSeg 11 0 0 23 Feb 2023
Human MotionFormer: Transferring Human Motions with Vision Transformers Hongyu Liu Xintong Han Chengbin Jin Lihui Qian Huawei Wei ... Faqiang Wang Haoye Dong Yibing Song Jia Xu Qifeng Chen 11 10 0 22 Feb 2023
Connecting Vision and Language with Video Localized Narratives P. Voigtlaender Soravit Changpinyo Jordi Pont-Tuset Radu Soricut V. Ferrari VGen 31 21 0 22 Feb 2023
KS-DETR: Knowledge Sharing in Attention Learning for Detection Transformer Kaikai Zhao Norimichi Ukita MU 29 1 0 22 Feb 2023
A residual dense vision transformer for medical image super-resolution with segmentation-based perceptual loss fine-tuning Jin Zhu Guang Yang Pietro Liò ViT MedIm 24 5 0 22 Feb 2023
Device Tuning for Multi-Task Large Model Penghao Jiang Xuanchen Hou Y. Zhou 11 0 0 21 Feb 2023
LIT-Former: Linking In-plane and Through-plane Transformers for Simultaneous CT Image Denoising and Deblurring Zhihao Chen Chuang Niu Qi Gao Ge Wang Hongming Shan MedIm ViT 3DV 25 20 0 21 Feb 2023
Soft Error Reliability Analysis of Vision Transformers Xing-xiong Xue Cheng Liu Ying Wang Bing Yang Tao Luo L. Zhang Huawei Li Xiaowei Li 34 14 0 21 Feb 2023
Oriented Object Detection in Optical Remote Sensing Images using Deep Learning: A Survey Kunlin Wang Zi Wang Zhang Li Ang Su Xichao Teng Minhao Liu Qifeng Yu ObjD 81 8 0 21 Feb 2023
Unsupervised Learning on a DIET: Datum IndEx as Target Free of Self-Supervision, Reconstruction, Projector Head Randall Balestriero 38 3 0 20 Feb 2023
STB-VMM: Swin Transformer Based Video Motion Magnification Ricard Lado-Roigé M. A. Pérez 16 13 0 20 Feb 2023
StreamingFlow: Streaming Occupancy Forecasting with Asynchronous Multi-modal Data Streams via Neural Ordinary Differential Equation Yining Shi Kun Jiang Ke Wang Jiusi Li Yunlong Wang Mengmeng Yang Diange Yang AI4TS 30 2 0 19 Feb 2023
MedViT: A Robust Vision Transformer for Generalized Medical Image Classification Omid Nejati Manzari Hamid Ahmadabadi Hossein Kashiani S. B. Shokouhi Ahmad Ayatollahi ViT MedIm 21 176 0 19 Feb 2023
Hyneter: Hybrid Network Transformer for Object Detection Dong Chen Duoqian Miao Xuepeng Zhao ViT 27 3 0 18 Feb 2023
Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer N. H. Phong B. Ribeiro 27 15 0 17 Feb 2023
CovidExpert: A Triplet Siamese Neural Network framework for the detection of COVID-19 Tareque Rahman Ornob G. Roy Enamul Hassan 19 12 0 17 Feb 2023
Less is More: The Influence of Pruning on the Explainability of CNNs David Weber F. Merkle Pascal Schöttle Stephan Schlögl Martin Nocker FAtt 29 1 0 17 Feb 2023
Efficiency 360: Efficient Vision Transformers Badri N. Patro Vijay Srinivas Agneeswaran 21 6 0 16 Feb 2023
3M3D: Multi-view, Multi-path, Multi-representation for 3D Object Detection Jong Sung Park Apoorv Singh Varun Bankiti 3DPC 23 7 0 16 Feb 2023
Hierarchical Cross-modal Transformer for RGB-D Salient Object Detection Hao Chen Feihong Shen ViT 29 0 0 16 Feb 2023
Offline-to-Online Knowledge Distillation for Video Instance Segmentation H. Kim Seunghun Lee Sunghoon Im OffRL 36 3 0 15 Feb 2023
From paintbrush to pixel: A review of deep neural networks in AI-generated art Anne-Sofie Maerten Derya Soydaner 30 22 0 14 Feb 2023
Multi-Source Contrastive Learning from Musical Audio C. Garoufis Athanasia Zlatintsi Petros Maragos 19 6 0 14 Feb 2023
Semantic Image Segmentation: Two Decades of Research G. Csurka Riccardo Volpi Boris Chidlovskii 3DV 24 49 0 13 Feb 2023
Fixing Overconfidence in Dynamic Neural Networks Lassi Meronen Martin Trapp Andrea Pilzer Le Yang Arno Solin BDL 21 16 0 13 Feb 2023
Semantic Feature Integration network for Fine-grained Visual Classification Haibo Wang Yueyang Li Haichi Luo 30 0 0 13 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity Hongkang Li M. Wang Sijia Liu Pin-Yu Chen ViT MLT 35 56 0 12 Feb 2023
Flexible-modal Deception Detection with Audio-Visual Adapter Zhaoxu Li Zitong Yu Nithish Muthuchamy Selvaraj Xiaobao Guo Bingquan Shen A. Kong Alex C. Kot 22 2 0 11 Feb 2023
Key Design Choices for Double-Transfer in Source-Free Unsupervised Domain Adaptation Andrea Maracani Raffaello Camoriano Elisa Maiettini Davide Talon Lorenzo Rosasco Lorenzo Natale 21 2 0 10 Feb 2023
GCNet: Probing Self-Similarity Learning for Generalized Counting Network Mingjie Wang Yande Li Jun Zhou Graham W. Taylor Minglun Gong 21 11 0 10 Feb 2023
Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples Qizhang Li Yiwen Guo W. Zuo Hao Chen AAML 19 35 0 10 Feb 2023
Efficient Attention via Control Variates Lin Zheng Jianbo Yuan Chong-Jun Wang Lingpeng Kong 24 18 0 09 Feb 2023
Towards Geospatial Foundation Models via Continual Pretraining Matías Mendieta Boran Han Xingjian Shi Yi Zhu Chen Chen VLM AI4CE 38 63 0 09 Feb 2023