arXiv: 2103.14030
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

25 March 2021
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng-Wei Zhang
Stephen Lin
B. Guo
    ViT

Papers citing "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

50 / 2,127 papers shown
Fashionformer: A simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition
Shilin Xu
Xiangtai Li
Jingbo Wang
Guangliang Cheng
Yunhai Tong
Dacheng Tao
ViT
19
27
0
10 Apr 2022
Multimodal Transformer for Nursing Activity Recognition
Momal Ijaz
Renato Diaz
C. L. P. Chen
ViT
22
26
0
09 Apr 2022
Does Robustness on ImageNet Transfer to Downstream Tasks?
Yutaro Yamada
Mayu Otani
OOD
16
27
0
08 Apr 2022
Unsupervised Prompt Learning for Vision-Language Models
Hao Huang
Jack Chu
Fangyun Wei
VPVLM
MLLM
VLM
31
131
0
07 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
30
240
0
07 Apr 2022
The Effects of Regularization and Data Augmentation are Class Dependent
Randall Balestriero
Léon Bottou
Yann LeCun
28
94
0
07 Apr 2022
Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results
T. Ridnik
Hussam Lawen
Emanuel Ben-Baruch
Asaf Noy
31
11
0
07 Apr 2022
Event Transformer. A sparse-aware solution for efficient event data processing
Alberto Sabater
Luis Montesano
Ana C. Murillo
19
51
0
07 Apr 2022
Learning Local and Global Temporal Contexts for Video Semantic Segmentation
Guolei Sun
Yun Liu
Henghui Ding
Min Wu
Luc Van Gool
25
32
0
07 Apr 2022
Multi-scale Context-aware Network with Transformer for Gait Recognition
Duo-Lin Zhu
Xiaohui Huang
Xinggang Wang
Bo Yang
Botao He
Wenyu Liu
Bin Feng
ViT
CVBM
14
15
0
07 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
16
55
0
06 Apr 2022
An Empirical Study of Remote Sensing Pretraining
Di Wang
Jing Zhang
Bo Du
Guisong Xia
Dacheng Tao
EDL
23
190
0
06 Apr 2022
Towards An End-to-End Framework for Flow-Guided Video Inpainting
Z. Li
Cheng Lu
Jia Qin
Chunle Guo
Ming-Ming Cheng
41
149
0
06 Apr 2022
Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation
Yuyang Zhao
Zhun Zhong
Na Zhao
N. Sebe
G. Lee
30
99
0
06 Apr 2022
Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation
Wangbo Zhao
Kai Wang
Xiangxiang Chu
Fuzhao Xue
Xinchao Wang
Yang You
29
21
0
06 Apr 2022
OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses
Robik Shrestha
Kushal Kafle
Christopher Kanan
CML
21
13
0
05 Apr 2022
Vision Transformer Equipped with Neural Resizer on Facial Expression Recognition Task
Hyeonbin Hwang
Soyeon Kim
Wei-Jin Park
Jiho Seo
Kyungtae Ko
Hyeon Yeo
ViT
26
9
0
05 Apr 2022
Birds of A Feather Flock Together: Category-Divergence Guidance for Domain Adaptive Segmentation
Bo Yuan
Danpei Zhao
Shuai Shao
Zehuan Yuan
Changhu Wang
55
14
0
05 Apr 2022
Region Rebalance for Long-Tailed Semantic Segmentation
Jiequan Cui
Yuhui Yuan
Zhisheng Zhong
Zhuotao Tian
Han Hu
Stephen Lin
Jiaya Jia
10
18
0
05 Apr 2022
Autoregressive 3D Shape Generation via Canonical Mapping
A. Cheng
Xueting Li
Sifei Liu
Min Sun
Ming Yang
3DPC
37
39
0
05 Apr 2022
Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos
Shao-Wei Liu
Subarna Tripathi
Somdeb Majumdar
Xiaolong Wang
EgoV
20
93
0
04 Apr 2022
Long Movie Clip Classification with State-Space Video Models
Md. Mohaiminul Islam
Gedas Bertasius
VLM
36
101
0
04 Apr 2022
MultiMAE: Multi-modal Multi-task Masked Autoencoders
Roman Bachmann
David Mizrahi
Andrei Atanov
Amir Zamir
22
265
0
04 Apr 2022
BatchFormerV2: Exploring Sample Relationships for Dense Representation Learning
Zhi Hou
Baosheng Yu
Chaoyue Wang
Yibing Zhan
Dacheng Tao
ViT
13
11
0
04 Apr 2022
Dynamic Focus-aware Positional Queries for Semantic Segmentation
Haoyu He
Jianfei Cai
Zizheng Pan
Jing Liu
Jing Zhang
Dacheng Tao
Bohan Zhuang
34
16
0
04 Apr 2022
TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting
Huazhang Hu
Sixun Dong
Yiqun Zhao
Dongze Lian
Zhengxin Li
Shenghua Gao
18
47
0
03 Apr 2022
Improving Vision Transformers by Revisiting High-frequency Components
Jiawang Bai
Liuliang Yuan
Shutao Xia
Shuicheng Yan
Zhifeng Li
W. Liu
ViT
8
90
0
03 Apr 2022
BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation
Zhenyu Li
Xuyang Wang
Xianming Liu
Junjun Jiang
MDE
19
191
0
03 Apr 2022
R(Det)^2: Randomized Decision Routing for Object Detection
Yali Li
Shengjin Wang
ObjD
10
9
0
02 Apr 2022
What to look at and where: Semantic and Spatial Refined Transformer for detecting human-object interactions
A S M Iftekhar
Hao Chen
Kaustav Kundu
Xinyu Li
Joseph Tighe
Davide Modolo
ViT
22
50
0
02 Apr 2022
Data and Physics Driven Learning Models for Fast MRI -- Fundamentals and Methodologies from CNN, GAN to Attention and Transformers
Jiahao Huang
Yingying Fang
Yang Nan
Huanjun Wu
Yinzhe Wu
...
Zidong Wang
Pietro Liò
Daniel Rueckert
Yonina C. Eldar
Guang Yang
OOD
MedIm
31
3
0
01 Apr 2022
CRAFT: Cross-Attentional Flow Transformer for Robust Optical Flow
Xiuchao Sui
Shaohua Li
Xue Geng
Yan Wu
Xinxing Xu
Yong Liu
Rick Siow Mong Goh
Hongyuan Zhu
ViT
26
95
0
31 Mar 2022
Rethinking Portrait Matting with Privacy Preserving
Sihan Ma
Jizhizi Li
Jing Zhang
He-jun Zhang
Dacheng Tao
18
23
0
31 Mar 2022
Deformable Video Transformer
Jue Wang
Lorenzo Torresani
ViT
22
28
0
31 Mar 2022
ReSTR: Convolution-free Referring Image Segmentation Using Transformers
N. Kim
Dongwon Kim
Cuiling Lan
Wenjun Zeng
Suha Kwak
19
136
0
31 Mar 2022
End-to-end Document Recognition and Understanding with Dessurt
Brian L. Davis
B. Morse
Brian L. Price
Chris Tensmeyer
Curtis Wigington
Vlad I. Morariu
VLM
ViT
11
73
0
30 Mar 2022
AdaMixer: A Fast-Converging Query-Based Object Detector
Ziteng Gao
Limin Wang
Bing Han
Sheng Guo
ObjD
22
105
0
30 Mar 2022
Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries
Haekyu Park
Seongmin Lee
Benjamin Hoover
Austin P. Wright
Omar Shaikh
Rahul Duggal
Nilaksh Das
Kevin Li
Judy Hoffman
Duen Horng Chau
17
2
0
30 Mar 2022
TubeDETR: Spatio-Temporal Video Grounding with Transformers
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
28
94
0
30 Mar 2022
PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection
Gang Li
Xiang Li
Yujie Wang
Yichao Wu
Ding Liang
Shanshan Zhang
12
91
0
30 Mar 2022
VPTR: Efficient Transformers for Video Prediction
Xi Ye
Guillaume-Alexandre Bilodeau
ViT
19
18
0
29 Mar 2022
SepViT: Separable Vision Transformer
Wei Li
Xing Wang
Xin Xia
Jie Wu
Jiashi Li
Xuefeng Xiao
Min Zheng
Shiping Wen
ViT
26
39
0
29 Mar 2022
mc-BEiT: Multi-choice Discretization for Image BERT Pre-training
Xiaotong Li
Yixiao Ge
Kun Yi
Zixuan Hu
Ying Shan
Ling-yu Duan
29
38
0
29 Mar 2022
SIOD: Single Instance Annotated Per Category Per Image for Object Detection
Hanjun Li
Xingjia Pan
Ke Yan
Fan Tang
Weihao Zheng
25
18
0
29 Mar 2022
End-to-End Transformer Based Model for Image Captioning
Yiyu Wang
Jungang Xu
Yingfei Sun
VLM
ViT
18
117
0
29 Mar 2022
Parameter-efficient Model Adaptation for Vision Transformers
Xuehai He
Chunyuan Li
Pengchuan Zhang
Jianwei Yang
X. Wang
20
81
0
29 Mar 2022
Few-Shot Object Detection with Fully Cross-Transformer
G. Han
Jiawei Ma
Shiyuan Huang
Long Chen
Shih-Fu Chang
10
129
0
28 Mar 2022
Automated Progressive Learning for Efficient Training of Vision Transformers
Changlin Li
Bohan Zhuang
Guangrun Wang
Xiaodan Liang
Xiaojun Chang
Yi Yang
16
46
0
28 Mar 2022
Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution
Jie Liang
Huiyu Zeng
Lei Zhang
SupR
23
87
0
27 Mar 2022
RSTT: Real-time Spatial Temporal Transformer for Space-Time Video Super-Resolution
Z. Geng
Luming Liang
Tianyu Ding
Ilya Zharkov
17
68
0
27 Mar 2022