arXiv:2111.09883
Swin Transformer V2: Scaling Up Capacity and Resolution
18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
ViT
Papers citing
"Swin Transformer V2: Scaling Up Capacity and Resolution"
50 / 821 papers shown
Title
Rethinking Attention Mechanism in Time Series Classification
Bowen Zhao
Huanlai Xing
Xinhan Wang
Fuhong Song
Zhiwen Xiao
AI4TS
14
29
0
14 Jul 2022
Pyramid Transformer for Traffic Sign Detection
Omid Nejati Manzari
A. Boudesh
S. B. Shokouhi
ViT
8
12
0
13 Jul 2022
MSP-Former: Multi-Scale Projection Transformer for Single Image Desnowing
Sixiang Chen
Tian-Chun Ye
Yun-Peng Liu
Taodong Liao
Y. Ye
Erkang Chen
Peng Chen
ViT
15
51
0
12 Jul 2022
YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Chien-Yao Wang
Alexey Bochkovskiy
H. Liao
ObjD
20
6,062
0
06 Jul 2022
Softmax-free Linear Transformers
Jiachen Lu
Junge Zhang
Xiatian Zhu
Jianfeng Feng
Tao Xiang
Li Zhang
ViT
9
7
0
05 Jul 2022
Spatiotemporal Feature Learning Based on Two-Step LSTM and Transformer for CT Scans
Chih-Chung Hsu
Chin-Han Tsai
Guangfeng Chen
Sin-Di Ma
Shen-Chieh Tai
MedIm
13
9
0
04 Jul 2022
Woodscape Fisheye Object Detection for Autonomous Driving -- CVPR 2022 OmniCV Workshop Challenge
Saravanabalagi Ramachandran
Ganesh Sistu
V. Kumar
J. McDonald
S. Yogamani
15
4
0
26 Jun 2022
LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs
Yukang Chen
Jianhui Liu
X. Zhang
Xiaojuan Qi
Jiaya Jia
31
82
0
21 Jun 2022
HOPE: Hierarchical Spatial-temporal Network for Occupancy Flow Prediction
Yi Hu
Wenxin Shao
Bo Jiang
Jiajie Chen
Siqi Chai
Zhening Yang
Jingyu Qian
Helong Zhou
Qiang Liu
AI4CE
17
13
0
21 Jun 2022
Global Context Vision Transformers
Ali Hatamizadeh
Hongxu Yin
Greg Heinrich
Jan Kautz
Pavlo Molchanov
ViT
14
118
0
20 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
28
32
0
19 Jun 2022
Enhanced Bi-directional Motion Estimation for Video Frame Interpolation
Xin Jin
Longhai Wu
Guotao Shen
Youxin Chen
Jie Chen
Jayoon Koo
Cheul-hee Hahm
14
22
0
17 Jun 2022
Rectify ViT Shortcut Learning by Visual Saliency
Chong Ma
Lin Zhao
Yuzhong Chen
David Liu
Xi Jiang
Tuo Zhang
Xintao Hu
Dinggang Shen
Dajiang Zhu
Tianming Liu
ViT
20
20
0
17 Jun 2022
OmniMAE: Single Model Masked Pretraining on Images and Videos
Rohit Girdhar
Alaaeldin El-Nouby
Mannat Singh
Kalyan Vasudev Alwala
Armand Joulin
Ishan Misra
ViT
19
95
0
16 Jun 2022
ChordMixer: A Scalable Neural Attention Model for Sequences with Different Lengths
Ruslan Khalitov
Tong Yu
Lei Cheng
Zhirong Yang
17
12
0
12 Jun 2022
On Data Scaling in Masked Image Modeling
Zhenda Xie
Zheng-Wei Zhang
Yue Cao
Yutong Lin
Yixuan Wei
Qi Dai
Han Hu
18
49
0
09 Jun 2022
CASS: Cross Architectural Self-Supervision for Medical Image Analysis
Pranav Singh
E. Sizikova
Jacopo Cirrone
OOD
38
8
0
08 Jun 2022
Tutel: Adaptive Mixture-of-Experts at Scale
Changho Hwang
Wei Cui
Yifan Xiong
Ziyue Yang
Ze Liu
...
Joe Chau
Peng Cheng
Fan Yang
Mao Yang
Y. Xiong
MoE
92
107
0
07 Jun 2022
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
Feng Li
Hao Zhang
Hu-Sheng Xu
Siyi Liu
Lei Zhang
L. Ni
H. Shum
ISeg
27
363
0
06 Jun 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
8
340
0
02 Jun 2022
KPGT: Knowledge-Guided Pre-training of Graph Transformer for Molecular Property Prediction
Han Li
Dan Zhao
Jianyang Zeng
22
59
0
02 Jun 2022
Decomposing NeRF for Editing via Feature Field Distillation
Sosuke Kobayashi
Eiichi Matsumoto
Vincent Sitzmann
165
326
0
31 May 2022
Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets
Leandro M. de Lima
R. Krohling
ViT
MedIm
18
9
0
30 May 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh B. Gundavarapu
Alex Lamb
Nan Rosemary Ke
Yoshua Bengio
AI4CE
104
17
0
30 May 2022
Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation
Yixuan Wei
Han Hu
Zhenda Xie
Zheng-Wei Zhang
Yue Cao
Jianmin Bao
Dong Chen
B. Guo
CLIP
78
123
0
27 May 2022
How Tempering Fixes Data Augmentation in Bayesian Neural Networks
Gregor Bachmann
Lorenzo Noci
Thomas Hofmann
BDL
AAML
72
8
0
27 May 2022
Green Hierarchical Vision Transformer for Masked Image Modeling
Lang Huang
Shan You
Mingkai Zheng
Fei Wang
Chao Qian
T. Yamasaki
11
68
0
26 May 2022
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Jihao Liu
Xin Huang
Jinliang Zheng
Yu Liu
Hongsheng Li
17
53
0
26 May 2022
Vision Transformers in 2022: An Update on Tiny ImageNet
Ethan Huynh
ViT
21
11
0
21 May 2022
DProQ: A Gated-Graph Transformer for Protein Complex Structure Assessment
Xiao Chen
Alex Morehead
Jian Liu
Jianlin Cheng
27
7
0
21 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
95
73
0
20 May 2022
Vision Transformer Adapter for Dense Predictions
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
20
537
0
17 May 2022
An Effective Transformer-based Solution for RSNA Intracranial Hemorrhage Detection Competition
Fangxin Shang
Siqi Wang
Xiaorong Wang
Yehui Yang
MedIm
13
2
0
16 May 2022
Sequencer: Deep LSTM for Image Classification
Yuki Tatsunami
Masato Taki
VLM
ViT
8
77
0
04 May 2022
Improving the Transferability of Adversarial Examples with Restructure Embedded Patches
Huipeng Zhou
Yu-an Tan
Yajie Wang
Haoran Lyu
Shan-Hung Wu
Yuan-zhang Li
ViT
11
4
0
27 Apr 2022
SUES-200: A Multi-height Multi-scene Cross-view Image Benchmark Across Drone and Satellite
Runzhe Zhu
Ling Yin
Mingze Yang
Fei Wu
Yunchen Yang
Wenbo Hu
12
45
0
22 Apr 2022
Diverse Imagenet Models Transfer Better
Niv Nayman
A. Golbert
Asaf Noy
Tan Ping
Lihi Zelnik-Manor
14
0
0
19 Apr 2022
VSA: Learning Varied-Size Window Attention in Vision Transformers
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
22
50
0
18 Apr 2022
ResT V2: Simpler, Faster and Stronger
Qing-Long Zhang
Yubin Yang
ViT
17
24
0
15 Apr 2022
S4OD: Semi-Supervised learning for Single-Stage Object Detection
Yueming Zhang
Xingxu Yao
Chao-Jung Liu
F. Chen
Xiaolin Song
Tengfei Xing
Runbo Hu
Hua Chai
Pengfei Xu
Guoshan Zhang
ObjD
20
7
0
09 Apr 2022
PP-LiteSeg: A Superior Real-Time Semantic Segmentation Model
Juncai Peng
Yi Liu
Shiyu Tang
Yuying Hao
Lutao Chu
...
Baohua Lai
Qiwen Liu
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
SSeg
VLM
11
136
0
06 Apr 2022
Exploring Plain Vision Transformer Backbones for Object Detection
Yanghao Li
Hanzi Mao
Ross B. Girshick
Kaiming He
ViT
11
765
0
30 Mar 2022
Focal Modulation Networks
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
22
261
0
22 Mar 2022
GroupTransNet: Group Transformer Network for RGB-D Salient Object Detection
Xian Fang
Jin-lei Zhu
Xiuli Shao
Hongpeng Wang
ViT
22
13
0
21 Mar 2022
simCrossTrans: A Simple Cross-Modality Transfer Learning for Object Detection with ConvNets or Vision Transformers
Xiaoke Shen
I. Stamos
ViT
10
5
0
20 Mar 2022
Open Set Recognition using Vision Transformer with an Additional Detection Head
Feiyang Cai
Zhenkai Zhang
Jie Liu
X. Koutsoukos
ViT
14
6
0
16 Mar 2022
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Xiaohan Ding
X. Zhang
Yi Zhou
Jungong Han
Guiguang Ding
Jian-jun Sun
VLM
31
522
0
13 Mar 2022
Active Token Mixer
Guoqiang Wei
Zhizheng Zhang
Cuiling Lan
Yan Lu
Zhibo Chen
8
15
0
11 Mar 2022
YouTube-GDD: A challenging gun detection dataset with rich contextual information
Yongxiang Gu
Xingbin Liao
Xiaolin Qin
9
6
0
08 Mar 2022
Dynamic Group Transformer: A General Vision Transformer Backbone with Dynamic Group Attention
Kai Liu
Tianyi Wu
Cong Liu
Guodong Guo
ViT
25
13
0
08 Mar 2022