2103.15808
CvT: Introducing Convolutions to Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
Papers citing "CvT: Introducing Convolutions to Vision Transformers"
50 / 860 papers shown
PVT v2: Improved Baselines with Pyramid Vision Transformer
Computational Visual Media (CVM), 2021
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
791
2,143
0
25 Jun 2021
ViTAS: Vision Transformer Architecture Search
European Conference on Computer Vision (ECCV), 2021
Xiu Su
Shan You
Jiyang Xie
Mingkai Zheng
Haiwei Yang
Chao Qian
Changshui Zhang
Xiaogang Wang
Chang Xu
ViT
457
56
0
25 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
424
378
0
24 Jun 2021
IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers
Bowen Pan
Yikang Shen
Lezhi Li
Zinan Lin
Rogerio Feris
A. Oliva
VLM
ViT
329
191
0
23 Jun 2021
Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition
Qibin Hou
Zihang Jiang
Li-xin Yuan
Ming-Ming Cheng
Shuicheng Yan
Jiashi Feng
ViT
MLLM
306
236
0
23 Jun 2021
P2T: Pyramid Pooling Transformer for Scene Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Yu-Huan Wu
Yun-Hai Liu
Xin Zhan
Ming-Ming Cheng
ViT
609
289
0
22 Jun 2021
Encoder-Decoder Architectures for Clinically Relevant Coronary Artery Segmentation
International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), 2021
João Lourenço Silva
M. Menezes
T. Rodrigues
B. Silva
F. Pinto
Arlindo L. Oliveira
MedIm
216
22
0
21 Jun 2021
More than Encoder: Introducing Transformer Decoder to Upsample
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2021
Yijiang Li
Wentian Cai
Ying Gao
Chengming Li
Xiping Hu
ViT
MedIm
254
75
0
20 Jun 2021
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
345
776
0
18 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
International Conference on Learning Representations (ICLR), 2021
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
303
224
0
17 Jun 2021
S$^2$-MLP: Spatial-Shift MLP Architecture for Vision
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
261
219
0
14 Jun 2021
Styleformer: Transformer based Generative Adversarial Networks with Style Vector
Computer Vision and Pattern Recognition (CVPR), 2021
Jeeseung Park
Younggeun Kim
ViT
314
59
0
13 Jun 2021
MlTr: Multi-label Classification with Transformer
IEEE International Conference on Multimedia and Expo (ICME), 2021
Xingyi Cheng
Hezheng Lin
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Nian Shi
Honglin Liu
ViT
176
58
0
11 Jun 2021
Transformed CNNs: recasting pre-trained convolutional layers with self-attention
Stéphane d'Ascoli
Levent Sagun
Giulio Biroli
Ari S. Morcos
ViT
106
7
0
10 Jun 2021
CAT: Cross Attention in Vision Transformer
IEEE International Conference on Multimedia and Expo (ICME), 2021
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
187
260
0
10 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Neural Information Processing Systems (NeurIPS), 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
578
1,478
0
09 Jun 2021
TED-net: Convolution-free T2T Vision Transformer-based Encoder-decoder Dilation network for Low-dose CT Denoising
Dayang Wang
Zhan Wu
Hengyong Yu
ViT
MedIm
211
66
0
08 Jun 2021
On the Connection between Local Attention and Dynamic Depth-wise Convolution
International Conference on Learning Representations (ICLR), 2021
Qi Han
Zejia Fan
Jingdong Sun
Lei-huan Sun
Ming-Ming Cheng
Jiaying Liu
Jingdong Wang
ViT
366
133
0
08 Jun 2021
On Improving Adversarial Transferability of Vision Transformers
International Conference on Learning Representations (ICLR), 2021
Muzammal Naseer
Kanchana Ranasinghe
Salman Khan
Fahad Shahbaz Khan
Fatih Porikli
ViT
262
107
0
08 Jun 2021
Fully Transformer Networks for Semantic Image Segmentation
Sitong Wu
Tianyi Wu
Fangjian Lin
Sheng Tian
Guodong Guo
ViT
289
47
0
08 Jun 2021
Efficient Training of Visual Transformers with Small Datasets
Neural Information Processing Systems (NeurIPS), 2021
Yahui Liu
E. Sangineto
Wei Bi
Andrii Zadaianchuk
Bruno Lepri
Marco De Nadai
ViT
194
215
0
07 Jun 2021
Reveal of Vision Transformers Robustness against Adversarial Attacks
Ahmed Aldahdooh
W. Hamidouche
Olivier Déforges
ViT
247
68
0
07 Jun 2021
Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer
Zilong Huang
Youcheng Ben
Guozhong Luo
Pei Cheng
Gang Yu
Bin-Bin Fu
ViT
276
208
0
07 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Neural Information Processing Systems (NeurIPS), 2021
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
443
396
0
07 Jun 2021
Vision Transformers with Hierarchical Attention
Machine Intelligence Research (MIR), 2021
Yun-Hai Liu
Yu-Huan Wu
Guolei Sun
Le Zhang
Ajad Chhatkuli
Luc Van Gool
ViT
184
72
0
06 Jun 2021
CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings
Neural Information Processing Systems (NeurIPS), 2021
Tatiana Likhomanenko
Qiantong Xu
Gabriel Synnaeve
R. Collobert
A. Rogozhnikov
OOD
ViT
343
70
0
06 Jun 2021
Uformer: A General U-Shaped Transformer for Image Restoration
Computer Vision and Pattern Recognition (CVPR), 2021
Zhendong Wang
Xiaodong Cun
Jianmin Bao
Wengang Zhou
Jianzhuang Liu
Houqiang Li
ViT
509
1,912
0
06 Jun 2021
RegionViT: Regional-to-Local Attention for Vision Transformers
International Conference on Learning Representations (ICLR), 2021
Chun-Fu Chen
Yikang Shen
Quanfu Fan
ViT
478
234
0
04 Jun 2021
Glance-and-Gaze Vision Transformer
Neural Information Processing Systems (NeurIPS), 2021
Qihang Yu
Yingda Xia
Yutong Bai
Yongyi Lu
Alan Yuille
Wei Shen
ViT
162
83
0
04 Jun 2021
X-volution: On the unification of convolution and self-attention
Xuanhong Chen
Hang Wang
Bingbing Ni
ViT
154
27
0
04 Jun 2021
Attention mechanisms and deep learning for machine vision: A survey of the state of the art
A. M. Hafiz
S. A. Parah
R. A. Bhat
228
56
0
03 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Neural Information Processing Systems (NeurIPS), 2021
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
1.2K
7,116
0
31 May 2021
MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens
Computer Vision and Pattern Recognition (CVPR), 2021
Jiemin Fang
Lingxi Xie
Xinggang Wang
Xiaopeng Zhang
Wenyu Liu
Qi Tian
ViT
231
84
0
31 May 2021
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Neural Information Processing Systems (NeurIPS), 2021
Jiangning Zhang
Chao Xu
Jian Li
Wenzhou Chen
Yabiao Wang
Ying Tai
Shuo Chen
Chengjie Wang
Feiyue Huang
Yong Liu
288
26
0
31 May 2021
Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition
Neural Information Processing Systems (NeurIPS), 2021
Yulin Wang
Rui Huang
Qing Xiao
Zeyi Huang
Gao Huang
ViT
283
234
0
31 May 2021
Dual-stream Network for Visual Recognition
Neural Information Processing Systems (NeurIPS), 2021
Mingyuan Mao
Renrui Zhang
Honghui Zheng
Shiyang Feng
Teli Ma
Yan Peng
Errui Ding
Baochang Zhang
Shumin Han
ViT
282
78
0
31 May 2021
Less is More: Pay Less Attention in Vision Transformers
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zizheng Pan
Bohan Zhuang
Haoyu He
Jing Liu
Jianfei Cai
ViT
341
102
0
29 May 2021
KVT: k-NN Attention for Boosting Vision Transformers
European Conference on Computer Vision (ECCV), 2021
Pichao Wang
Qingsong Wen
F. Wang
Ming Lin
Shuning Chang
Hao Li
Rong Jin
ViT
260
130
0
28 May 2021
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zizhao Zhang
Han Zhang
Long Zhao
Ting Chen
Sercan O. Arik
Tomas Pfister
ViT
357
207
0
26 May 2021
Pay Attention to MLPs
Neural Information Processing Systems (NeurIPS), 2021
Hanxiao Liu
Zihang Dai
David R. So
Quoc V. Le
AI4CE
622
807
0
17 May 2021
Towards Robust Vision Transformer
Computer Vision and Pattern Recognition (CVPR), 2021
Xiaofeng Mao
Gege Qi
YueFeng Chen
Xiaodan Li
Ranjie Duan
Shaokai Ye
Yuan He
Hui Xue
ViT
466
233
0
17 May 2021
Waste detection in Pomerania: non-profit project for detecting waste in environment
Waste Management (Waste Manag.), 2021
Sylwia Majchrowska
Agnieszka Mikołajczyk
M. Ferlin
Zuzanna Klawikowska
Marta A. Plantykow
Arkadiusz Kwasigroch
K. Majek
258
167
0
12 May 2021
Homogeneous vector bundles and $G$-equivariant convolutional neural networks
Sampling Theory, Signal Processing, and Data Analysis (SAMPTA), 2021
J. Aronsson
211
27
0
12 May 2021
Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet
Luke Melas-Kyriazi
ViT
122
116
0
06 May 2021
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Meng-Hao Guo
Zheng-Ning Liu
Tai-Jiang Mu
Shimin Hu
232
640
0
05 May 2021
Twins: Revisiting the Design of Spatial Attention in Vision Transformers
Neural Information Processing Systems (NeurIPS), 2021
Xiangxiang Chu
Zhi Tian
Yuqing Wang
Bo Zhang
Haibing Ren
Xiaolin K. Wei
Huaxia Xia
Chunhua Shen
ViT
665
1,226
0
28 Apr 2021
Vision Transformers with Patch Diversification
Chengyue Gong
Dilin Wang
Meng Li
Vikas Chandra
Qiang Liu
ViT
257
68
0
26 Apr 2021
Visformer: The Vision-friendly Transformer
IEEE International Conference on Computer Vision (ICCV), 2021
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
530
275
0
26 Apr 2021
VidTr: Video Transformer Without Convolutions
IEEE International Conference on Computer Vision (ICCV), 2021
Yanyi Zhang
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Biagio Brattoli
Hao Chen
I. Marsic
Joseph Tighe
ViT
429
220
0
23 Apr 2021
All Tokens Matter: Token Labeling for Training Better Vision Transformers
Neural Information Processing Systems (NeurIPS), 2021
Zihang Jiang
Qibin Hou
Li-xin Yuan
Daquan Zhou
Yujun Shi
Xiaojie Jin
Anran Wang
Jiashi Feng
ViT
403
237
0
22 Apr 2021