2103.15808
Cited By
CvT: Introducing Convolutions to Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
HuggingFace (1 upvote)
Github (227★)
Papers citing "CvT: Introducing Convolutions to Vision Transformers"
50 / 857 papers shown
Title
IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers
Bowen Pan
Yikang Shen
Lezhi Li
Zinan Lin
Rogerio Feris
A. Oliva
VLM
ViT
271
191
0
23 Jun 2021
Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition
Qibin Hou
Zihang Jiang
Li-xin Yuan
Ming-Ming Cheng
Shuicheng Yan
Jiashi Feng
ViT
MLLM
248
230
0
23 Jun 2021
P2T: Pyramid Pooling Transformer for Scene Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Yu-Huan Wu
Yun-Hai Liu
Xin Zhan
Ming-Ming Cheng
ViT
458
281
0
22 Jun 2021
Encoder-Decoder Architectures for Clinically Relevant Coronary Artery Segmentation
International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), 2021
João Lourenço Silva
M. Menezes
T. Rodrigues
B. Silva
F. Pinto
Arlindo L. Oliveira
MedIm
179
22
0
21 Jun 2021
More than Encoder: Introducing Transformer Decoder to Upsample
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2021
Yijiang Li
Wentian Cai
Ying Gao
Chengming Li
Xiping Hu
ViT
MedIm
165
71
0
20 Jun 2021
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
315
759
0
18 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
International Conference on Learning Representations (ICLR), 2021
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
223
221
0
17 Jun 2021
S$^2$-MLP: Spatial-Shift MLP Architecture for Vision
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
216
214
0
14 Jun 2021
Styleformer: Transformer based Generative Adversarial Networks with Style Vector
Computer Vision and Pattern Recognition (CVPR), 2021
Jeeseung Park
Younggeun Kim
ViT
256
58
0
13 Jun 2021
MlTr: Multi-label Classification with Transformer
IEEE International Conference on Multimedia and Expo (ICME), 2021
Xingyi Cheng
Hezheng Lin
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Nian Shi
Honglin Liu
ViT
119
53
0
11 Jun 2021
Transformed CNNs: recasting pre-trained convolutional layers with self-attention
Stéphane d'Ascoli
Levent Sagun
Giulio Biroli
Ari S. Morcos
ViT
70
7
0
10 Jun 2021
CAT: Cross Attention in Vision Transformer
IEEE International Conference on Multimedia and Expo (ICME), 2021
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
148
250
0
10 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Neural Information Processing Systems (NeurIPS), 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
382
1,433
0
09 Jun 2021
TED-net: Convolution-free T2T Vision Transformer-based Encoder-decoder Dilation network for Low-dose CT Denoising
Dayang Wang
Zhan Wu
Hengyong Yu
ViT
MedIm
175
64
0
08 Jun 2021
On the Connection between Local Attention and Dynamic Depth-wise Convolution
International Conference on Learning Representations (ICLR), 2021
Qi Han
Zejia Fan
Jingdong Sun
Lei-huan Sun
Ming-Ming Cheng
Jiaying Liu
Jingdong Wang
ViT
272
131
0
08 Jun 2021
On Improving Adversarial Transferability of Vision Transformers
International Conference on Learning Representations (ICLR), 2021
Muzammal Naseer
Kanchana Ranasinghe
Salman Khan
Fahad Shahbaz Khan
Fatih Porikli
ViT
206
106
0
08 Jun 2021
Fully Transformer Networks for Semantic Image Segmentation
Sitong Wu
Tianyi Wu
Fangjian Lin
Sheng Tian
Guodong Guo
ViT
224
47
0
08 Jun 2021
Efficient Training of Visual Transformers with Small Datasets
Neural Information Processing Systems (NeurIPS), 2021
Yahui Liu
E. Sangineto
Wei Bi
Andrii Zadaianchuk
Bruno Lepri
Marco De Nadai
ViT
148
211
0
07 Jun 2021
Reveal of Vision Transformers Robustness against Adversarial Attacks
Ahmed Aldahdooh
W. Hamidouche
Olivier Déforges
ViT
177
67
0
07 Jun 2021
Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer
Zilong Huang
Youcheng Ben
Guozhong Luo
Pei Cheng
Gang Yu
Bin-Bin Fu
ViT
221
207
0
07 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Neural Information Processing Systems (NeurIPS), 2021
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
329
386
0
07 Jun 2021
Vision Transformers with Hierarchical Attention
Machine Intelligence Research (MIR), 2021
Yun-Hai Liu
Yu-Huan Wu
Guolei Sun
Le Zhang
Ajad Chhatkuli
Luc Van Gool
ViT
159
65
0
06 Jun 2021
CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings
Neural Information Processing Systems (NeurIPS), 2021
Tatiana Likhomanenko
Qiantong Xu
Gabriel Synnaeve
R. Collobert
A. Rogozhnikov
OOD
ViT
266
68
0
06 Jun 2021
Uformer: A General U-Shaped Transformer for Image Restoration
Computer Vision and Pattern Recognition (CVPR), 2021
Zhendong Wang
Xiaodong Cun
Jianmin Bao
Wengang Zhou
Jianzhuang Liu
Houqiang Li
ViT
447
1,814
0
06 Jun 2021
RegionViT: Regional-to-Local Attention for Vision Transformers
International Conference on Learning Representations (ICLR), 2021
Chun-Fu Chen
Yikang Shen
Quanfu Fan
ViT
366
224
0
04 Jun 2021
Glance-and-Gaze Vision Transformer
Neural Information Processing Systems (NeurIPS), 2021
Qihang Yu
Yingda Xia
Yutong Bai
Yongyi Lu
Alan Yuille
Wei Shen
ViT
148
82
0
04 Jun 2021
X-volution: On the unification of convolution and self-attention
Xuanhong Chen
Hang Wang
Bingbing Ni
ViT
100
27
0
04 Jun 2021
Attention mechanisms and deep learning for machine vision: A survey of the state of the art
A. M. Hafiz
S. A. Parah
R. A. Bhat
211
51
0
03 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Neural Information Processing Systems (NeurIPS), 2021
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
1.1K
6,783
0
31 May 2021
MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens
Computer Vision and Pattern Recognition (CVPR), 2021
Jiemin Fang
Lingxi Xie
Xinggang Wang
Xiaopeng Zhang
Wenyu Liu
Qi Tian
ViT
178
84
0
31 May 2021
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Neural Information Processing Systems (NeurIPS), 2021
Jiangning Zhang
Chao Xu
Jian Li
Wenzhou Chen
Yabiao Wang
Ying Tai
Shuo Chen
Chengjie Wang
Feiyue Huang
Yong Liu
249
25
0
31 May 2021
Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition
Neural Information Processing Systems (NeurIPS), 2021
Yulin Wang
Rui Huang
Qing Xiao
Zeyi Huang
Gao Huang
ViT
242
230
0
31 May 2021
Dual-stream Network for Visual Recognition
Neural Information Processing Systems (NeurIPS), 2021
Mingyuan Mao
Renrui Zhang
Honghui Zheng
Shiyang Feng
Teli Ma
Yan Peng
Errui Ding
Baochang Zhang
Shumin Han
ViT
187
77
0
31 May 2021
Less is More: Pay Less Attention in Vision Transformers
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zizheng Pan
Bohan Zhuang
Haoyu He
Jing Liu
Jianfei Cai
ViT
277
101
0
29 May 2021
KVT: k-NN Attention for Boosting Vision Transformers
European Conference on Computer Vision (ECCV), 2021
Pichao Wang
Qingsong Wen
F. Wang
Ming Lin
Shuning Chang
Hao Li
Rong Jin
ViT
225
128
0
28 May 2021
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zizhao Zhang
Han Zhang
Long Zhao
Ting Chen
Sercan O. Arik
Tomas Pfister
ViT
257
201
0
26 May 2021
Pay Attention to MLPs
Neural Information Processing Systems (NeurIPS), 2021
Hanxiao Liu
Zihang Dai
David R. So
Quoc V. Le
AI4CE
484
781
0
17 May 2021
Towards Robust Vision Transformer
Computer Vision and Pattern Recognition (CVPR), 2021
Xiaofeng Mao
Gege Qi
YueFeng Chen
Xiaodan Li
Ranjie Duan
Shaokai Ye
Yuan He
Hui Xue
ViT
362
224
0
17 May 2021
Waste detection in Pomerania: non-profit project for detecting waste in environment
Waste Management (Waste Manag.), 2021
Sylwia Majchrowska
Agnieszka Mikołajczyk
M. Ferlin
Zuzanna Klawikowska
Marta A. Plantykow
Arkadiusz Kwasigroch
K. Majek
247
158
0
12 May 2021
Homogeneous vector bundles and $G$-equivariant convolutional neural networks
Sampling Theory, Signal Processing, and Data Analysis (SAMPTA), 2021
J. Aronsson
183
26
0
12 May 2021
Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet
Luke Melas-Kyriazi
ViT
115
115
0
06 May 2021
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Meng-Hao Guo
Zheng-Ning Liu
Tai-Jiang Mu
Shimin Hu
165
616
0
05 May 2021
Twins: Revisiting the Design of Spatial Attention in Vision Transformers
Neural Information Processing Systems (NeurIPS), 2021
Xiangxiang Chu
Zhi Tian
Yuqing Wang
Bo Zhang
Haibing Ren
Xiaolin K. Wei
Huaxia Xia
Chunhua Shen
ViT
509
1,199
0
28 Apr 2021
Vision Transformers with Patch Diversification
Chengyue Gong
Dilin Wang
Meng Li
Vikas Chandra
Qiang Liu
ViT
209
67
0
26 Apr 2021
Visformer: The Vision-friendly Transformer
IEEE International Conference on Computer Vision (ICCV), 2021
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
459
266
0
26 Apr 2021
VidTr: Video Transformer Without Convolutions
IEEE International Conference on Computer Vision (ICCV), 2021
Yanyi Zhang
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Biagio Brattoli
Hao Chen
I. Marsic
Joseph Tighe
ViT
369
215
0
23 Apr 2021
All Tokens Matter: Token Labeling for Training Better Vision Transformers
Neural Information Processing Systems (NeurIPS), 2021
Zihang Jiang
Qibin Hou
Li-xin Yuan
Daquan Zhou
Yujun Shi
Xiaojie Jin
Anran Wang
Jiashi Feng
ViT
325
236
0
22 Apr 2021
Escaping the Big Data Paradigm with Compact Transformers
Ali Hassani
Steven Walton
Nikhil Shah
Abulikemu Abuduweili
Jiachen Li
Humphrey Shi
473
533
0
12 Apr 2021
HindSight: A Graph-Based Vision Model Architecture For Representing Part-Whole Hierarchies
Muhammad AbdurRafae
OCL
46
1
0
08 Apr 2021
LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
IEEE International Conference on Computer Vision (ICCV), 2021
Ben Graham
Alaaeldin El-Nouby
Hugo Touvron
Pierre Stock
Armand Joulin
Edouard Grave
Matthijs Douze
ViT
282
932
0
02 Apr 2021