CvT: Introducing Convolutions to Vision Transformers

IEEE International Conference on Computer Vision (ICCV), 2021
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Xiyang Dai
Lu Yuan
Lei Zhang
    ViT
ArXiv (abs) · PDF · HTML · HuggingFace (1 upvote) · GitHub (227★)

Papers citing "CvT: Introducing Convolutions to Vision Transformers"

50 / 857 papers shown
IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers
Bowen Pan
Yikang Shen
Lezhi Li
Zinan Lin
Rogerio Feris
A. Oliva
VLM, ViT
271
191
0
23 Jun 2021
Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition
Qibin Hou
Zihang Jiang
Li-xin Yuan
Ming-Ming Cheng
Shuicheng Yan
Jiashi Feng
ViT, MLLM
248
230
0
23 Jun 2021
P2T: Pyramid Pooling Transformer for Scene Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Yu-Huan Wu
Yun-Hai Liu
Xin Zhan
Ming-Ming Cheng
ViT
458
281
0
22 Jun 2021
Encoder-Decoder Architectures for Clinically Relevant Coronary Artery Segmentation
International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), 2021
João Lourenço Silva
M. Menezes
T. Rodrigues
B. Silva
F. Pinto
Arlindo L. Oliveira
MedIm
179
22
0
21 Jun 2021
More than Encoder: Introducing Transformer Decoder to Upsample
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2021
Yijiang Li
Wentian Cai
Ying Gao
Chengming Li
Xiping Hu
ViT, MedIm
165
71
0
20 Jun 2021
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
315
759
0
18 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
International Conference on Learning Representations (ICLR), 2021
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
223
221
0
17 Jun 2021
S$^2$-MLP: Spatial-Shift MLP Architecture for Vision
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
216
214
0
14 Jun 2021
Styleformer: Transformer based Generative Adversarial Networks with Style Vector
Computer Vision and Pattern Recognition (CVPR), 2021
Jeeseung Park
Younggeun Kim
ViT
256
58
0
13 Jun 2021
MlTr: Multi-label Classification with Transformer
IEEE International Conference on Multimedia and Expo (ICME), 2021
Xingyi Cheng
Hezheng Lin
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Nian Shi
Honglin Liu
ViT
119
53
0
11 Jun 2021
Transformed CNNs: recasting pre-trained convolutional layers with self-attention
Stéphane d'Ascoli
Levent Sagun
Giulio Biroli
Ari S. Morcos
ViT
70
7
0
10 Jun 2021
CAT: Cross Attention in Vision Transformer
IEEE International Conference on Multimedia and Expo (ICME), 2021
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
148
250
0
10 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Neural Information Processing Systems (NeurIPS), 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
382
1,433
0
09 Jun 2021
TED-net: Convolution-free T2T Vision Transformer-based Encoder-decoder Dilation network for Low-dose CT Denoising
Dayang Wang
Zhan Wu
Hengyong Yu
ViT, MedIm
175
64
0
08 Jun 2021
On the Connection between Local Attention and Dynamic Depth-wise Convolution
International Conference on Learning Representations (ICLR), 2021
Qi Han
Zejia Fan
Jingdong Sun
Lei-huan Sun
Ming-Ming Cheng
Jiaying Liu
Jingdong Wang
ViT
272
131
0
08 Jun 2021
On Improving Adversarial Transferability of Vision Transformers
International Conference on Learning Representations (ICLR), 2021
Muzammal Naseer
Kanchana Ranasinghe
Salman Khan
Fahad Shahbaz Khan
Fatih Porikli
ViT
206
106
0
08 Jun 2021
Fully Transformer Networks for Semantic Image Segmentation
Sitong Wu
Tianyi Wu
Fangjian Lin
Sheng Tian
Guodong Guo
ViT
224
47
0
08 Jun 2021
Efficient Training of Visual Transformers with Small Datasets
Neural Information Processing Systems (NeurIPS), 2021
Yahui Liu
E. Sangineto
Wei Bi
Andrii Zadaianchuk
Bruno Lepri
Marco De Nadai
ViT
148
211
0
07 Jun 2021
Reveal of Vision Transformers Robustness against Adversarial Attacks
Ahmed Aldahdooh
W. Hamidouche
Olivier Déforges
ViT
177
67
0
07 Jun 2021
Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer
Zilong Huang
Youcheng Ben
Guozhong Luo
Pei Cheng
Gang Yu
Bin-Bin Fu
ViT
221
207
0
07 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Neural Information Processing Systems (NeurIPS), 2021
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
329
386
0
07 Jun 2021
Vision Transformers with Hierarchical Attention
Machine Intelligence Research (MIR), 2021
Yun-Hai Liu
Yu-Huan Wu
Guolei Sun
Le Zhang
Ajad Chhatkuli
Luc Van Gool
ViT
159
65
0
06 Jun 2021
CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings
Neural Information Processing Systems (NeurIPS), 2021
Tatiana Likhomanenko
Qiantong Xu
Gabriel Synnaeve
R. Collobert
A. Rogozhnikov
OOD, ViT
266
68
0
06 Jun 2021
Uformer: A General U-Shaped Transformer for Image Restoration
Computer Vision and Pattern Recognition (CVPR), 2021
Zhendong Wang
Xiaodong Cun
Jianmin Bao
Wengang Zhou
Jianzhuang Liu
Houqiang Li
ViT
447
1,814
0
06 Jun 2021
RegionViT: Regional-to-Local Attention for Vision Transformers
International Conference on Learning Representations (ICLR), 2021
Chun-Fu Chen
Yikang Shen
Quanfu Fan
ViT
366
224
0
04 Jun 2021
Glance-and-Gaze Vision Transformer
Neural Information Processing Systems (NeurIPS), 2021
Qihang Yu
Yingda Xia
Yutong Bai
Yongyi Lu
Alan Yuille
Wei Shen
ViT
148
82
0
04 Jun 2021
X-volution: On the unification of convolution and self-attention
Xuanhong Chen
Hang Wang
Bingbing Ni
ViT
100
27
0
04 Jun 2021
Attention mechanisms and deep learning for machine vision: A survey of the state of the art
A. M. Hafiz
S. A. Parah
R. A. Bhat
211
51
0
03 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Neural Information Processing Systems (NeurIPS), 2021
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
1.1K
6,783
0
31 May 2021
MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens
Computer Vision and Pattern Recognition (CVPR), 2021
Jiemin Fang
Lingxi Xie
Xinggang Wang
Xiaopeng Zhang
Wenyu Liu
Qi Tian
ViT
178
84
0
31 May 2021
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Neural Information Processing Systems (NeurIPS), 2021
Jiangning Zhang
Chao Xu
Jian Li
Wenzhou Chen
Yabiao Wang
Ying Tai
Shuo Chen
Chengjie Wang
Feiyue Huang
Yong Liu
249
25
0
31 May 2021
Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition
Neural Information Processing Systems (NeurIPS), 2021
Yulin Wang
Rui Huang
Qing Xiao
Zeyi Huang
Gao Huang
ViT
242
230
0
31 May 2021
Dual-stream Network for Visual Recognition
Neural Information Processing Systems (NeurIPS), 2021
Mingyuan Mao
Renrui Zhang
Honghui Zheng
Shiyang Feng
Teli Ma
Yan Peng
Errui Ding
Baochang Zhang
Shumin Han
ViT
187
77
0
31 May 2021
Less is More: Pay Less Attention in Vision Transformers
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zizheng Pan
Bohan Zhuang
Haoyu He
Jing Liu
Jianfei Cai
ViT
277
101
0
29 May 2021
KVT: k-NN Attention for Boosting Vision Transformers
European Conference on Computer Vision (ECCV), 2021
Pichao Wang
Qingsong Wen
F. Wang
Ming Lin
Shuning Chang
Hao Li
Rong Jin
ViT
225
128
0
28 May 2021
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zizhao Zhang
Han Zhang
Long Zhao
Ting Chen
Sercan O. Arik
Tomas Pfister
ViT
257
201
0
26 May 2021
Pay Attention to MLPs
Neural Information Processing Systems (NeurIPS), 2021
Hanxiao Liu
Zihang Dai
David R. So
Quoc V. Le
AI4CE
484
781
0
17 May 2021
Towards Robust Vision Transformer
Computer Vision and Pattern Recognition (CVPR), 2021
Xiaofeng Mao
Gege Qi
YueFeng Chen
Xiaodan Li
Ranjie Duan
Shaokai Ye
Yuan He
Hui Xue
ViT
362
224
0
17 May 2021
Waste detection in Pomerania: non-profit project for detecting waste in environment
Waste Management (Waste Manag.), 2021
Sylwia Majchrowska
Agnieszka Mikołajczyk
M. Ferlin
Zuzanna Klawikowska
Marta A. Plantykow
Arkadiusz Kwasigroch
K. Majek
247
158
0
12 May 2021
Homogeneous vector bundles and $G$-equivariant convolutional neural networks
Sampling Theory, Signal Processing, and Data Analysis (SAMPTA), 2021
J. Aronsson
183
26
0
12 May 2021
Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet
Luke Melas-Kyriazi
ViT
115
115
0
06 May 2021
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Meng-Hao Guo
Zheng-Ning Liu
Tai-Jiang Mu
Shimin Hu
165
616
0
05 May 2021
Twins: Revisiting the Design of Spatial Attention in Vision Transformers
Neural Information Processing Systems (NeurIPS), 2021
Xiangxiang Chu
Zhi Tian
Yuqing Wang
Bo Zhang
Haibing Ren
Xiaolin K. Wei
Huaxia Xia
Chunhua Shen
ViT
509
1,199
0
28 Apr 2021
Vision Transformers with Patch Diversification
Chengyue Gong
Dilin Wang
Meng Li
Vikas Chandra
Qiang Liu
ViT
209
67
0
26 Apr 2021
Visformer: The Vision-friendly Transformer
IEEE International Conference on Computer Vision (ICCV), 2021
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
459
266
0
26 Apr 2021
VidTr: Video Transformer Without Convolutions
IEEE International Conference on Computer Vision (ICCV), 2021
Yanyi Zhang
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Biagio Brattoli
Hao Chen
I. Marsic
Joseph Tighe
ViT
369
215
0
23 Apr 2021
All Tokens Matter: Token Labeling for Training Better Vision Transformers
Neural Information Processing Systems (NeurIPS), 2021
Zihang Jiang
Qibin Hou
Li-xin Yuan
Daquan Zhou
Yujun Shi
Xiaojie Jin
Anran Wang
Jiashi Feng
ViT
325
236
0
22 Apr 2021
Escaping the Big Data Paradigm with Compact Transformers
Ali Hassani
Steven Walton
Nikhil Shah
Abulikemu Abuduweili
Jiachen Li
Humphrey Shi
473
533
0
12 Apr 2021
HindSight: A Graph-Based Vision Model Architecture For Representing Part-Whole Hierarchies
Muhammad AbdurRafae
OCL
46
1
0
08 Apr 2021
LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
IEEE International Conference on Computer Vision (ICCV), 2021
Ben Graham
Alaaeldin El-Nouby
Hugo Touvron
Pierre Stock
Armand Joulin
Edouard Grave
Matthijs Douze
ViT
282
932
0
02 Apr 2021