CvT: Introducing Convolutions to Vision Transformers

IEEE International Conference on Computer Vision (ICCV), 2021
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
    ViT

Papers citing "CvT: Introducing Convolutions to Vision Transformers"

50 / 860 papers shown
ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding
Lunhao Duan
Shanshan Zhao
Nan Xue
Biwei Huang
Gui-Song Xia
Dacheng Tao
ViT
391
32
0
18 Dec 2023
Agent Attention: On the Integration of Softmax and Linear Attention
European Conference on Computer Vision (ECCV), 2023
Dongchen Han
Tianzhu Ye
Yizeng Han
Zhuofan Xia
Siyuan Pan
Pengfei Wan
Shiji Song
Gao Huang
381
193
0
14 Dec 2023
Transformer-based Selective Super-Resolution for Efficient Image Refinement
Tianyi Zhang
Kishore Kasichainula
Yaoxin Zhuo
Baoxin Li
Jae-sun Seo
Yu Cao
179
16
0
10 Dec 2023
Graph Convolutions Enrich the Self-Attention in Transformers!
Jeongwhan Choi
Hyowon Wi
Jayoung Kim
Yehjin Shin
Kookjin Lee
Nathaniel Trask
Noseong Park
401
12
0
07 Dec 2023
Class-Discriminative Attention Maps for Vision Transformers
L. Brocki
Jakub Binda
N. C. Chung
MedIm
353
7
0
04 Dec 2023
MobileUtr: Revisiting the relationship between light-weight CNN and Transformer for efficient medical image segmentation
Fenghe Tang
Bingkun Nian
Jianrui Ding
Quan Quan
Jie Yang
Wei Liu
S.Kevin Zhou
ViT, MedIm
233
6
0
04 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
807
1
0
01 Dec 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2023
Dai Shi
ViT
296
261
0
28 Nov 2023
Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking
Jiawei Ge
Xiangmei Chen
Jiuxin Cao
Xueling Zhu
Bo Liu
VLM
374
11
0
28 Nov 2023
Advancing Vision Transformers with Group-Mix Attention
Chongjian Ge
Xiaohan Ding
Zhan Tong
Lichao Sun
Jiangliu Wang
Yibing Song
Ping Luo
328
30
0
26 Nov 2023
Pursing the Sparse Limitation of Spiking Deep Learning Structures
Hao-Ran Cheng
Jiahang Cao
Erjia Xiao
Mengshu Sun
Le Yang
Jize Zhang
Xue Lin
B. Kailkhura
Kaidi Xu
Renjing Xu
206
1
0
18 Nov 2023
Vision Big Bird: Random Sparsification for Full Attention
Zhemin Zhang
Xun Gong
ViT
163
1
0
10 Nov 2023
Mini but Mighty: Finetuning ViTs with Mini Adapters
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Imad Eddine Marouf
Enzo Tartaglione
Stéphane Lathuilière
177
11
0
07 Nov 2023
GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Xuwei Xu
Sen Wang
Yudong Chen
Yanping Zheng
Zhewei Wei
Jiajun Liu
ViT
307
20
0
06 Nov 2023
Dense Video Captioning: A Survey of Techniques, Datasets and Evaluation Protocols
ACM Computing Surveys (ACM Comput. Surv.), 2023
Iqra Qasim
Alexander Horsch
Dilip K. Prasad
254
14
0
05 Nov 2023
Scattering Vision Transformer: Spectral Mixing Matters
Neural Information Processing Systems (NeurIPS), 2023
Badri N. Patro
Vijay Srinivas Agneeswaran
416
27
0
02 Nov 2023
Distilling Knowledge from CNN-Transformer Models for Enhanced Human Action Recognition
International Conference on Computer and Knowledge Engineering (ICCKE), 2023
Hamid Ahmadabadi
Omid Nejati Manzari
Ahmad Ayatollahi
126
9
0
02 Nov 2023
Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Srijan Das
Tanmay Jain
Dominick Reilly
P. Balaji
Soumyajit Karmakar
Shyam Marjit
Xiang Li
Abhijit Das
Michael S. Ryoo
308
24
0
31 Oct 2023
MIST: Medical Image Segmentation Transformer with Convolutional Attention Mixing (CAM) Decoder
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Md Motiur Rahman
Shiva Shokouhmand
Smriti Bhatt
M. Faezipour
MedIm
293
30
0
30 Oct 2023
ViR: Towards Efficient Vision Retention Backbones
Ali Hatamizadeh
Michael Ranzinger
Shiyi Lan
Jose M. Alvarez
Sanja Fidler
Jan Kautz
GNN
175
3
0
30 Oct 2023
AViTMP: A Tracking-Specific Transformer for Single-Branch Visual Tracking
IEEE Transactions on Intelligent Vehicles (TIV), 2023
Ju Huang
Kai Wang
Joost van de Weijer
Jianlin Zhang
Yongmei Huang
372
1
0
30 Oct 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
551
93
0
30 Oct 2023
Exploring Shape Embedding for Cloth-Changing Person Re-Identification via 2D-3D Correspondences
ACM Multimedia (ACM MM), 2023
Yubin Wang
Huimin Yu
Yuming Yan
Shuyi Song
Biyang Liu
Yichong Lu
3DPC
243
13
0
27 Oct 2023
Generalizing to Unseen Domains in Diabetic Retinopathy Classification
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Chamuditha Jayanga Galappaththige
Gayal Kuruppu
Muhammad Haris Khan
OOD
327
12
0
26 Oct 2023
Bridging The Gaps Between Token Pruning and Full Pre-training via Masked Fine-tuning
Fengyuan Shi
Limin Wang
ViT
165
0
0
26 Oct 2023
Toward Flare-Free Images: A Survey
Yousef Kotp
Marwan Torki
262
5
0
22 Oct 2023
Multimodal Transformer Using Cross-Channel attention for Object Detection in Remote Sensing Images
International Conference on Information Photonics (ICIP), 2023
Bissmella Bahaduri
Zuheng Ming
Fangchen Feng
Anissa Mokraoui
261
8
0
21 Oct 2023
LeTFuser: Light-weight End-to-end Transformer-Based Sensor Fusion for Autonomous Driving with Multi-Task Learning
Pedram Agand
Mohammad Mahdavian
Manolis Savva
Mo Chen
ViT
332
4
0
19 Oct 2023
Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers
Yuanduo Hong
Jue Wang
Weichao Sun
Huihui Pan
VLM, ViT
250
7
0
19 Oct 2023
Camera-LiDAR Fusion with Latent Contact for Place Recognition in Challenging Cross-Scenes
Yan Pan
Jiapeng Xie
Jiajie Wu
Bo Zhou
263
1
0
16 Oct 2023
Accelerating Vision Transformers Based on Heterogeneous Attention Patterns
Deli Yu
Teng Xi
Jianwei Li
Baopu Li
Gang Zhang
Haocheng Feng
Junyu Han
Jingtuo Liu
Errui Ding
Jingdong Wang
ViT
269
2
0
11 Oct 2023
Distance Weighted Trans Network for Image Completion
Pourya Shamsolmoali
Masoumeh Zareapoor
Huiyu Zhou
Xuelong Li
Yue Lu
ViT
206
0
0
11 Oct 2023
Distilling Efficient Vision Transformers from CNNs for Semantic Segmentation
Pattern Recognition (Pattern Recogn.), 2023
Xueye Zheng
Yunhao Luo
Pengyuan Zhou
Lin Wang
220
31
0
11 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
IEEE Transactions on Cybernetics (IEEE Trans. Cybern.), 2023
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
381
8
0
10 Oct 2023
No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling
Applied Informatics (AI), 2023
Xuwei Xu
Changlin Li
Yudong Chen
Xiaojun Chang
Jiajun Liu
Sen Wang
ViT
229
10
0
09 Oct 2023
Plug n' Play: Channel Shuffle Module for Enhancing Tiny Vision Transformers
International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2023
Xuwei Xu
Sen Wang
Yudong Chen
Jiajun Liu
ViT
244
1
0
09 Oct 2023
Enhancing Representations through Heterogeneous Self-Supervised Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Zhongyu Li
Bo-Wen Yin
Yongxiang Liu
Tianpeng Liu
Ming-Ming Cheng
SSL
362
3
0
08 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
471
12
0
08 Oct 2023
TiC: Exploring Vision Transformer in Convolution
Song Zhang
Qingzhong Wang
Jiang Bian
Haoyi Xiong
ViT
187
1
0
06 Oct 2023
ClusVPR: Efficient Visual Place Recognition with Clustering-based Weighted Transformer
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2023
Yifan Xu
Pourya Shamsolmoali
Jie Yang
ViT
257
2
0
06 Oct 2023
TransRadar: Adaptive-Directional Transformer for Real-Time Multi-View Radar Semantic Segmentation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Yahia Dalbah
Jean Lahoud
Hisham Cholakkal
247
14
0
03 Oct 2023
Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion
International Conference on Learning Representations (ICLR), 2023
Alexandru Meterez
Amir Joudaki
Francesco Orabona
Alexander Immer
Gunnar Rätsch
Hadi Daneshmand
206
9
0
03 Oct 2023
Understanding Masked Autoencoders From a Local Contrastive Perspective
Xiaoyu Yue
Mengwei He
Meng Wei
Jiangmiao Pang
Xihui Liu
Luping Zhou
Wanli Ouyang
SSL
320
10
0
03 Oct 2023
PPT: Token Pruning and Pooling for Efficient Vision Transformers
Xinjian Wu
Fanhu Zeng
Xiudong Wang
Xinghao Chen
ViT
278
34
0
03 Oct 2023
SeisT: A foundational deep learning model for earthquake monitoring tasks
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023
Sen Li
Xu Yang
Anye Cao
Changbin Wang
Yaoqi Liu
Yapeng Liu
Qiang Niu
218
9
0
02 Oct 2023
Win-Win: Training High-Resolution Vision Transformers from Two Windows
International Conference on Learning Representations (ICLR), 2023
Vincent Leroy
Jérôme Revaud
Thomas Lucas
Philippe Weinzaepfel
ViT
274
6
0
01 Oct 2023
RBFormer: Improve Adversarial Robustness of Transformer by Robust Bias
British Machine Vision Conference (BMVC), 2023
Hao Cheng
Jinhao Duan
Hui Li
Lyutianyang Zhang
Jiahang Cao
Ping Wang
Jize Zhang
Kaidi Xu
Renjing Xu
AAML
193
4
0
23 Sep 2023
Investigating Efficient Deep Learning Architectures For Side-Channel Attacks on AES
Yohai-Eliel Berreby
L. Sauvage
AAML
128
3
0
22 Sep 2023
CINFormer: Transformer network with multi-stage CNN feature injection for surface defect segmentation
Xiaoheng Jiang
Kaiyi Guo
Yang Lu
Feng Yan
Hao Liu
Jiale Cao
Mingliang Xu
Dacheng Tao
MedIm, ViT, UQCV
173
2
0
22 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2023
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
600
172
0
20 Sep 2023