CvT: Introducing Convolutions to Vision Transformers


IEEE International Conference on Computer Vision (ICCV), 2021
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Xiyang Dai
Lu Yuan
Lei Zhang
    ViT

Papers citing "CvT: Introducing Convolutions to Vision Transformers"

50 / 857 papers shown
Title
Self-Paced Learning for Images of Antinuclear Antibodies
IEEE Transactions on Medical Imaging (IEEE TMI), 2025
Yiyang Jiang
Guangwu Qian
Jiaxin Wu
Qi Huang
Qing Li
Yongkang Wu
Xiao-Yong Wei
52
0
0
26 Nov 2025
Rethinking Vision Transformer Depth via Structural Reparameterization
Chengwei Zhou
Vipin Chaudhary
Gourav Datta
ViT
52
0
0
24 Nov 2025
EVCC: Enhanced Vision Transformer-ConvNeXt-CoAtNet Fusion for Classification
Kazi Reyazul Hasan
M. Rahman
Wasif Jalal
Sadif Ahmed
Shahriar Raj
Mubasshira Musarrat
Muhammad Abdullah Adnan
ViT
36
0
0
24 Nov 2025
Exploring Weak-to-Strong Generalization for CLIP-based Classification
Jinhao Li
Sarah Monazam Erfani
Lei Feng
James Bailey
Feng Liu
VLM
168
0
0
23 Nov 2025
CoMA: Complementary Masking and Hierarchical Dynamic Multi-Window Self-Attention in a Unified Pre-training Framework
Jiaxuan Li
Qing Xu
Xiangjian He
Ziyu Liu
Chang Xing
Zhen Chen
Daokun Zhang
Rong Qu
Chang Wen Chen
60
0
0
08 Nov 2025
GroupKAN: Rethinking Nonlinearity with Grouped Spline-based KAN Modeling for Efficient Medical Image Segmentation
Guojie Li
Anwar P.P. Abdul Majeed
Muhammad Ateeq
Anh Nguyen
Fan Zhang
MedIm
48
0
0
07 Nov 2025
A Hybrid Deep Learning Model for Robust Biometric Authentication from Low-Frame-Rate PPG Signals
Arfina Rahman
Mahesh K. Banavar
165
0
0
06 Nov 2025
UniSOT: A Unified Framework for Multi-Modality Single Object Tracking
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Yinchao Ma
Yuyang Tang
Wenfei Yang
Tianzhu Zhang
Xu Zhou
Feng Wu
144
0
0
03 Nov 2025
Alias-Free ViT: Fractional Shift Invariance via Linear Attention
H. Michaeli
Daniel Soudry
84
0
0
26 Oct 2025
3rd Place Solution to Large-scale Fine-grained Food Recognition
Yang Zhong
Yifan Yao
Tong Luo
Y. Zhang
Yaqian Li
CVBM, 3DPC
136
0
0
24 Oct 2025
Attentive Convolution: Unifying the Expressivity of Self-Attention with Convolutional Efficiency
Hao Yu
H. G. Chen
Yan Jiang
Wei Peng
Zhaodong Sun
Samuel Kaski
Guoying Zhao
81
0
0
23 Oct 2025
Dual-attention ResNet outperforms transformers in HER2 prediction on DCE-MRI
Naomi Fridman
Anat Goldstein
MedIm
44
0
0
14 Oct 2025
Automated Neural Architecture Design for Industrial Defect Detection
Y. Liu
Yunfeng Ma
Yi Tang
Min Liu
Shuai Jiang
Yaonan Wang
80
0
0
08 Oct 2025
On knot detection via picture recognition
Anne Dranowski
Yura Kabkov
Daniel Tubbenhauer
44
0
0
06 Oct 2025
A Mathematical Explanation of Transformers for Large Language Models and GPTs
X. Tai
Hao Liu
Lingfeng Li
Raymond H. F. Chan
AI4CE
98
1
0
05 Oct 2025
Allocation of Parameters in Transformers
Ruoxi Yu
Haotian Jiang
Jingpu Cheng
Penghao Yu
Qianxiao Li
Zhong Li
MoE
118
0
0
04 Oct 2025
AttentionViG: Cross-Attention-Based Dynamic Neighbor Aggregation in Vision GNNs
Hakan Emre Gedik
Andrew Martin
Mustafa Munir
Oguzhan Baser
R. Marculescu
Sandeep Chinchali
Alan C. Bovik
ViT
73
0
0
29 Sep 2025
FastViDAR: Real-Time Omnidirectional Depth Estimation via Alternative Hierarchical Attention
Hangtian Zhao
Xiang Chen
Yizhe Li
Qianhao Wang
Haibo Lu
Fei Gao
MDE
86
0
0
28 Sep 2025
Random Direct Preference Optimization for Radiography Report Generation
Valentin Samokhin
B. Shirokikh
M. Goncharov
Dmitriy Umerenkov
Maksim Bobrin
Ivan Oseledets
Dmitry V. Dylov
Mikhail Belyaev
60
0
0
19 Sep 2025
Single Domain Generalization in Diabetic Retinopathy: A Neuro-Symbolic Learning Approach
Midhat Urooj
Ayan Banerjee
Farhat Shaikh
Kuntal Thakur
S. Gupta
MedIm
72
0
0
03 Sep 2025
Enhancing compact convolutional transformers with super attention
Simpenzwe Honore Leandre
Natenaile Asmamaw Shiferaw
Dillip Rout
ViT, VLM
64
0
0
26 Aug 2025
Lightweight Backbone Networks Only Require Adaptive Lightweight Self-Attention Mechanisms
Fengyun Li
Chao Zheng
Yangyang Fang
Jialiang Lan
Jianhua Liang
Luhao Zhang
Fa Si
124
0
0
02 Aug 2025
Foundation Models for Bioacoustics -- a Comparative Review
Raphael Schwinger
Paria Vali Zadeh
Lukas Rauch
Mats Kurz
Tom Hauschild
Sam Lapp
Sven Tomforde
VLM
97
1
0
02 Aug 2025
JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
Renhang Liu
Chia-Yu Hung
Navonil Majumder
Taylor Gautreaux
Amir Ali Bagherzadeh
Chuan Li
Dorien Herremans
Soujanya Poria
DiffM
131
3
0
28 Jul 2025
Towards Universal Modal Tracking with Online Dense Temporal Token Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Yaozong Zheng
Bineng Zhong
Qihua Liang
Shengping Zhang
Guorong Li
Xianxian Li
Rongrong Ji
133
18
0
27 Jul 2025
Foundation Models and Transformers for Anomaly Detection: A Survey
Information Fusion (Inf. Fusion), 2025
Mouin Ben Ammar
Arturo Mendoza
Nacim Belkhir
Antoine Manzanera
Gianni Franchi
136
4
0
21 Jul 2025
Frequency-Dynamic Attention Modulation for Dense Prediction
Linwei Chen
Lin Gu
Ying Fu
454
0
0
16 Jul 2025
EEG Foundation Models: A Critical Review of Current Progress and Future Directions
Gayal Kuruppu
Neeraj Wagh
Y. Varatharajah
209
0
0
15 Jul 2025
DuoFormer: Leveraging Hierarchical Representations by Local and Global Attention Vision Transformer
Xiaoya Tang
Bodong Zhang
M. M. Ho
Beatrice Knudsen
Tolga Tasdizen
ViT, MedIm
117
0
0
15 Jun 2025
Delayformer: spatiotemporal transformation for predicting high-dimensional dynamics
Zijian Wang
Peng Tao
Luonan Chen
AI4TS, AI4CE
114
1
0
13 Jun 2025
Enhancing Deepfake Detection using SE Block Attention with CNN
Subhram Dasgupta
Janelle Mason
Xiaohong Yuan
Olusola Odeyomi
Kaushik Roy
231
0
0
12 Jun 2025
Foundation Models in Medical Imaging: A Review and Outlook
Vivien van Veldhuizen
Vanessa Botha
C. Lu
Melis Erdal Cesur
Kevin Groot Lipman
...
Cees Snoek
Lodewyk Wessels
Ritse Mann
Eric Marcus
Jonas Teuwen
MedIm, VLM, AI4CE
304
2
0
10 Jun 2025
Can Vision Transformers with ResNet's Global Features Fairly Authenticate Demographic Faces?
International Conference on Pattern Recognition (ICPR), 2025
Abu Sufian
Marco Leo
Cosimo Distante
Anirudha Ghosh
Debaditya Barman
ViT
136
1
0
03 Jun 2025
S2AFormer: Strip Self-Attention for Efficient Vision Transformer
Guoan Xu
Wenfeng Huang
Wenjing Jia
Jiamao Li
Guangwei Gao
Guo-Jun Qi
187
0
0
28 May 2025
Vision Transformers with Self-Distilled Registers
Yinjie Chen
Zipeng Yan
Chong Zhou
Bo Dai
Andrew F. Luo
342
4
0
27 May 2025
Structured Initialization for Vision Transformers
Jianqiao Zheng
Xueqian Li
Hemanth Saratchandran
Simon Lucey
ViT
137
0
0
26 May 2025
PiT: Progressive Diffusion Transformer
Jiafu Wu
Yabiao Wang
Jian Li
Jinlong Peng
Yun Cao
Chengjie Wang
Jiangning Zhang
480
0
0
19 May 2025
A 2D Semantic-Aware Position Encoding for Vision Transformers
Xi Chen
Shiyang Zhou
Muqi Huang
Jiaxu Feng
Yun Xiong
...
Yujiao Shi
Huishuai Bao
Sijia Peng
Chong Li
Feng Shi
ViT
221
1
0
14 May 2025
FAD: Frequency Adaptation and Diversion for Cross-domain Few-shot Learning
Ruixiao Shi
Fu Feng
Yucheng Xie
Jing Wang
Xin Geng
225
1
0
13 May 2025
Hyb-KAN ViT: Hybrid Kolmogorov-Arnold Networks Augmented Vision Transformer
Sainath Dey
Mitul Goswami
Jashika Sethi
Prasant Kumar Pattnaik
ViT
215
0
0
07 May 2025
SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer
Young-Hu Park
R.-H. Park
Hyung-Min Park
268
3
0
07 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
Wenyuan Xu
Shibiao Xu
ViT
1.0K
2
0
06 May 2025
Variational diffusion transformers for conditional sampling of supernovae spectra
Yunyi Shen
Alexander T. Gagliano
DiffM
107
2
0
05 May 2025
AI Assisted Cervical Cancer Screening for Cytology Samples in Developing Countries
Love Panta
Suraj Prasai
Karishma Malla Vaidya
Shyam Shrestha
Suresh Manandhar
269
0
0
29 Apr 2025
Group Downsampling with Equivariant Anti-aliasing
International Conference on Learning Representations (ICLR), 2025
Md Ashiqur Rahman
Raymond A. Yeh
250
4
0
24 Apr 2025
ECViT: Efficient Convolutional Vision Transformer with Local-Attention and Multi-scale Stages
Zhoujie Qian
ViT
235
1
0
21 Apr 2025
Fighting Fires from Space: Leveraging Vision Transformers for Enhanced Wildfire Detection and Characterization
Aman Agarwal
James Gearon
Raksha Rank
Etienne Chenevert
140
0
0
18 Apr 2025
HMPE:HeatMap Embedding for Efficient Transformer-Based Small Object Detection
YangChen Zeng
ViT
194
1
0
18 Apr 2025
Graph Network for Sign Language Tasks
Shiwei Gan
Yafeng Yin
Zhiwei Jiang
Hongkai Wen
Lei Xie
Sanglu Lu
SLR
377
0
0
16 Apr 2025
GFT: Gradient Focal Transformer
Boris Kriuk
Simranjit Kaur Gill
Shoaib Aslam
Amir Fakhrutdinov
168
0
0
14 Apr 2025