ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2103.15808
  4. Cited By
CvT: Introducing Convolutions to Vision Transformers

CvT: Introducing Convolutions to Vision Transformers

29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
    ViT
ArXivPDFHTML

Papers citing "CvT: Introducing Convolutions to Vision Transformers"

50 / 249 papers shown
Title
SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer
SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer
Young-Hu Park
R.-H. Park
Hyung-Min Park
49
0
0
07 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
W. Xu
Shibiao Xu
ViT
107
0
0
06 May 2025
AI Assisted Cervical Cancer Screening for Cytology Samples in Developing Countries
AI Assisted Cervical Cancer Screening for Cytology Samples in Developing Countries
Love Panta
Suraj Prasai
Karishma Malla Vaidya
Shyam Shrestha
Suresh Manandhar
44
0
0
29 Apr 2025
HMPE:HeatMap Embedding for Efficient Transformer-Based Small Object Detection
HMPE:HeatMap Embedding for Efficient Transformer-Based Small Object Detection
YangChen Zeng
ViT
31
0
0
18 Apr 2025
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
65
0
0
03 Apr 2025
VesselSAM: Leveraging SAM for Aortic Vessel Segmentation with LoRA and Atrous Attention
VesselSAM: Leveraging SAM for Aortic Vessel Segmentation with LoRA and Atrous Attention
Adnan Iltaf
Rayan Merghani Ahmed
Bin Li
Bin Li
Shoujun Zhou
50
0
0
25 Feb 2025
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Weikang Meng
Yadan Luo
Xin Li
D. Jiang
Zheng Zhang
112
0
0
25 Jan 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Jinwei Gu
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
114
0
0
21 Jan 2025
Image Classification with Deep Reinforcement Active Learning
Image Classification with Deep Reinforcement Active Learning
Mingyuan Jiu
Xuguang Song
H. Sahbi
Shupan Li
Yan Chen
Wei Guo
Lihua Guo
Mingliang Xu
VLM
24
0
0
31 Dec 2024
Breaking the Low-Rank Dilemma of Linear Attention
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
33
0
0
12 Nov 2024
Rethinking Transformer for Long Contextual Histopathology Whole Slide
  Image Analysis
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Pingyi Chen
Zhongyi Shui
Chenglu Zhu
Lin Yang
MedIm
32
4
0
18 Oct 2024
Locality Alignment Improves Vision-Language Models
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Y. Zou
Tatsunori Hashimoto
VLM
64
3
0
14 Oct 2024
Beyond Skip Connection: Pooling and Unpooling Design for Elimination
  Singularities
Beyond Skip Connection: Pooling and Unpooling Design for Elimination Singularities
Chengkun Sun
Jinqian Pan
Juoli Jin
Russell Stevens Terry
Jiang Bian
Jie Xu
20
0
0
20 Sep 2024
Advancing Depth Anything Model for Unsupervised Monocular Depth Estimation in Endoscopy
Advancing Depth Anything Model for Unsupervised Monocular Depth Estimation in Endoscopy
Bojian Li
Bo Liu
Jinghua Yue
F. Zhou
Fugen Zhou
MedIm
MDE
45
2
0
12 Sep 2024
Brain-Inspired Stepwise Patch Merging for Vision Transformers
Brain-Inspired Stepwise Patch Merging for Vision Transformers
Yonghao Yu
Dongcheng Zhao
Guobin Shen
Yiting Dong
Yi Zeng
45
0
0
11 Sep 2024
Exploring Rich Subjective Quality Information for Image Quality
  Assessment in the Wild
Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild
Xiongkuo Min
Yixuan Gao
Yuqin Cao
Guangtao Zhai
Wenjun Zhang
Huifang Sun
C. Chen
20
10
0
09 Sep 2024
MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient
  Semantic Segmentation
MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation
Beoungwoo Kang
Seunghun Moon
Yubin Cho
Hyunwoo Yu
Suk-Ju Kang
ViT
MedIm
24
8
0
14 Aug 2024
MacFormer: Semantic Segmentation with Fine Object Boundaries
MacFormer: Semantic Segmentation with Fine Object Boundaries
Guoan Xu
Wenfeng Huang
Tao Wu
Ligeng Chen
Wenjing Jia
Guangwei Gao
Xiatian Zhu
Stuart W. Perry
31
0
0
11 Aug 2024
Depth-Wise Convolutions in Vision Transformers for Efficient Training on
  Small Datasets
Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets
Tianxiao Zhang
Wenju Xu
Bo Luo
Guanghui Wang
ViT
MDE
36
7
0
28 Jul 2024
iiANET: Inception Inspired Attention Hybrid Network for efficient Long-Range Dependency
iiANET: Inception Inspired Attention Hybrid Network for efficient Long-Range Dependency
Haruna Yunusa
Qin Shiyin
Abdulrahman Hamman Adama Chukkol
Isah Bello
A. Lawan
Isah Bello
39
4
0
10 Jul 2024
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Kazi Sajeed Mehrab
M. Maruf
Arka Daw
Harish Babu Manogaran
Abhilash Neog
...
Paula Mabee
Wasila Dahdul
Anuj Karpatne
Wasila M Dahdul
Anuj Karpatne
33
4
0
10 Jul 2024
Kolmogorov-Arnold Convolutions: Design Principles and Empirical Studies
Kolmogorov-Arnold Convolutions: Design Principles and Empirical Studies
Ivan Drokin
42
19
0
01 Jul 2024
Adaptively Bypassing Vision Transformer Blocks for Efficient Visual
  Tracking
Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking
Xiangyang Yang
Dan Zeng
Xucheng Wang
You Wu
Hengzhou Ye
Qijun Zhao
Shuiwang Li
53
3
0
12 Jun 2024
ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking
ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking
Xudong Han
Nobuyuki Oishi
Yueying Tian
Elif Ucurum
R. Young
C. Chatwin
Philip Birch
30
3
0
24 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
67
41
0
23 May 2024
A separability-based approach to quantifying generalization: which layer
  is best?
A separability-based approach to quantifying generalization: which layer is best?
Luciano Dyballa
Evan Gerritz
Steven W. Zucker
OOD
24
3
0
02 May 2024
ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal
ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal
Zhuohao Li
Guoyang Xie
Guannan Jiang
Zhichao Lu
29
3
0
29 Apr 2024
PromptCIR: Blind Compressed Image Restoration with Prompt Learning
PromptCIR: Blind Compressed Image Restoration with Prompt Learning
Bingchen Li
Xin Li
Yiting Lu
Ruoyu Feng
Mengxi Guo
Shijie Zhao
Li Zhang
Zhibo Chen
30
13
0
26 Apr 2024
MathNet: A Data-Centric Approach for Printed Mathematical Expression
  Recognition
MathNet: A Data-Centric Approach for Printed Mathematical Expression Recognition
Felix M. Schmitt-Koopmann
Elaine M. Huang
Hans-Peter Hutter
Thilo Stadelmann
Alireza Darvishy
24
4
0
21 Apr 2024
Nested-TNT: Hierarchical Vision Transformers with Multi-Scale Feature
  Processing
Nested-TNT: Hierarchical Vision Transformers with Multi-Scale Feature Processing
Yuang Liu
Zhiheng Qiu
Xiaokai Qin
ViT
31
0
0
20 Apr 2024
Training Transformer Models by Wavelet Losses Improves Quantitative and
  Visual Performance in Single Image Super-Resolution
Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution
Cansu Korkmaz
A. Murat Tekalp
ViT
36
6
0
17 Apr 2024
Using Few-Shot Learning to Classify Primary Lung Cancer and Other Malignancy with Lung Metastasis in Cytological Imaging via Endobronchial Ultrasound Procedures
Using Few-Shot Learning to Classify Primary Lung Cancer and Other Malignancy with Lung Metastasis in Cytological Imaging via Endobronchial Ultrasound Procedures
Ching-Kai Lin
Di-Chun Wei
Yun-Chien Cheng
27
0
0
09 Apr 2024
IPT-V2: Efficient Image Processing Transformer using Hierarchical
  Attentions
IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions
Zhijun Tu
Kunpeng Du
Hanting Chen
Hai-lin Wang
Wei Li
Jie Hu
Yunhe Wang
ViT
31
4
0
31 Mar 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques
  and Insights
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
41
7
0
28 Mar 2024
Exploring Dynamic Transformer for Efficient Object Tracking
Exploring Dynamic Transformer for Efficient Object Tracking
Jiawen Zhu
Xin Chen
Haiwen Diao
Shuai Li
Jun-Yan He
Chenyang Li
Bin Luo
Dong Wang
Huchuan Lu
41
2
0
26 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
23
15
0
18 Mar 2024
Activating Wider Areas in Image Super-Resolution
Activating Wider Areas in Image Super-Resolution
Cheng Cheng
Hang Wang
Hongbin Sun
32
10
0
13 Mar 2024
Learning Correction Errors via Frequency-Self Attention for Blind Image
  Super-Resolution
Learning Correction Errors via Frequency-Self Attention for Blind Image Super-Resolution
Haochen Sun
Yan Yuan
Lijuan Su
Hao-Yu Shao
33
1
0
12 Mar 2024
Zero-shot generalization across architectures for visual classification
Zero-shot generalization across architectures for visual classification
Evan Gerritz
Luciano Dyballa
Steven W. Zucker
21
1
0
21 Feb 2024
FViT: A Focal Vision Transformer with Gabor Filter
FViT: A Focal Vision Transformer with Gabor Filter
Yulong Shi
Mingwei Sun
Yongshuai Wang
Rui Wang
47
4
0
17 Feb 2024
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
Chu Myaet Thwal
Minh N. H. Nguyen
Ye Lin Tun
Seongjin Kim
My T. Thai
Choong Seon Hong
49
5
0
22 Jan 2024
Evaluating Data Augmentation Techniques for Coffee Leaf Disease
  Classification
Evaluating Data Augmentation Techniques for Coffee Leaf Disease Classification
Adrian Gheorghiu
Iulian-Marius Taiatu
Dumitru-Clementin Cercel
Iuliana Marin
Florin-Catalin Pop
49
2
0
11 Jan 2024
Graph Convolutions Enrich the Self-Attention in Transformers!
Graph Convolutions Enrich the Self-Attention in Transformers!
Jeongwhan Choi
Hyowon Wi
Jayoung Kim
Yehjin Shin
Kookjin Lee
Nathaniel Trask
Noseong Park
25
4
0
07 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
18
0
0
01 Dec 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Yizhou Yu
Chuan Wu
Yizhou Yu
ViT
36
36
0
30 Oct 2023
LeTFuser: Light-weight End-to-end Transformer-Based Sensor Fusion for
  Autonomous Driving with Multi-Task Learning
LeTFuser: Light-weight End-to-end Transformer-Based Sensor Fusion for Autonomous Driving with Multi-Task Learning
Pedram Agand
Mohammad Mahdavian
Manolis Savva
Mo Chen
ViT
21
4
0
19 Oct 2023
Minimalist and High-Performance Semantic Segmentation with Plain Vision
  Transformers
Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers
Yuanduo Hong
Jue Wang
Weichao Sun
Huihui Pan
VLM
ViT
32
7
0
19 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
29
4
0
10 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
32
3
0
08 Oct 2023
SRN-SZ: Deep Leaning-Based Scientific Error-bounded Lossy Compression
  with Super-resolution Neural Networks
SRN-SZ: Deep Leaning-Based Scientific Error-bounded Lossy Compression with Super-resolution Neural Networks
Jinyang Liu
Sheng Di
Sian Jin
Kai Zhao
Xin Liang
Zizhong Chen
Franck Cappello
25
8
0
07 Sep 2023
12345
Next