2103.15808
Cited By
CvT: Introducing Convolutions to Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
Papers citing "CvT: Introducing Convolutions to Vision Transformers"
50 / 857 papers shown
LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors
Saksham Suri
Matthew Walmer
Kamal Gupta
Abhinav Shrivastava
210
16
0
21 Mar 2024
Retina Vision Transformer (RetinaViT): Introducing Scaled Patches into Vision Transformers
Yuyang Shu
Michael E. Bain
MedIm
MDE
ViT
163
0
0
20 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
147
34
0
18 Mar 2024
D-Net: Dynamic Large Kernel with Dynamic Feature Fusion for Volumetric Medical Image Segmentation
Jin Yang
Peijie Qiu
Yichi Zhang
Daniel S. Marcus
Aristeidis Sotiras
MedIm
149
32
0
15 Mar 2024
Group-Mix SAM: Lightweight Solution for Industrial Assembly Line Applications
Wu Liang
X.-G. Ma
143
0
0
15 Mar 2024
Activating Wider Areas in Image Super-Resolution
Cheng Cheng
Hang Wang
Hongbin Sun
186
15
0
13 Mar 2024
Learning Correction Errors via Frequency-Self Attention for Blind Image Super-Resolution
International Conference on Image, Vision and Computing (ICIVC), 2024
Haochen Sun
Yan Yuan
Lijuan Su
Hao-Yu Shao
157
2
0
12 Mar 2024
LKM-UNet: Large Kernel Vision Mamba UNet for Medical Image Segmentation
Jinhong Wang
Jintai Chen
Benlin Liu
Jian Wu
Mamba
223
46
0
12 Mar 2024
Explainable Transformer Prototypes for Medical Diagnoses
IEEE International Symposium on Biomedical Imaging (ISBI), 2024
Ugur Demir
Debesh Jha
Zheyu Zhang
Elif Keles
Bradley Allen
Aggelos K. Katsaggelos
Ulas Bagci
MedIm
86
4
0
11 Mar 2024
GRITv2: Efficient and Light-weight Social Relation Recognition
Sagar Reddy
Neeraj Kasera
Avinash Thakur
ViT
124
0
0
11 Mar 2024
Not just Birds and Cars: Generic, Scalable and Explainable Models for Professional Visual Recognition
Junde Wu
Jiayuan Zhu
Min Xu
Yueming Jin
208
0
0
08 Mar 2024
Transformers and Language Models in Form Understanding: A Comprehensive Review of Scanned Document Analysis
Abdelrahman Abdallah
Daniel Eberharter
Zoe Pfister
Adam Jatowt
153
15
0
06 Mar 2024
A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning
Yuelin Zhang
Pengyu Zheng
Wanquan Yan
Chengyu Fang
Shing Shin Cheng
MedIm
268
13
0
05 Mar 2024
AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation
Haonan Wang
Qixiang Zhang
Yi Li
Xiaomeng Li
372
38
0
04 Mar 2024
ViTaL: An Advanced Framework for Automated Plant Disease Identification in Leaf Images Using Vision Transformers and Linear Projection For Feature Reduction
Abhishek Sebastian
A. AnnisFathima
R. Pragna
S. MadhanKumar
G. YaswanthKannan
Vinay Murali
MedIm
190
8
0
27 Feb 2024
A Comparison of Deep Learning Models for Proton Background Rejection with the AMS Electromagnetic Calorimeter
R. K. Hashmani
Emre Akbas
M. Demirköz
65
2
0
26 Feb 2024
Zero-shot generalization across architectures for visual classification
Evan Gerritz
Luciano Dyballa
Steven W. Zucker
258
1
0
21 Feb 2024
FViT: A Focal Vision Transformer with Gabor Filter
Yulong Shi
Mingwei Sun
Yongshuai Wang
Rui Wang
385
8
0
17 Feb 2024
TDViT: Temporal Dilated Video Transformer for Dense Video Tasks
Guanxiong Sun
Yang Hua
Guosheng Hu
N. Robertson
ViT
127
1
0
14 Feb 2024
Exploring the Synergies of Hybrid CNNs and ViTs Architectures for Computer Vision: A survey
Engineering Applications of Artificial Intelligence (EAAI), 2024
Haruna Yunusa
Shiyin Qin
Abdulrahman Hamman Adama Chukkol
Abdulganiyu Abdu Yusuf
Isah Bello
A. Lawan
ViT
237
32
0
05 Feb 2024
Spatio-temporal Prompting Network for Robust Video Feature Extraction
Guanxiong Sun
Chi Wang
Zhaoyu Zhang
Jiankang Deng
Stefanos Zafeiriou
Yang Hua
ViT
158
7
0
04 Feb 2024
ScribFormer: Transformer Makes CNN Work Better for Scribble-based Medical Image Segmentation
Zihan Li
Yuan Zheng
Dandan Shan
Shuzhou Yang
Qingde Li
Beizhan Wang
Yuan-ting Zhang
Qingqi Hong
Dinggang Shen
ViT
MedIm
230
73
0
03 Feb 2024
HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text
Han Liu
Zhi Xu
Xiaotong Zhang
Feng Zhang
Fenglong Ma
Hongyang Chen
Hong Yu
Xianchao Zhang
AAML
174
19
0
02 Feb 2024
Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model
Zihan Zhong
Zhiqiang Tang
Tong He
Haoyang Fang
Chun Yuan
217
77
0
31 Jan 2024
Local and Global Contexts for Conversation
Zuoquan Lin
Xinyi Shen
129
1
0
31 Jan 2024
SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design
Computer Vision and Pattern Recognition (CVPR), 2024
Seokju Yun
Youngmin Ro
ViT
346
81
0
29 Jan 2024
Do deep neural networks utilize the weight space efficiently?
Onur Can Koyun
B. U. Toreyin
110
0
0
26 Jan 2024
Convolutional Initialization for Data-Efficient Vision Transformers
Jianqiao Zheng
Xueqian Li
Simon Lucey
208
2
0
23 Jan 2024
Anisotropy Is Inherent to Self-Attention in Transformers
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
Nathan Godey
Eric Villemonte de la Clergerie
Benoît Sagot
215
29
0
22 Jan 2024
Colorectal Polyp Segmentation in the Deep Learning Era: A Comprehensive Survey
Zhenyu Wu
Fengmao Lv
Chenglizhao Chen
Aimin Hao
Shuo Li
ELM
209
20
0
22 Jan 2024
Cloud-based XAI Services for Assessing Open Repository Models Under Adversarial Attacks
Zerui Wang
Yan Liu
AAML
175
7
0
22 Jan 2024
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
Neural Networks (Neural Netw.), 2023
Chu Myaet Thwal
Minh N. H. Nguyen
Ye Lin Tun
Seongjin Kim
My T. Thai
Choong Seon Hong
245
9
0
22 Jan 2024
Unifying Visual and Vision-Language Tracking via Contrastive Learning
AAAI Conference on Artificial Intelligence (AAAI), 2024
Yinchao Ma
Yuyang Tang
Wenfei Yang
Tianzhu Zhang
Jinpeng Zhang
Mengxue Kang
ObjD
177
38
0
20 Jan 2024
Learning Position-Aware Implicit Neural Network for Real-World Face Inpainting
Bo Zhao
Huan Yang
Jianlong Fu
CVBM
147
1
0
19 Jan 2024
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
International Conference on Machine Learning (ICML), 2024
Lianghui Zhu
Bencheng Liao
Qian Zhang
Xinlong Wang
Wenyu Liu
Xinggang Wang
Mamba
351
1,289
0
17 Jan 2024
Efficient generative adversarial networks using linear additive-attention Transformers
Emilio Morales-Juarez
Gibran Fuentes Pineda
386
4
0
17 Jan 2024
Cross-Level Multi-Instance Distillation for Self-Supervised Fine-Grained Visual Categorization
IEEE Transactions on Image Processing (TIP), 2024
Qi Bi
Wei Ji
Jingjun Yi
Haolan Zhan
Gui-Song Xia
438
3
0
16 Jan 2024
Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection
Wei Ye
Chaoya Jiang
Haiyang Xu
Chenhao Ye
Chenliang Li
Mingshi Yan
Shikun Zhang
Songhang Huang
Fei Huang
VLM
146
1
0
11 Jan 2024
Evaluating Data Augmentation Techniques for Coffee Leaf Disease Classification
International Conference on Agents and Artificial Intelligence (ICAART), 2024
Adrian Gheorghiu
Iulian-Marius Taiatu
Dumitru-Clementin Cercel
Iuliana Marin
Florin-Catalin Pop
207
3
0
11 Jan 2024
Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach
IEEE Transactions on Image Processing (TIP), 2024
Gang Wu
Junjun Jiang
Junpeng Jiang
Xianming Liu
SupR
175
26
0
11 Jan 2024
LF-ViT: Reducing Spatial Redundancy in Vision Transformer for Efficient Image Recognition
Youbing Hu
Yun Cheng
Anqi Lu
Zhiqiang Cao
Dawei Wei
Jie Liu
Zhijun Li
ViT
184
15
0
08 Jan 2024
SeTformer is What You Need for Vision and Language
Pourya Shamsolmoali
Masoumeh Zareapoor
Eric Granger
Michael Felsberg
164
6
0
07 Jan 2024
SPFormer: Enhancing Vision Transformer with Superpixel Representation
Jieru Mei
Liang-Chieh Chen
Yaoyao Liu
Cihang Xie
ViT
MDE
244
13
0
05 Jan 2024
A Cost-Efficient FPGA Implementation of Tiny Transformer Model using Neural ODE
Ikumi Okubo
Keisuke Sugiura
Hiroki Matsutani
176
2
0
05 Jan 2024
ClST: A Convolutional Transformer Framework for Automatic Modulation Recognition by Knowledge Distillation
IEEE Transactions on Wireless Communications (IEEE TWC), 2023
Dongbin Hou
Lixin Li
Wensheng Lin
Junli Liang
Zhu Han
73
10
0
29 Dec 2023
Rethinking of Feature Interaction for Multi-task Learning on Dense Prediction
Jingdong Zhang
Jiayuan Fan
Peng Ye
Bo Zhang
Hancheng Ye
Baopu Li
Yancheng Cai
Tao Chen
136
0
0
21 Dec 2023
Cached Transformers: Improving Transformers with Differentiable Memory Cache
Zhaoyang Zhang
Wenqi Shao
Yixiao Ge
Xiaogang Wang
Liang Feng
Ping Luo
170
4
0
20 Dec 2023
ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding
Lunhao Duan
Shanshan Zhao
Nan Xue
Biwei Huang
Gui-Song Xia
Dacheng Tao
ViT
355
32
0
18 Dec 2023
Agent Attention: On the Integration of Softmax and Linear Attention
European Conference on Computer Vision (ECCV), 2023
Dongchen Han
Tianzhu Ye
Yizeng Han
Zhuofan Xia
Siyuan Pan
Pengfei Wan
Shiji Song
Gao Huang
295
175
0
14 Dec 2023
Transformer-based Selective Super-Resolution for Efficient Image Refinement
Tianyi Zhang
Kishore Kasichainula
Yaoxin Zhuo
Baoxin Li
Jae-sun Seo
Yu Cao
126
15
0
10 Dec 2023