Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2103.15808
Cited By
CvT: Introducing Convolutions to Vision Transformers
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CvT: Introducing Convolutions to Vision Transformers"
50 / 249 papers shown
Title
SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer
Young-Hu Park
R.-H. Park
Hyung-Min Park
49
0
0
07 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
W. Xu
Shibiao Xu
ViT
110
0
0
06 May 2025
AI Assisted Cervical Cancer Screening for Cytology Samples in Developing Countries
Love Panta
Suraj Prasai
Karishma Malla Vaidya
Shyam Shrestha
Suresh Manandhar
44
0
0
29 Apr 2025
HMPE:HeatMap Embedding for Efficient Transformer-Based Small Object Detection
YangChen Zeng
ViT
31
0
0
18 Apr 2025
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
67
0
0
03 Apr 2025
VesselSAM: Leveraging SAM for Aortic Vessel Segmentation with LoRA and Atrous Attention
Adnan Iltaf
Rayan Merghani Ahmed
Bin Li
Bin Li
Shoujun Zhou
50
0
0
25 Feb 2025
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Weikang Meng
Yadan Luo
Xin Li
D. Jiang
Zheng Zhang
115
0
0
25 Jan 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Jinwei Gu
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
117
0
0
21 Jan 2025
Image Classification with Deep Reinforcement Active Learning
Mingyuan Jiu
Xuguang Song
H. Sahbi
Shupan Li
Yan Chen
Wei Guo
Lihua Guo
Mingliang Xu
VLM
24
0
0
31 Dec 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
33
0
0
12 Nov 2024
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Pingyi Chen
Zhongyi Shui
Chenglu Zhu
Lin Yang
MedIm
32
4
0
18 Oct 2024
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Y. Zou
Tatsunori Hashimoto
VLM
64
3
0
14 Oct 2024
Beyond Skip Connection: Pooling and Unpooling Design for Elimination Singularities
Chengkun Sun
Jinqian Pan
Juoli Jin
Russell Stevens Terry
Jiang Bian
Jie Xu
20
0
0
20 Sep 2024
Advancing Depth Anything Model for Unsupervised Monocular Depth Estimation in Endoscopy
Bojian Li
Bo Liu
Jinghua Yue
F. Zhou
Fugen Zhou
MedIm
MDE
45
2
0
12 Sep 2024
Brain-Inspired Stepwise Patch Merging for Vision Transformers
Yonghao Yu
Dongcheng Zhao
Guobin Shen
Yiting Dong
Yi Zeng
45
0
0
11 Sep 2024
Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild
Xiongkuo Min
Yixuan Gao
Yuqin Cao
Guangtao Zhai
Wenjun Zhang
Huifang Sun
C. Chen
20
10
0
09 Sep 2024
MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation
Beoungwoo Kang
Seunghun Moon
Yubin Cho
Hyunwoo Yu
Suk-Ju Kang
ViT
MedIm
24
8
0
14 Aug 2024
MacFormer: Semantic Segmentation with Fine Object Boundaries
Guoan Xu
Wenfeng Huang
Tao Wu
Ligeng Chen
Wenjing Jia
Guangwei Gao
Xiatian Zhu
Stuart W. Perry
31
0
0
11 Aug 2024
Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets
Tianxiao Zhang
Wenju Xu
Bo Luo
Guanghui Wang
ViT
MDE
36
7
0
28 Jul 2024
iiANET: Inception Inspired Attention Hybrid Network for efficient Long-Range Dependency
Haruna Yunusa
Qin Shiyin
Abdulrahman Hamman Adama Chukkol
Isah Bello
A. Lawan
Isah Bello
39
4
0
10 Jul 2024
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Kazi Sajeed Mehrab
M. Maruf
Arka Daw
Harish Babu Manogaran
Abhilash Neog
...
Paula Mabee
Wasila Dahdul
Anuj Karpatne
Wasila M Dahdul
Anuj Karpatne
33
4
0
10 Jul 2024
Kolmogorov-Arnold Convolutions: Design Principles and Empirical Studies
Ivan Drokin
42
19
0
01 Jul 2024
Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking
Xiangyang Yang
Dan Zeng
Xucheng Wang
You Wu
Hengzhou Ye
Qijun Zhao
Shuiwang Li
53
3
0
12 Jun 2024
ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking
Xudong Han
Nobuyuki Oishi
Yueying Tian
Elif Ucurum
R. Young
C. Chatwin
Philip Birch
30
3
0
24 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
67
41
0
23 May 2024
A separability-based approach to quantifying generalization: which layer is best?
Luciano Dyballa
Evan Gerritz
Steven W. Zucker
OOD
24
3
0
02 May 2024
ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal
Zhuohao Li
Guoyang Xie
Guannan Jiang
Zhichao Lu
31
3
0
29 Apr 2024
PromptCIR: Blind Compressed Image Restoration with Prompt Learning
Bingchen Li
Xin Li
Yiting Lu
Ruoyu Feng
Mengxi Guo
Shijie Zhao
Li Zhang
Zhibo Chen
30
13
0
26 Apr 2024
MathNet: A Data-Centric Approach for Printed Mathematical Expression Recognition
Felix M. Schmitt-Koopmann
Elaine M. Huang
Hans-Peter Hutter
Thilo Stadelmann
Alireza Darvishy
24
4
0
21 Apr 2024
Nested-TNT: Hierarchical Vision Transformers with Multi-Scale Feature Processing
Yuang Liu
Zhiheng Qiu
Xiaokai Qin
ViT
31
0
0
20 Apr 2024
Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution
Cansu Korkmaz
A. Murat Tekalp
ViT
36
6
0
17 Apr 2024
Using Few-Shot Learning to Classify Primary Lung Cancer and Other Malignancy with Lung Metastasis in Cytological Imaging via Endobronchial Ultrasound Procedures
Ching-Kai Lin
Di-Chun Wei
Yun-Chien Cheng
27
0
0
09 Apr 2024
IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions
Zhijun Tu
Kunpeng Du
Hanting Chen
Hai-lin Wang
Wei Li
Jie Hu
Yunhe Wang
ViT
31
4
0
31 Mar 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
41
7
0
28 Mar 2024
Exploring Dynamic Transformer for Efficient Object Tracking
Jiawen Zhu
Xin Chen
Haiwen Diao
Shuai Li
Jun-Yan He
Chenyang Li
Bin Luo
Dong Wang
Huchuan Lu
41
2
0
26 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
23
15
0
18 Mar 2024
Activating Wider Areas in Image Super-Resolution
Cheng Cheng
Hang Wang
Hongbin Sun
32
10
0
13 Mar 2024
Learning Correction Errors via Frequency-Self Attention for Blind Image Super-Resolution
Haochen Sun
Yan Yuan
Lijuan Su
Hao-Yu Shao
33
1
0
12 Mar 2024
Zero-shot generalization across architectures for visual classification
Evan Gerritz
Luciano Dyballa
Steven W. Zucker
23
1
0
21 Feb 2024
FViT: A Focal Vision Transformer with Gabor Filter
Yulong Shi
Mingwei Sun
Yongshuai Wang
Rui Wang
47
4
0
17 Feb 2024
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
Chu Myaet Thwal
Minh N. H. Nguyen
Ye Lin Tun
Seongjin Kim
My T. Thai
Choong Seon Hong
49
5
0
22 Jan 2024
Evaluating Data Augmentation Techniques for Coffee Leaf Disease Classification
Adrian Gheorghiu
Iulian-Marius Taiatu
Dumitru-Clementin Cercel
Iuliana Marin
Florin-Catalin Pop
49
2
0
11 Jan 2024
Graph Convolutions Enrich the Self-Attention in Transformers!
Jeongwhan Choi
Hyowon Wi
Jayoung Kim
Yehjin Shin
Kookjin Lee
Nathaniel Trask
Noseong Park
25
4
0
07 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
18
0
0
01 Dec 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Yizhou Yu
Chuan Wu
Yizhou Yu
ViT
36
36
0
30 Oct 2023
LeTFuser: Light-weight End-to-end Transformer-Based Sensor Fusion for Autonomous Driving with Multi-Task Learning
Pedram Agand
Mohammad Mahdavian
Manolis Savva
Mo Chen
ViT
21
4
0
19 Oct 2023
Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers
Yuanduo Hong
Jue Wang
Weichao Sun
Huihui Pan
VLM
ViT
32
7
0
19 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
29
4
0
10 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
32
3
0
08 Oct 2023
SRN-SZ: Deep Leaning-Based Scientific Error-bounded Lossy Compression with Super-resolution Neural Networks
Jinyang Liu
Sheng Di
Sian Jin
Kai Zhao
Xin Liang
Zizhong Chen
Franck Cappello
25
8
0
07 Sep 2023
1
2
3
4
5
Next