ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2103.15808
  4. Cited By
CvT: Introducing Convolutions to Vision Transformers

CvT: Introducing Convolutions to Vision Transformers

29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
    ViT
ArXivPDFHTML

Papers citing "CvT: Introducing Convolutions to Vision Transformers"

50 / 249 papers shown
Title
PanoSwin: a Pano-style Swin Transformer for Panorama Understanding
PanoSwin: a Pano-style Swin Transformer for Panorama Understanding
Zhixin Ling
Zhen Xing
Xiangdong Zhou
Manliang Cao
G. Zhou
ViT
21
17
0
28 Aug 2023
GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for
  Multi-Label Image Recognition
GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition
Ruijie Yao
Sheng Jin
Lumin Xu
Wang Zeng
Wentao Liu
Chao Qian
Ping Luo
Ji Wu
21
2
0
28 Aug 2023
TurboViT: Generating Fast Vision Transformers via Generative
  Architecture Search
TurboViT: Generating Fast Vision Transformers via Generative Architecture Search
Alexander Wong
Saad Abbasi
Saeejith Nair
ViT
23
1
0
22 Aug 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
37
3
0
18 Aug 2023
Asynchronous Evolution of Deep Neural Network Architectures
Asynchronous Evolution of Deep Neural Network Architectures
J. Liang
H. Shahrzad
Risto Miikkulainen
18
0
0
08 Aug 2023
M2Former: Multi-Scale Patch Selection for Fine-Grained Visual
  Recognition
M2Former: Multi-Scale Patch Selection for Fine-Grained Visual Recognition
Ji-Hee Moon
Junseok K. Lee
Yu-Ling Lee
Seongsik Park
22
4
0
04 Aug 2023
Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly
  Detectors
Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors
Nicolae-Cătălin Ristea
Florinel-Alin Croitoru
Radu Tudor Ionescu
Marius Popescu
F. Khan
M. Shah
ViT
26
20
0
21 Jun 2023
Speaker Embeddings as Individuality Proxy for Voice Stress Detection
Speaker Embeddings as Individuality Proxy for Voice Stress Detection
Zihan Wu
Neil Scheidwasser
Karl El Hajal
Milos Cernak
24
3
0
09 Jun 2023
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene
  Understanding
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
Hanrong Ye
Dan Xu
ViT
23
10
0
08 Jun 2023
DeepFake-Adapter: Dual-Level Adapter for DeepFake Detection
DeepFake-Adapter: Dual-Level Adapter for DeepFake Detection
Rui Shao
Tianxing Wu
Liqiang Nie
Ziwei Liu
16
11
0
01 Jun 2023
Lightweight Vision Transformer with Bidirectional Interaction
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
34
28
0
01 Jun 2023
Efficient Mixed Transformer for Single Image Super-Resolution
Efficient Mixed Transformer for Single Image Super-Resolution
Ling Zheng
Jinchen Zhu
Jinpeng Shi
Shizhuang Weng
35
19
0
19 May 2023
Two-Stream Regression Network for Dental Implant Position Prediction
Two-Stream Regression Network for Dental Implant Position Prediction
Xinquan Yang
Xuguang Li
Xuechen Li
Wenting Chen
Linlin Shen
X. Li
Yongqiang Deng
18
6
0
17 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
H. Chen
Jingkuan Song
Feng Zheng
ViT
10
0
0
17 May 2023
TransFlow: Transformer as Flow Learner
TransFlow: Transformer as Flow Learner
Yawen Lu
Qifan Wang
Siqi Ma
Tong Geng
Victor Y. Chen
Huaijin Chen
Dongfang Liu
ViT
25
45
0
23 Apr 2023
FastViT: A Fast Hybrid Vision Transformer using Structural
  Reparameterization
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
Pavan Kumar Anasosalu Vasu
J. Gabriel
Jeff J. Zhu
Oncel Tuzel
Anurag Ranjan
ViT
26
149
0
24 Mar 2023
Machine Learning for Brain Disorders: Transformers and Visual
  Transformers
Machine Learning for Brain Disorders: Transformers and Visual Transformers
Robin Courant
Maika Edberg
Nicolas Dufour
Vicky Kalogeiton
MedIm
ViT
27
1
0
21 Mar 2023
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training
  Efficiency
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
Vithursan Thangarasa
Shreyas Saxena
Abhay Gupta
Sean Lie
21
3
0
21 Mar 2023
Tracker Meets Night: A Transformer Enhancer for UAV Tracking
Tracker Meets Night: A Transformer Enhancer for UAV Tracking
Junjie Ye
Changhong Fu
Ziang Cao
Shan An
Guang-Zheng Zheng
Bowen Li
35
51
0
20 Mar 2023
Retinal Image Restoration using Transformer and Cycle-Consistent
  Generative Adversarial Network
Retinal Image Restoration using Transformer and Cycle-Consistent Generative Adversarial Network
Alnur Alimanov
Md Baharul Islam
ViT
MedIm
11
4
0
03 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not
  Attention
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
Paria Mehrani
John K. Tsotsos
23
24
0
02 Mar 2023
Human MotionFormer: Transferring Human Motions with Vision Transformers
Human MotionFormer: Transferring Human Motions with Vision Transformers
Hongyu Liu
Xintong Han
Chengbin Jin
Lihui Qian
Huawei Wei
...
Faqiang Wang
Haoye Dong
Yibing Song
Jia Xu
Qifeng Chen
11
10
0
22 Feb 2023
Device Tuning for Multi-Task Large Model
Device Tuning for Multi-Task Large Model
Penghao Jiang
Xuanchen Hou
Y. Zhou
11
0
0
21 Feb 2023
LIT-Former: Linking In-plane and Through-plane Transformers for
  Simultaneous CT Image Denoising and Deblurring
LIT-Former: Linking In-plane and Through-plane Transformers for Simultaneous CT Image Denoising and Deblurring
Zhihao Chen
Chuang Niu
Qi Gao
Ge Wang
Hongming Shan
MedIm
ViT
3DV
25
20
0
21 Feb 2023
Soft Error Reliability Analysis of Vision Transformers
Soft Error Reliability Analysis of Vision Transformers
Xing-xiong Xue
Cheng Liu
Ying Wang
Bing Yang
Tao Luo
L. Zhang
Huawei Li
Xiaowei Li
34
14
0
21 Feb 2023
MedViT: A Robust Vision Transformer for Generalized Medical Image
  Classification
MedViT: A Robust Vision Transformer for Generalized Medical Image Classification
Omid Nejati Manzari
Hamid Ahmadabadi
Hossein Kashiani
S. B. Shokouhi
Ahmad Ayatollahi
ViT
MedIm
21
176
0
19 Feb 2023
Efficiency 360: Efficient Vision Transformers
Efficiency 360: Efficient Vision Transformers
Badri N. Patro
Vijay Srinivas Agneeswaran
21
6
0
16 Feb 2023
Efficient Attention via Control Variates
Efficient Attention via Control Variates
Lin Zheng
Jianbo Yuan
Chong-Jun Wang
Lingpeng Kong
24
18
0
09 Feb 2023
Exploiting Optical Flow Guidance for Transformer-Based Video Inpainting
Exploiting Optical Flow Guidance for Transformer-Based Video Inpainting
Kaiwen Zhang
Jialun Peng
Jingjing Fu
Dong Liu
ViT
19
8
0
24 Jan 2023
FGAHOI: Fine-Grained Anchors for Human-Object Interaction Detection
FGAHOI: Fine-Grained Anchors for Human-Object Interaction Detection
Shuailei Ma
Yuefeng Wang
Shanze Wang
Ying-yu Wei
28
33
0
08 Jan 2023
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Shashanka Venkataramanan
Amir Ghodrati
Yuki M. Asano
Fatih Porikli
A. Habibian
ViT
13
25
0
05 Jan 2023
A Close Look at Spatial Modeling: From Attention to Convolution
A Close Look at Spatial Modeling: From Attention to Convolution
Xu Ma
Huan Wang
Can Qin
Kunpeng Li
Xing Zhao
Jie Fu
Yun Fu
ViT
3DPC
17
11
0
23 Dec 2022
EIT: Enhanced Interactive Transformer
EIT: Enhanced Interactive Transformer
Tong Zheng
Bei Li
Huiwen Bao
Tong Xiao
Jingbo Zhu
24
2
0
20 Dec 2022
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group
  Propagation
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Chenhongyi Yang
Jiarui Xu
Shalini De Mello
Elliot J. Crowley
X. Wang
ViT
30
21
0
13 Dec 2022
Part-based Face Recognition with Vision Transformers
Part-based Face Recognition with Vision Transformers
Zhonglin Sun
Georgios Tzimiropoulos
ViT
15
15
0
30 Nov 2022
Lightweight Structure-Aware Attention for Visual Understanding
Lightweight Structure-Aware Attention for Visual Understanding
Heeseung Kwon
F. M. Castro
M. Marín-Jiménez
N. Guil
Alahari Karteek
26
2
0
29 Nov 2022
Lightning Fast Video Anomaly Detection via Adversarial Knowledge
  Distillation
Lightning Fast Video Anomaly Detection via Adversarial Knowledge Distillation
Florinel-Alin Croitoru
Nicolae-Cătălin Ristea
D. Dascalescu
Radu Tudor Ionescu
F. Khan
M. Shah
31
2
0
28 Nov 2022
Dynamic Feature Pruning and Consolidation for Occluded Person
  Re-Identification
Dynamic Feature Pruning and Consolidation for Occluded Person Re-Identification
Yuteng Ye
Hang Zhou
Jiale Cai
Chenxing Gao
Youjia Zhang
Junle Wang
Qiang Hu
Junqing Yu
Wei Yang
23
6
0
27 Nov 2022
Degenerate Swin to Win: Plain Window-based Transformer without
  Sophisticated Operations
Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated Operations
Tan Yu
Ping Li
ViT
36
5
0
25 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Mingg-Ming Cheng
Jiashi Feng
ViT
23
129
0
22 Nov 2022
HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision
  Transformers
HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers
Peiyan Dong
Mengshu Sun
Alec Lu
Yanyue Xie
Li-Yu Daisy Liu
...
Xin Meng
Z. Li
Xue Lin
Zhenman Fang
Yanzhi Wang
ViT
26
57
0
15 Nov 2022
ParCNetV2: Oversized Kernel with Enhanced Attention
ParCNetV2: Oversized Kernel with Enhanced Attention
Ruihan Xu
Haokui Zhang
Wenze Hu
Shiliang Zhang
Xiaoyu Wang
ViT
19
6
0
14 Nov 2022
Training a Vision Transformer from scratch in less than 24 hours with 1
  GPU
Training a Vision Transformer from scratch in less than 24 hours with 1 GPU
Saghar Irandoust
Thibaut Durand
Yunduz Rakhmangulova
Wenjie Zi
Hossein Hajimirsadeghi
ViT
27
6
0
09 Nov 2022
ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision
  Transformer Acceleration with a Linear Taylor Attention
ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention
Jyotikrishna Dass
Shang Wu
Huihong Shi
Chaojian Li
Zhifan Ye
Zhongfeng Wang
Yingyan Lin
17
49
0
09 Nov 2022
SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker
  Embedding and Vision Transformers
SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers
Alessandro Arezzo
Stefano Berretti
ViT
16
15
0
04 Nov 2022
Pixel-Wise Contrastive Distillation
Pixel-Wise Contrastive Distillation
Junqiang Huang
Zichao Guo
37
4
0
01 Nov 2022
Explicitly Increasing Input Information Density for Vision Transformers
  on Small Datasets
Explicitly Increasing Input Information Density for Vision Transformers on Small Datasets
Xiangyu Chen
Ying Qin
Wenju Xu
A. Bur
Cuncong Zhong
Guanghui Wang
ViT
38
3
0
25 Oct 2022
LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context
  Propagation in Transformers
LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context Propagation in Transformers
Zhuo Huang
Zhiyou Zhao
Banghuai Li
Jungong Han
3DPC
ViT
23
55
0
23 Oct 2022
S2WAT: Image Style Transfer via Hierarchical Vision Transformer using
  Strips Window Attention
S2WAT: Image Style Transfer via Hierarchical Vision Transformer using Strips Window Attention
Chi Zhang
Xiaogang Xu
Lei Wang
Zaiyan Dai
Jun Yang
ViT
22
23
0
22 Oct 2022
Face Pyramid Vision Transformer
Face Pyramid Vision Transformer
Khawar Islam
M. Zaheer
Arif Mahmood
ViT
CVBM
24
4
0
21 Oct 2022
Previous
12345
Next