arXiv:2103.15808
CvT: Introducing Convolutions to Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
Papers citing "CvT: Introducing Convolutions to Vision Transformers"
50 / 860 papers shown
Fast Vision Transformers with HiLo Attention
Neural Information Processing Systems (NeurIPS), 2022
Zizheng Pan
Jianfei Cai
Bohan Zhuang
448
249
0
26 May 2022
Concurrent Neural Tree and Data Preprocessing AutoML for Image Classification
Anish Thite
Mohan Dodda
Pulak Agarwal
Jason Zutty
155
3
0
25 May 2022
Inception Transformer
Neural Information Processing Systems (NeurIPS), 2022
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
357
257
0
25 May 2022
MoCoViT: Mobile Convolutional Vision Transformer
Hailong Ma
Xin Xia
Xing Wang
Xuefeng Xiao
Jiashi Li
Min Zheng
ViT
388
21
0
25 May 2022
VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification
Pattern Recognition (Pattern Recogn.), 2022
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marçal Rusiñol
O. R. Terrades
VLM
284
36
0
24 May 2022
Transformer based Generative Adversarial Network for Liver Segmentation
Ugur Demir
Zheyu Zhang
Sijin Yu
M. Antalek
Elif Keles
Debesh Jha
Amir Borhani
Daniela Ladner
Ulas Bagci
ViT
MedIm
189
17
0
21 May 2022
Boosting Camouflaged Object Detection with Dual-Task Interactive Transformer
International Conference on Pattern Recognition (ICPR), 2022
Zheng Liu
Zhili Zhang
Wei Wu
224
76
0
21 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
305
86
0
20 May 2022
TRT-ViT: TensorRT-oriented Vision Transformer
Xin Xia
Jiashi Li
Jie Wu
Xing Wang
Xuefeng Xiao
Min Zheng
Rui Wang
ViT
224
35
0
19 May 2022
Learning Rate Curriculum
International Journal of Computer Vision (IJCV), 2022
Florinel-Alin Croitoru
Nicolae-Cătălin Ristea
Radu Tudor Ionescu
Andrii Zadaianchuk
262
26
0
18 May 2022
Vision Transformer Adapter for Dense Predictions
International Conference on Learning Representations (ICLR), 2022
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
913
766
0
17 May 2022
POViT: Vision Transformer for Multi-objective Design and Characterization of Nanophotonic Devices
Xinyu Chen
Renjie Li
Yueyao Yu
Yuanwen Shen
Wenye Li
Zhaoyu Zhang
Yin Zhang
ViT
326
2
0
17 May 2022
ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
International Conference on Machine Learning (ICML), 2022
Haoran You
Baopu Li
Huihong Shi
Y. Fu
Yingyan Lin
362
19
0
17 May 2022
Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization
Computer Vision and Pattern Recognition (CVPR), 2022
Luke Melas-Kyriazi
Christian Rupprecht
Iro Laina
Andrea Vedaldi
303
188
0
16 May 2022
Transformers in 3D Point Clouds: A Survey
Dening Lu
Qian Xie
Mingqiang Wei
Kyle Gao
Linlin Xu
Jonathan Li
3DPC
ViT
327
65
0
16 May 2022
Activating More Pixels in Image Super-Resolution Transformer
Computer Vision and Pattern Recognition (CVPR), 2022
Xiangyu Chen
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
ViT
523
914
0
09 May 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
Shiyang Feng
Teli Ma
Jiaming Song
Ziyi Lin
Jifeng Dai
Yu Qiao
ViT
256
151
0
08 May 2022
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
European Conference on Computer Vision (ECCV), 2022
Junting Pan
Adrian Bulat
Fuwen Tan
Xiatian Zhu
Łukasz Dudziak
Jiaming Song
Georgios Tzimiropoulos
Brais Martínez
ViT
369
245
0
06 May 2022
Symmetric Transformer-based Network for Unsupervised Image Registration
Knowledge-Based Systems (KBS), 2022
Mingrui Ma
Lei Song
Yuanbo Xu
Gui-Xian Liu
ViT
MedIm
160
48
0
28 Apr 2022
Self-Supervised Learning of Object Parts for Semantic Segmentation
Computer Vision and Pattern Recognition (CVPR), 2022
A. Ziegler
Yuki M. Asano
SSL
OCL
330
127
0
27 Apr 2022
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
Xianing Chen
Qiong Cao
Yujie Zhong
Jing Zhang
Shenghua Gao
Dacheng Tao
ViT
255
102
0
27 Apr 2022
Adaptive Split-Fusion Transformer
IEEE International Conference on Multimedia and Expo (ICME), 2022
Zixuan Su
Hao Zhang
Yue Yu
Lei Pang
Chong-Wah Ngo
Yu-Gang Jiang
ViT
255
8
0
26 Apr 2022
TranSiam: Fusing Multimodal Visual Features Using Transformer for Medical Image Segmentation
Xia Li
Shiqiang Ma
Jijun Tang
Fei Guo
ViT
MedIm
95
12
0
26 Apr 2022
Deeper Insights into the Robustness of ViTs towards Common Corruptions
Rui Tian
Zuxuan Wu
Jingdong Sun
Han Hu
Yu-Gang Jiang
ViT
AAML
309
6
0
26 Apr 2022
High-Efficiency Lossy Image Coding Through Adaptive Neighborhood Information Aggregation
Ming Lu
Fangdong Chen
Shiliang Pu
Zhan Ma
156
54
0
25 Apr 2022
Residual Mixture of Experts
Lemeng Wu
Mengchen Liu
Yinpeng Chen
Dongdong Chen
Xiyang Dai
Lu Yuan
MoE
354
46
0
20 Apr 2022
DecBERT: Enhancing the Language Understanding of BERT with Causal Attention Masks
Ziyang Luo
Yadong Xi
Jing Ma
Zhiwei Yang
Xiaoxi Mao
Changjie Fan
Rongsheng Zhang
142
4
0
19 Apr 2022
Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer
Computer Vision and Pattern Recognition (CVPR), 2022
Wang Zeng
Sheng Jin
Wentao Liu
Chao Qian
Ping Luo
Wanli Ouyang
Xiaogang Wang
ViT
271
166
0
19 Apr 2022
The Devil is in the Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-Training
AAAI Conference on Artificial Intelligence (AAAI), 2022
Hao Liu
Xinghua Jiang
Xin Li
Antai Guo
Deqiang Jiang
Bo Ren
189
43
0
18 Apr 2022
ResT V2: Simpler, Faster and Stronger
Neural Information Processing Systems (NeurIPS), 2022
Qing-Long Zhang
Yubin Yang
ViT
249
31
0
15 Apr 2022
MiniViT: Compressing Vision Transformers with Weight Multiplexing
Computer Vision and Pattern Recognition (CVPR), 2022
Jinnian Zhang
Houwen Peng
Kan Wu
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
289
152
0
14 Apr 2022
Neighborhood Attention Transformer
Computer Vision and Pattern Recognition (CVPR), 2022
Ali Hassani
Steven Walton
Jiacheng Li
Shengjia Li
Humphrey Shi
ViT
AI4TS
414
403
0
14 Apr 2022
DeiT III: Revenge of the ViT
European Conference on Computer Vision (ECCV), 2022
Hugo Touvron
Matthieu Cord
Edouard Grave
ViT
295
545
0
14 Apr 2022
SwinNet: Swin Transformer drives edge-aware RGB-D and RGB-T salient object detection
Zhengyi Liu
Yacheng Tan
Qian He
Yun Xiao
ViT
324
284
0
12 Apr 2022
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
Computer Vision and Pattern Recognition (CVPR), 2022
Wenqiang Zhang
Zilong Huang
Guozhong Luo
Tao Chen
Xinggang Wang
Wenyu Liu
Gang Yu
Chunhua Shen
ViT
280
260
0
12 Apr 2022
Linear Complexity Randomized Self-attention Mechanism
International Conference on Machine Learning (ICML), 2022
Lin Zheng
Chong-Jun Wang
Lingpeng Kong
206
36
0
10 Apr 2022
Multimodal Multi-Head Convolutional Attention with Various Kernel Sizes for Medical Image Super-Resolution
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Mariana-Iuliana Georgescu
Radu Tudor Ionescu
A. Miron
O. Savencu
Nicolae-Cătălin Ristea
N. Verga
Fahad Shahbaz Khan
SupR
162
82
0
08 Apr 2022
DaViT: Dual Attention Vision Transformers
European Conference on Computer Vision (ECCV), 2022
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
382
344
0
07 Apr 2022
Unified Contrastive Learning in Image-Text-Label Space
Computer Vision and Pattern Recognition (CVPR), 2022
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Bin Xiao
Ce Liu
Lu Yuan
Jianfeng Gao
VLM
SSL
319
273
0
07 Apr 2022
MixFormer: Mixing Features across Windows and Dimensions
Computer Vision and Pattern Recognition (CVPR), 2022
Qiang Chen
Qiman Wu
Jian Wang
Qinghao Hu
T. Hu
Errui Ding
Jian Cheng
Jingdong Wang
MDE
ViT
208
130
0
06 Apr 2022
SE(3)-Equivariant Attention Networks for Shape Reconstruction in Function Space
International Conference on Learning Representations (ICLR), 2022
Evangelos Chatzipantazis
Stefanos Pertigkiozoglou
Guang Cheng
Kostas Daniilidis
3DPC
338
38
0
05 Apr 2022
MaxViT: Multi-Axis Vision Transformer
European Conference on Computer Vision (ECCV), 2022
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
517
906
0
04 Apr 2022
Matching Feature Sets for Few-Shot Image Classification
Computer Vision and Pattern Recognition (CVPR), 2022
Arman Afrasiyabi
Hugo Larochelle
Jean-François Lalonde
Christian Gagné
VLM
271
96
0
02 Apr 2022
CAT-Det: Contrastively Augmented Transformer for Multi-modal 3D Object Detection
Computer Vision and Pattern Recognition (CVPR), 2022
Yanan Zhang
Jiaxin Chen
Di Huang
ViT
3DPC
360
70
0
01 Apr 2022
Hybrid Handcrafted and Learnable Audio Representation for Analysis of Speech Under Cognitive and Physical Load
Interspeech (Interspeech), 2022
Gasser Elbanna
A. Biryukov
Neil Scheidwasser
Lara Orlandic
Pablo Mainar
M. Kegler
P. Beckmann
Milos Cernak
201
13
0
30 Mar 2022
ITTR: Unpaired Image-to-Image Translation with Transformers
Wanfeng Zheng
Qiang Li
Guoxin Zhang
Pengfei Wan
Zhong-ming Wang
ViT
170
25
0
30 Mar 2022
MatteFormer: Transformer-Based Image Matting via Prior-Tokens
Computer Vision and Pattern Recognition (CVPR), 2022
Gyutae Park
S. Son
Jaeyoung Yoo
Seho Kim
Nojun Kwak
ViT
243
86
0
29 Mar 2022
SepViT: Separable Vision Transformer
Wei Li
Xing Wang
Xin Xia
Jie Wu
Jiashi Li
Xuefeng Xiao
Min Zheng
Shiping Wen
ViT
278
49
0
29 Mar 2022
MAT: Mask-Aware Transformer for Large Hole Image Inpainting
Computer Vision and Pattern Recognition (CVPR), 2022
Wenbo Li
Zhe Lin
Kun Zhou
Lu Qi
Yi Wang
Jiaya Jia
389
435
0
29 Mar 2022
Affine Medical Image Registration with Coarse-to-Fine Vision Transformer
Computer Vision and Pattern Recognition (CVPR), 2022
Tony C. W. Mok
Albert C. S. Chung
ViT
MedIm
204
79
0
29 Mar 2022