2103.15808
Cited By
CvT: Introducing Convolutions to Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
29 March 2021
Haiping Wu
Bin Xiao
Noel Codella
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
HuggingFace (1 upvote)
Github (227★)
Papers citing
"CvT: Introducing Convolutions to Vision Transformers"
50 / 860 papers shown
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
International Conference on Learning Representations (ICLR), 2022
Kunchang Li
Yali Wang
Shiyang Feng
Guanglu Song
Yu Liu
Jiaming Song
Yu Qiao
ViT
489
320
0
12 Jan 2022
A ConvNet for the 2020s
Computer Vision and Pattern Recognition (CVPR), 2022
Zhuang Liu
Hanzi Mao
Chaozheng Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
627
7,167
0
10 Jan 2022
Spatio-Temporal Tuples Transformer for Skeleton-Based Action Recognition
Helei Qiu
B. Hou
Bo Ren
Xiaohua Zhang
ViT
231
62
0
08 Jan 2022
QuadTree Attention for Vision Transformers
International Conference on Learning Representations (ICLR), 2022
Shitao Tang
Jiahui Zhang
Siyu Zhu
Ping Tan
ViT
494
188
0
08 Jan 2022
Lumbar Bone Mineral Density Estimation from Chest X-ray Images: Anatomy-aware Attentive Multi-ROI Modeling
IEEE Transactions on Medical Imaging (IEEE TMI), 2022
Fakai Wang
K. Zheng
Le Lu
Jing Xiao
Min Wu
C. Kuo
S. Miao
161
25
0
05 Jan 2022
Lawin Transformer: Improving Semantic Segmentation Transformer with Multi-Scale Representations via Large Window Attention
Haotian Yan
Chuang Zhang
Ming Wu
ViT
396
78
0
05 Jan 2022
PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture
Kai Han
Jianyuan Guo
Yehui Tang
Yunhe Wang
ViT
227
22
0
04 Jan 2022
Vision Transformer with Deformable Attention
Computer Vision and Pattern Recognition (CVPR), 2022
Zhuofan Xia
Xuran Pan
Qing Xiao
Li Erran Li
Gao Huang
ViT
450
704
0
03 Jan 2022
HPRN: Holistic Prior-embedded Relation Network for Spectral Super-Resolution
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Chaoxiong Wu
Jiaojiao Li
Rui Song
Yunsong Li
Qian Du
185
32
0
29 Dec 2021
Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention
AAAI Conference on Artificial Intelligence (AAAI), 2021
Sitong Wu
Tianyi Wu
Hao Hao Tan
G. Guo
ViT
248
83
0
28 Dec 2021
SPViT: Enabling Faster Vision Transformers via Soft Token Pruning
European Conference on Computer Vision (ECCV), 2021
Zhenglun Kong
Zhaoyang Han
Xiaolong Ma
Xin Meng
Mengshu Sun
...
Geng Yuan
Bin Ren
Minghai Qin
Hao Tang
Yanzhi Wang
ViT
326
196
0
27 Dec 2021
Vision Transformer for Small-Size Datasets
Seung Hoon Lee
Seunghyun Lee
B. Song
ViT
246
284
0
27 Dec 2021
Learned Queries for Efficient Local Attention
Computer Vision and Pattern Recognition (CVPR), 2021
Moab Arar
Ariel Shamir
Amit H. Bermano
ViT
263
36
0
21 Dec 2021
MPViT: Multi-Path Vision Transformer for Dense Prediction
Computer Vision and Pattern Recognition (CVPR), 2021
Youngwan Lee
Jonghee Kim
Jeffrey Willette
Sung Ju Hwang
ViT
322
321
0
21 Dec 2021
Lite Vision Transformer with Enhanced Self-Attention
Computer Vision and Pattern Recognition (CVPR), 2021
Chenglin Yang
Yilin Wang
Jianming Zhang
Chentao Song
Zijun Wei
Zhe Lin
Alan Yuille
ViT
248
149
0
20 Dec 2021
StyleSwin: Transformer-based GAN for High-resolution Image Generation
Computer Vision and Pattern Recognition (CVPR), 2021
Bo Zhang
Shuyang Gu
Bo Zhang
Jianmin Bao
Dong Chen
Fang Wen
Yong Wang
B. Guo
ViT
459
293
0
20 Dec 2021
Towards End-to-End Image Compression and Analysis with Transformers
Yuanchao Bai
Xu Yang
Xianming Liu
Junjun Jiang
Yaowei Wang
Xiangyang Ji
Wen Gao
ViT
261
62
0
17 Dec 2021
Couplformer: Rethinking Vision Transformer with Coupling Attention Map
Hai Lan
Xihao Wang
Xian Wei
ViT
196
3
0
10 Dec 2021
Locally Shifted Attention With Early Global Integration
Shelly Sheynin
Sagie Benaim
Adam Polyak
Lior Wolf
ViT
93
0
0
09 Dec 2021
3D Medical Point Transformer: Introducing Convolution to Attention Networks for Medical Point Cloud Analysis
Jianhui Yu
Chaoyi Zhang
Heng Wang
Dingxin Zhang
Yang Song
Tiange Xiang
Dongnan Liu
Weidong (Tom) Cai
ViT
MedIm
196
45
0
09 Dec 2021
MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection
Rui Dai
Srijan Das
Kumara Kahatapitiya
Michael S. Ryoo
Francois Bremond
ViT
303
98
0
07 Dec 2021
Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning
DeepMind Interactive Agents Team
Josh Abramson
Arun Ahuja
Arthur Brussee
Federico Carnevale
...
Tamara von Glehn
Greg Wayne
Nathaniel Wong
Chen Yan
Rui Zhu
LM&Ro
282
49
0
07 Dec 2021
Bootstrapping ViTs: Towards Liberating Vision Transformers from Pre-training
Haofei Zhang
Jiarui Duan
Mengqi Xue
Mingli Song
Li Sun
Xiuming Zhang
ViT
AI4CE
278
16
0
07 Dec 2021
GETAM: Gradient-weighted Element-wise Transformer Attention Map for Weakly-supervised Semantic segmentation
Weixuan Sun
Jing Zhang
Zheyuan Liu
Yiran Zhong
Nick Barnes
ViT
226
15
0
06 Dec 2021
Dynamic Token Normalization Improves Vision Transformers
International Conference on Learning Representations (ICLR), 2021
Wenqi Shao
Yixiao Ge
Zhaoyang Zhang
Xuyuan Xu
Xiaogang Wang
Ying Shan
Ping Luo
ViT
326
12
0
05 Dec 2021
BEVT: BERT Pretraining of Video Transformers
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Yu-Gang Jiang
Luowei Zhou
Lu Yuan
ViT
297
248
0
02 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
492
850
0
02 Dec 2021
Vision Pair Learning: An Efficient Training Framework for Image Classification
Bei Tong
Xiaoyuan Yu
ViT
126
0
0
02 Dec 2021
AdaViT: Adaptive Vision Transformers for Efficient Image Recognition
Lingchen Meng
Hengduo Li
Bor-Chun Chen
Shiyi Lan
Zuxuan Wu
Yu-Gang Jiang
Ser-Nam Lim
ViT
246
294
0
30 Nov 2021
Adaptive Token Sampling For Efficient Vision Transformers
Mohsen Fayyaz
Soroush Abbasi Koohpayegani
F. Jafari
Sunando Sengupta
Hamid Reza Vaezi Joze
Eric Sommerlade
Hamed Pirsiavash
Juergen Gall
ViT
379
220
0
30 Nov 2021
TransWeather: Transformer-based Restoration of Images Degraded by Adverse Weather Conditions
Jeya Maria Jose Valanarasu
R. Yasarla
Vishal M. Patel
ViT
350
413
0
29 Nov 2021
On the Integration of Self-Attention and Convolution
Computer Vision and Pattern Recognition (CVPR), 2021
Xuran Pan
Chunjiang Ge
Rui Lu
Qing Xiao
Guanfu Chen
Zeyi Huang
Gao Huang
SSL
305
432
0
29 Nov 2021
SWAT: Spatial Structure Within and Among Tokens
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Kumara Kahatapitiya
Michael S. Ryoo
272
8
0
26 Nov 2021
NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition
Hao Liu
Xinghua Jiang
Xin Li
Zhimin Bao
Deqiang Jiang
Bo Ren
ViT
185
19
0
25 Nov 2021
Self-slimmed Vision Transformer
Zhuofan Zong
Kunchang Li
Guanglu Song
Yali Wang
Yu Qiao
B. Leng
Yu Liu
ViT
305
45
0
24 Nov 2021
Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
Moritz Ibing
Gregor Kobsik
Leif Kobbelt
217
42
0
24 Nov 2021
Florence: A New Foundation Model for Computer Vision
Lu Yuan
Dongdong Chen
Yi-Ling Chen
Noel Codella
Xiyang Dai
...
Zhen Xiao
Jianwei Yang
Michael Zeng
Luowei Zhou
Pengchuan Zhang
VLM
409
1,060
0
22 Nov 2021
MetaFormer Is Actually What You Need for Vision
Computer Vision and Pattern Recognition (CVPR), 2021
Weihao Yu
Mi Luo
Pan Zhou
Chenyang Si
Yichen Zhou
Xinchao Wang
Jiashi Feng
Shuicheng Yan
541
1,198
0
22 Nov 2021
Semi-Supervised Vision Transformers
European Conference on Computer Vision (ECCV), 2021
Zejia Weng
Xitong Yang
Ang Li
Zuxuan Wu
Yu-Gang Jiang
ViT
194
51
0
22 Nov 2021
CpT: Convolutional Point Transformer for 3D Point Cloud Processing
Chaitanya Kaul
Joshua Mitton
H. Dai
Roderick Murray-Smith
3DPC
114
11
0
21 Nov 2021
Rethinking Query, Key, and Value Embedding in Vision Transformer under Tiny Model Constraints
Jaesin Ahn
Jiuk Hong
Jeongwoo Ju
Heechul Jung
ViT
233
4
0
19 Nov 2021
INTERN: A New Learning Paradigm Towards General Vision
Jing Shao
Siyu Chen
Yangguang Li
Kun Wang
Zhen-fei Yin
...
F. Yu
Junjie Yan
Dahua Lin
Xiaogang Wang
Yu Qiao
237
39
0
16 Nov 2021
Attention Mechanisms in Computer Vision: A Survey
Computational Visual Media (CVM), 2021
Meng-Hao Guo
Tianhan Xu
Jiangjiang Liu
Zheng-Ning Liu
Peng-Tao Jiang
Tai-Jiang Mu
Song-Hai Zhang
Ralph Robert Martin
Ming-Ming Cheng
Shimin Hu
306
2,129
0
15 Nov 2021
Searching for TrioNet: Combining Convolution with Local and Global Self-Attention
British Machine Vision Conference (BMVC), 2021
Huaijin Pi
Huiyu Wang
Yingwei Li
Zizhang Li
Alan Yuille
ViT
181
3
0
15 Nov 2021
A Survey of Visual Transformers
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Peng Wang
Jianping Fan
Zhiqiang He
3DGS
ViT
473
487
0
11 Nov 2021
Sliced Recursive Transformer
European Conference on Computer Vision (ECCV), 2021
Zhiqiang Shen
Zechun Liu
Eric P. Xing
ViT
216
28
0
09 Nov 2021
Convolutional Gated MLP: Combining Convolutions & gMLP
A. Rajagopal
V. Nirmala
136
22
0
06 Nov 2021
Blending Anti-Aliasing into Vision Transformer
Neural Information Processing Systems (NeurIPS), 2021
Shengju Qian
Hao Shao
Yi Zhu
Mu Li
Jiaya Jia
213
23
0
28 Oct 2021
MVT: Multi-view Vision Transformer for 3D Object Recognition
British Machine Vision Conference (BMVC), 2021
Shuo Chen
Tan Yu
Ping Li
ViT
135
57
0
25 Oct 2021
CvT-ASSD: Convolutional vision-Transformer Based Attentive Single Shot MultiBox Detector
IEEE International Conference on Tools with Artificial Intelligence (ICTAI), 2021
Weiqiang Jin
Hang Yu
Xiangfeng Luo
ViT
115
16
0
24 Oct 2021