2106.04803
CoAtNet: Marrying Convolution and Attention for All Data Sizes
9 June 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
Papers citing "CoAtNet: Marrying Convolution and Attention for All Data Sizes"
32 / 482 papers shown
Title
Attention Mechanisms in Computer Vision: A Survey
Meng-Hao Guo
Tianhan Xu
Jiangjiang Liu
Zheng-Ning Liu
Peng-Tao Jiang
Tai-Jiang Mu
Song-Hai Zhang
Ralph Robert Martin
Ming-Ming Cheng
Shimin Hu
19
1,633
0
15 Nov 2021
Local Multi-Head Channel Self-Attention for Facial Expression Recognition
Roberto Pecoraro
Valerio Basile
Viviana Bono
Sara Gallo
ViT
73
48
0
14 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
69
330
0
11 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Yinghui Li
Li Tao
Dun Liang
Haitao Zheng
79
96
0
07 Nov 2021
Grafting Transformer on Automatically Designed Convolutional Neural Network for Hyperspectral Image Classification
Xizhe Xue
Haokui Zhang
Bei Fang
Zongwen Bai
Ying Li
ViT
11
22
0
21 Oct 2021
HRFormer: High-Resolution Transformer for Dense Prediction
Yuhui Yuan
Rao Fu
Lang Huang
Weihong Lin
Chao Zhang
Xilin Chen
Jingdong Wang
ViT
24
226
0
18 Oct 2021
StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning
Jinghuan Shang
Kumara Kahatapitiya
Xiang Li
Michael S. Ryoo
OffRL
35
33
0
12 Oct 2021
Adversarial Token Attacks on Vision Transformers
Ameya Joshi
Gauri Jagatap
C. Hegde
ViT
30
19
0
08 Oct 2021
UniNet: Unified Architecture Search with Convolution, Transformer, and MLP
Jihao Liu
Hongsheng Li
Guanglu Song
Xin Huang
Yu Liu
ViT
29
35
0
08 Oct 2021
SIRe-Networks: Convolutional Neural Networks Architectural Extension for Information Preservation via Skip/Residual Connections and Interlaced Auto-Encoders
D. Avola
Luigi Cinque
Alessio Fagioli
G. Foresti
12
3
0
06 Oct 2021
Spectral Bias in Practice: The Role of Function Frequency in Generalization
Sara Fridovich-Keil
Raphael Gontijo-Lopes
Rebecca Roelofs
30
28
0
06 Oct 2021
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
Sachin Mehta
Mohammad Rastegari
ViT
189
1,212
0
05 Oct 2021
OH-Former: Omni-Relational High-Order Transformer for Person Re-Identification
Xianing Chen
Chunlin Xu
Qiong Cao
Jialang Xu
Yujie Zhong
Jiale Xu
Zhengxin Li
Jingya Wang
Shenghua Gao
ViT
69
18
0
23 Sep 2021
AutoInit: Analytic Signal-Preserving Weight Initialization for Neural Networks
G. Bingham
Risto Miikkulainen
ODL
24
4
0
18 Sep 2021
Beijing ZKJ-NPU Speaker Verification System for VoxCeleb Speaker Recognition Challenge 2021
Li Lyna Zhang
Huan Zhao
Qinling Meng
Yanli Chen
Min Liu
Lei Xie
22
10
0
08 Sep 2021
Design and Scaffolded Training of an Efficient DNN Operator for Computer Vision on the Edge
Vinod Ganesan
Pratyush Kumar
34
2
0
25 Aug 2021
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
Zirui Wang
Jiahui Yu
Adams Wei Yu
Zihang Dai
Yulia Tsvetkov
Yuan Cao
VLM
MLLM
51
779
0
24 Aug 2021
Monte Carlo DropBlock for Modelling Uncertainty in Object Detection
K. Deepshikha
Sai Harsha Yelleni
P. K. Srijith
C. Krishna Mohan
BDL
UQCV
27
87
0
08 Aug 2021
COVID-19 Pneumonia Severity Prediction using Hybrid Convolution-Attention Neural Architectures
Nam H. Nguyen
Jerome Chang
14
3
0
06 Jul 2021
What Makes for Hierarchical Vision Transformer?
Yuxin Fang
Xinggang Wang
Rui Wu
Wenyu Liu
ViT
11
9
0
05 Jul 2021
Encoder-Decoder Architectures for Clinically Relevant Coronary Artery Segmentation
João Lourenço Silva
M. Menezes
T. Rodrigues
B. Silva
F. Pinto
Arlindo L. Oliveira
MedIm
18
17
0
21 Jun 2021
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
34
613
0
18 Jun 2021
Scaling Vision Transformers
Xiaohua Zhai
Alexander Kolesnikov
N. Houlsby
Lucas Beyer
ViT
23
1,058
0
08 Jun 2021
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Jiangning Zhang
Chao Xu
Jian Li
Wenzhou Chen
Yabiao Wang
Ying Tai
Shuo Chen
Chengjie Wang
Feiyue Huang
Yong Liu
27
22
0
31 May 2021
Adversarial Robustness against Multiple and Single $l_p$-Threat Models via Quick Fine-Tuning of Robust Classifiers
Francesco Croce
Matthias Hein
OOD
AAML
20
18
0
26 May 2021
Visformer: The Vision-friendly Transformer
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
109
209
0
26 Apr 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
266
3,622
0
24 Feb 2021
Conditional Positional Encodings for Vision Transformers
Xiangxiang Chu
Zhi Tian
Bo-Wen Zhang
Xinlong Wang
Chunhua Shen
ViT
20
602
0
22 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
265
179
0
17 Feb 2021
High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock
Soham De
Samuel L. Smith
Karen Simonyan
VLM
223
512
0
11 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
270
979
0
27 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
227
2,428
0
04 Jan 2021