2106.04803
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Neural Information Processing Systems (NeurIPS), 2021
9 June 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
Papers citing
"CoAtNet: Marrying Convolution and Attention for All Data Sizes"
50 / 510 papers shown
Title
Sparse Double Descent in Vision Transformers: real or phantom threat?
International Conference on Image Analysis and Processing (ICIAP), 2023
Victor Quétu
Marta Milovanović
Enzo Tartaglione
293
2
0
26 Jul 2023
Adaptive Frequency Filters As Efficient Global Token Mixers
IEEE International Conference on Computer Vision (ICCV), 2023
Zhipeng Huang
Zhizheng Zhang
Cuiling Lan
Zhengjun Zha
Yan Lu
B. Guo
174
77
0
26 Jul 2023
ARC-NLP at Multimodal Hate Speech Event Detection 2023: Multimodal Methods Boosted by Ensemble Learning, Syntactical and Entity Features
CASE (CASE), 2023
Umitcan Sahin
Izzet Emre Kucukkaya
Oguzhan Ozcelik
Cagri Toraman
134
13
0
25 Jul 2023
Robust face anti-spoofing framework with Convolutional Vision Transformer
IEEE International Conference on Image Processing (ICIP), 2023
Yunseung Lee
Youngjun Kwak
Jinho Shin
CVBM
ViT
153
6
0
24 Jul 2023
Bone mineral density estimation from a plain X-ray image by learning decomposition into projections of bone-segmented computed tomography
Yidong Gu
Yoshito Otake
Keisuke Uemura
Mazen Soufi
Masaki Takao
Hugues Talbot
S. Okada
Nobuhiko Sugano
Yoshinobu Sato
OOD
137
14
0
21 Jul 2023
Meta-Transformer: A Unified Framework for Multimodal Learning
Yiyuan Zhang
Kaixiong Gong
Kaipeng Zhang
Jiaming Song
Yu Qiao
Wanli Ouyang
Xiangyu Yue
172
181
0
20 Jul 2023
RetouchingFFHQ: A Large-scale Dataset for Fine-grained Face Retouching Detection
ACM Multimedia (ACM MM), 2023
Qichao Ying
Jiaxin Liu
Sheng Li
Haisheng Xu
Zhenxing Qian
Xinpeng Zhang
CVBM
188
13
0
20 Jul 2023
RepViT: Revisiting Mobile CNN From ViT Perspective
Computer Vision and Pattern Recognition (CVPR), 2023
Ao Wang
Hui Chen
Zijia Lin
Hengjun Pu
Guiguang Ding
414
421
0
18 Jul 2023
Scale-Aware Modulation Meet Transformer
IEEE International Conference on Computer Vision (ICCV), 2023
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
254
125
0
17 Jul 2023
Vision Language Transformers: A Survey
Clayton Fields
C. Kennington
VLM
154
7
0
06 Jul 2023
EgoCOL: Egocentric Camera pose estimation for Open-world 3D object Localization @Ego4D challenge 2023
Cristhian Forigua
María Escobar
Jordi Pont-Tuset
Kevis-Kokitsi Maninis
Pablo Arbelaez
EgoV
234
2
0
29 Jun 2023
Unfolding Framework with Prior of Convolution-Transformer Mixture and Uncertainty Estimation for Video Snapshot Compressive Imaging
IEEE International Conference on Computer Vision (ICCV), 2023
Siming Zheng
Xin Yuan
ViT
MedIm
113
6
0
20 Jun 2023
Reviving Shift Equivariance in Vision Transformers
Peijian Ding
Davit Soselia
Thomas Armstrong
Jiahao Su
Furong Huang
217
10
0
13 Jun 2023
2-D SSM: A General Spatial Layer for Visual Transformers
Ethan Baron
Itamar Zimerman
Lior Wolf
184
22
0
11 Jun 2023
FasterViT: Fast Vision Transformers with Hierarchical Attention
International Conference on Learning Representations (ICLR), 2023
Ali Hatamizadeh
Greg Heinrich
Hongxu Yin
Andrew Tao
J. Álvarez
Jan Kautz
Pavlo Molchanov
ViT
339
106
0
09 Jun 2023
Multi-Architecture Multi-Expert Diffusion Models
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yunsung Lee
Jin-Young Kim
Hyojun Go
Myeongho Jeong
Shinhyeok Oh
Seungtaek Choi
DiffM
310
37
0
08 Jun 2023
DeltaNN: Assessing the Impact of Computational Environment Parameters on the Performance of Image Recognition Models
IEEE International Conference on Software Maintenance and Evolution (ICSME), 2023
Nikolaos Louloudakis
Perry Gibson
José Cano
A. Rajan
345
8
0
05 Jun 2023
GAT-GAN : A Graph-Attention-based Time-Series Generative Adversarial Network
Srikrishna Iyer
Teck-Hou Teng
AI4TS
115
3
0
03 Jun 2023
Brainformers: Trading Simplicity for Efficiency
International Conference on Machine Learning (ICML), 2023
Yan-Quan Zhou
Nan Du
Yanping Huang
Daiyi Peng
Chang Lan
...
Zhifeng Chen
Quoc V. Le
Claire Cui
James Laudon
J. Dean
MoE
215
35
0
29 May 2023
Manifold Regularization for Memory-Efficient Training of Deep Neural Networks
Shadi Sartipi
Edgar A. Bernal
107
0
0
26 May 2023
BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks
Nature Network Boston (NNB), 2023
Kai Zhang
Jun Yu
Eashan Adhikarla
Rong Zhou
Zhilin Yan
...
Hang Zhang
Yong Chen
Shijie Zhao
Hongfang Liu
Lichao Sun
LM&MA
MedIm
236
11
0
26 May 2023
Improving Position Encoding of Transformers for Multivariate Time Series Classification
Data mining and knowledge discovery (DMKD), 2023
Navid Mohammadi Foumani
Chang Wei Tan
Geoffrey I. Webb
Mahsa Salehi
AI4TS
188
136
0
26 May 2023
ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers
Information Fusion (Inf. Fusion), 2023
J. Yao
Xinggang Wang
Shusheng Yang
Baoyuan Wang
ViT
218
85
0
24 May 2023
Dual Path Transformer with Partition Attention
Zhengkai Jiang
Liang Liu
Jiangning Zhang
Yabiao Wang
Mingang Chen
Chengjie Wang
ViT
216
2
0
24 May 2023
Evolution: A Unified Formula for Feature Operators from a High-level Perspective
Zhicheng Cai
116
2
0
23 May 2023
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design
Neural Information Processing Systems (NeurIPS), 2023
Ibrahim Alabdulmohsin
Xiaohua Zhai
Alexander Kolesnikov
Lucas Beyer
VLM
534
85
0
22 May 2023
Low-Earth Satellite Orbit Determination Using Deep Convolutional Networks with Satellite Imagery
Rohit Khorana
49
1
0
20 May 2023
GELU Activation Function in Deep Learning: A Comprehensive Mathematical Analysis and Performance
Minhyeok Lee
152
49
0
20 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
467
151
0
18 May 2023
Mimetic Initialization of Self-Attention Layers
International Conference on Machine Learning (ICML), 2023
Asher Trockman
J. Zico Kolter
206
43
0
16 May 2023
MaxViT-UNet: Multi-Axis Attention for Medical Image Segmentation
Abdul Rehman Khan
Asifullah Khan
ViT
MedIm
293
19
0
15 May 2023
OneCAD: One Classifier for All image Datasets using multimodal learning
S. Wadekar
Eugenio Culurciello
269
0
0
11 May 2023
EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention
Computer Vision and Pattern Recognition (CVPR), 2023
Xinyu Liu
Houwen Peng
Ningxin Zheng
Yuqing Yang
Han Hu
Yixuan Yuan
ViT
209
549
0
11 May 2023
Musketeer: Joint Training for Multi-task Vision Language Model with Task Explanation Prompts
Zhaoyang Zhang
Yantao Shen
Kunyu Shi
Zhaowei Cai
Jun Fang
Siqi Deng
Hao Yang
Davide Modolo
Zhuowen Tu
Stefano Soatto
VLM
202
3
0
11 May 2023
A Survey on the Robustness of Computer Vision Models against Common Corruptions
Shunxin Wang
Raymond N. J. Veldhuis
Christoph Brune
N. Strisciuglio
OOD
VLM
531
22
0
10 May 2023
InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language
Zhaoyang Liu
Yinan He
Wenhai Wang
Weiyun Wang
Yi Wang
...
Yali Wang
Limin Wang
Ping Luo
Jifeng Dai
Yu Qiao
LRM
MLLM
336
105
0
09 May 2023
What Do Self-Supervised Vision Transformers Learn?
International Conference on Learning Representations (ICLR), 2023
Namuk Park
Wonjae Kim
Byeongho Heo
Taekyung Kim
Sangdoo Yun
SSL
269
102
1
01 May 2023
Vision Conformer: Incorporating Convolutions into Vision Transformer Layers
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2023
Brian Kenji Iwana
Akihiro Kusuda
ViT
155
2
0
27 Apr 2023
Omni Aggregation Networks for Lightweight Image Super-Resolution
Computer Vision and Pattern Recognition (CVPR), 2023
Hang Wang
Xuanhong Chen
Bingbing Ni
Yutian Liu
Jinfan Liu
SupR
195
120
0
20 Apr 2023
CAD-RADS scoring of coronary CT angiography with Multi-Axis Vision Transformer: a clinically-inspired deep learning pipeline
Alessia Gerbasi
A. Dagliati
Giuseppe Albi
M. Chiesa
D. Andreini
A. Baggiano
S. Mushtaq
G. Pontone
Riccardo Bellazzi
G. Colombo
MedIm
91
8
0
14 Apr 2023
Dynamic Mobile-Former: Strengthening Dynamic Convolution with Attention and Residual Connection in Kernel Space
Seokju Yun
Youngmin Ro
ViT
130
2
0
13 Apr 2023
RSIR Transformer: Hierarchical Vision Transformer using Random Sampling Windows and Important Region Windows
Zhemin Zhang
Xun Gong
ViT
114
1
0
13 Apr 2023
ViT-Calibrator: Decision Stream Calibration for Vision Transformer
AAAI Conference on Artificial Intelligence (AAAI), 2023
Lin Chen
Zhijie Jia
Tian Qiu
Lechao Cheng
Jie Lei
Zunlei Feng
Min-Gyoo Song
283
3
0
10 Apr 2023
PSLT: A Light-weight Vision Transformer with Ladder Self-Attention and Progressive Shift
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Gaojie Wu
Weishi Zheng
Yutong Lu
Q. Tian
ViT
187
22
0
07 Apr 2023
MULLER: Multilayer Laplacian Resizer for Vision
IEEE International Conference on Computer Vision (ICCV), 2023
Zhengzhong Tu
P. Milanfar
Hossein Talebi
190
7
0
06 Apr 2023
Neuroevolution of Recurrent Architectures on Control Tasks
Maximilien Le Clei
Pierre C. Bellec
65
5
0
03 Apr 2023
Astroformer: More Data Might not be all you need for Classification
Rishit Dagli
341
11
0
03 Apr 2023
Anatomically aware dual-hop learning for pulmonary embolism detection in CT pulmonary angiograms
Florin Condrea
S. Rapaka
Lucian Itu
Puneet Sharma
J. Sperl
Mohamed Ali
Marius Leordeanu
129
8
0
30 Mar 2023
Vision Transformer with Quadrangle Attention
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Qiming Zhang
Jing Zhang
Yufei Xu
Dacheng Tao
ViT
162
58
0
27 Mar 2023
Spatio-Temporal driven Attention Graph Neural Network with Block Adjacency matrix (STAG-NN-BA)
U. Nazir
W. Islam
M. Taj
223
3
0
25 Mar 2023