2106.04803
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Neural Information Processing Systems (NeurIPS), 2021
9 June 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
Papers citing
"CoAtNet: Marrying Convolution and Attention for All Data Sizes"
50 / 510 papers shown
EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction
Han Cai
Junyan Li
Muyan Hu
Chuang Gan
Song Han
338
83
0
29 May 2022
How Tempering Fixes Data Augmentation in Bayesian Neural Networks
International Conference on Machine Learning (ICML), 2022
Gregor Bachmann
Lorenzo Noci
Thomas Hofmann
BDL
AAML
262
11
0
27 May 2022
Fast Vision Transformers with HiLo Attention
Neural Information Processing Systems (NeurIPS), 2022
Zizheng Pan
Jianfei Cai
Bohan Zhuang
439
242
0
26 May 2022
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
Jihao Liu
Xin Huang
Jinliang Zheng
Yu Liu
Jiaming Song
293
79
0
26 May 2022
Inception Transformer
Neural Information Processing Systems (NeurIPS), 2022
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
335
253
0
25 May 2022
MoCoViT: Mobile Convolutional Vision Transformer
Hailong Ma
Xin Xia
Xing Wang
Xuefeng Xiao
Jiashi Li
Min Zheng
ViT
378
21
0
25 May 2022
Visualizing CoAtNet Predictions for Aiding Melanoma Detection
Engineering and Technology Journal (ETJ), 2022
Daniel Kvak
MedIm
186
3
0
21 May 2022
Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors
Neural Information Processing Systems (NeurIPS), 2022
Ravid Shwartz-Ziv
Micah Goldblum
Hossein Souri
Sanyam Kapoor
Chen Zhu
Yann LeCun
A. Wilson
UQCV
BDL
162
46
0
20 May 2022
TRT-ViT: TensorRT-oriented Vision Transformer
Xin Xia
Jiashi Li
Jie Wu
Xing Wang
Xuefeng Xiao
Min Zheng
Rui Wang
ViT
216
34
0
19 May 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
Shiyang Feng
Teli Ma
Jiaming Song
Ziyi Lin
Jifeng Dai
Yu Qiao
ViT
256
150
0
08 May 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
667
1,596
0
04 May 2022
MiCS: Near-linear Scaling for Training Gigantic Model on Public Cloud
Proceedings of the VLDB Endowment (PVLDB), 2022
Zhen Zhang
Shuai Zheng
Yida Wang
Justin Chiu
George Karypis
Trishul Chilimbi
Mu Li
Xin Jin
442
47
0
30 Apr 2022
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
Xianing Chen
Qiong Cao
Yujie Zhong
Jing Zhang
Shenghua Gao
Dacheng Tao
ViT
236
101
0
27 Apr 2022
TranSiam: Fusing Multimodal Visual Features Using Transformer for Medical Image Segmentation
Xia Li
Shiqiang Ma
Jijun Tang
Fei Guo
ViT
MedIm
79
12
0
26 Apr 2022
Investigating Neural Architectures by Synthetic Dataset Design
Adrien Courtois
Jean-Michel Morel
Pablo Arias
172
4
0
23 Apr 2022
VSA: Learning Varied-Size Window Attention in Vision Transformers
European Conference on Computer Vision (ECCV), 2022
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
218
65
0
18 Apr 2022
ResT V2: Simpler, Faster and Stronger
Neural Information Processing Systems (NeurIPS), 2022
Qing-Long Zhang
Yubin Yang
ViT
242
30
0
15 Apr 2022
Localization Distillation for Object Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Zhaohui Zheng
Rongguang Ye
Ping Wang
Dongwei Ren
Jun Wang
W. Zuo
Ming-Ming Cheng
215
79
0
12 Apr 2022
DaViT: Dual Attention Vision Transformers
European Conference on Computer Vision (ECCV), 2022
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
367
343
0
07 Apr 2022
Few-Shot Forecasting of Time-Series with Heterogeneous Channels
L. Brinkmeyer
Rafael Rêgo Drumond
Johannes Burchert
Lars Schmidt-Thieme
AI4TS
203
12
0
07 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
IEEE International Conference on Computer Vision (ICCV), 2022
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
237
66
0
06 Apr 2022
MaxViT: Multi-Axis Vision Transformer
European Conference on Computer Vision (ECCV), 2022
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
479
881
0
04 Apr 2022
Revisiting a kNN-based Image Classification System with High-capacity Storage
European Conference on Computer Vision (ECCV), 2022
K. Nakata
Youyang Ng
Daisuke Miyashita
A. Maki
Yu Lin
J. Deguchi
242
29
0
03 Apr 2022
InstaFormer: Instance-Aware Image-to-Image Translation with Transformer
Computer Vision and Pattern Recognition (CVPR), 2022
Soohyun Kim
Jongbeom Baek
Jihye Park
Gyeongnyeon Kim
Seung Wook Kim
ViT
301
58
0
30 Mar 2022
Affine Medical Image Registration with Coarse-to-Fine Vision Transformer
Computer Vision and Pattern Recognition (CVPR), 2022
Tony C. W. Mok
Albert C. S. Chung
ViT
MedIm
194
79
0
29 Mar 2022
Automated Progressive Learning for Efficient Training of Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
Changlin Li
Bohan Zhuang
Guangrun Wang
Xiaodan Liang
Xiaojun Chang
Yi Yang
250
54
0
28 Mar 2022
DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation
Machine Intelligence Research (MIR), 2022
Zhenyu Li
Zehui Chen
Xianming Liu
Junjun Jiang
ViT
MDE
178
225
1
27 Mar 2022
On the link between conscious function and general intelligence in humans and machines
Arthur Juliani
Kai Arulkumaran
Shuntaro Sasai
Ryota Kanai
275
27
0
24 Mar 2022
Deep Frequency Filtering for Domain Generalization
Computer Vision and Pattern Recognition (CVPR), 2022
Shiqi Lin
Zhizheng Zhang
Zhipeng Huang
Yan Lu
Cuiling Lan
...
Jiang Wang
Zicheng Liu
Amey Parulkar
V. Navkal
Zhibo Chen
254
68
0
23 Mar 2022
Symmetry-Based Representations for Artificial and Biological General Intelligence
Frontiers in Computational Neuroscience (Front. Comput. Neurosci.), 2022
I. Higgins
S. Racanière
Danilo Jimenez Rezende
AI4CE
250
52
0
17 Mar 2022
Stubborn: A Strong Baseline for Indoor Object Navigation
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
Haokuan Luo
Albert Yue
Zhang-Wei Hong
Pulkit Agrawal
280
57
0
14 Mar 2022
TransCAM: Transformer Attention-based CAM Refinement for Weakly Supervised Semantic Segmentation
Journal of Visual Communication and Image Representation (JVCIR), 2022
Ruiwen Li
Zheda Mai
C. Trabelsi
Zhibo Zhang
Jongseong Jang
Scott Sanner
ViT
189
75
0
14 Mar 2022
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong
Sameer Khurana
Andrew Rouditchenko
James R. Glass
VLM
176
32
0
13 Mar 2022
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
International Conference on Machine Learning (ICML), 2022
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
...
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
MoMe
728
1,281
1
10 Mar 2022
ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer
European Conference on Computer Vision (ECCV), 2022
Haokui Zhang
Wenze Hu
Xiaoyu Wang
ViT
330
75
0
08 Mar 2022
MetaFormer: A Unified Meta Framework for Fine-Grained Recognition
Qishuai Diao
Yi Jiang
Bin Wen
Jianxiang Sun
Zehuan Yuan
168
67
0
05 Mar 2022
ViT-P: Rethinking Data-efficient Vision Transformers from Locality
Bin Chen
Ran A. Wang
Di Ming
Xin Feng
ViT
80
7
0
04 Mar 2022
Aggregated Pyramid Vision Transformer: Split-transform-merge Strategy for Image Recognition without Convolutions
IEEE International Conference on Consumer Electronics (ICCE), 2022
Ruikang Ju
Ting-Yu Lin
Jen-Shiun Chiang
Jia-Hao Jian
Yu-Shian Lin
Liu-Rui-Yi Huang
ViT
137
2
0
02 Mar 2022
A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark
Yunhe Gao
Mu Zhou
Ding Liu
Zhennan Yan
Shaoting Zhang
Dimitris N. Metaxas
ViT
MedIm
697
95
0
28 Feb 2022
Hilbert Flattening: a Locality-Preserving Matrix Unfolding Method for Visual Discrimination
Qingsong Zhao
Shuguang Dou
Zhipeng Zhou
Yangguang Li
Yin Wang
Yu Qiao
Cairong Zhao
236
0
0
21 Feb 2022
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond
International Journal of Computer Vision (IJCV), 2022
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
ViT
275
272
0
21 Feb 2022
Visual Attention Network
Computational Visual Media (CVM), 2022
Meng-Hao Guo
Cheng-Ze Lu
Zheng-Ning Liu
Ming-Ming Cheng
Shi-Min Hu
ViT
VLM
470
869
0
20 Feb 2022
Mixture-of-Experts with Expert Choice Routing
Neural Information Processing Systems (NeurIPS), 2022
Yanqi Zhou
Tao Lei
Hanxiao Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
602
554
0
18 Feb 2022
How Do Vision Transformers Work?
International Conference on Learning Representations (ICLR), 2022
Namuk Park
Songkuk Kim
ViT
465
599
0
14 Feb 2022
KENN: Enhancing Deep Neural Networks by Leveraging Knowledge for Time Series Forecasting
M. A. Chattha
L. V. Elst
M. I. Malik
Andreas Dengel
Sheraz Ahmed
AI4TS
266
0
0
08 Feb 2022
Towards an Analytical Definition of Sufficient Data
SN Computer Science (SN Comput. Sci.), 2022
Adam Byerly
T. Kalganova
180
5
0
07 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
International Conference on Machine Learning (ICML), 2022
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
517
1,006
0
07 Feb 2022
Learning strides in convolutional neural networks
International Conference on Learning Representations (ICLR), 2022
Rachid Riad
O. Teboul
David Grangier
Neil Zeghidour
159
49
0
03 Feb 2022
Architecture Matters in Continual Learning
Seyed Iman Mirzadeh
Arslan Chaudhry
Dong Yin
Timothy Nguyen
Razvan Pascanu
Dilan Görür
Mehrdad Farajtabar
OOD
KELM
369
63
0
01 Feb 2022
Patches Are All You Need?
Asher Trockman
J. Zico Kolter
ViT
437
482
0
24 Jan 2022