2106.04803
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Neural Information Processing Systems (NeurIPS), 2021
9 June 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
Papers citing
"CoAtNet: Marrying Convolution and Attention for All Data Sizes"
50 / 510 papers shown
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
IEEE International Conference on Computer Vision (ICCV), 2023
Pavan Kumar Anasosalu Vasu
J. Gabriel
Jeff J. Zhu
Oncel Tuzel
Anurag Ranjan
ViT
336
288
0
24 Mar 2023
Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck
Computer Vision and Pattern Recognition (CVPR), 2023
Jongheon Jeong
Sihyun Yu
Hankook Lee
Jinwoo Shin
AAML
176
1
0
24 Mar 2023
Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR
Computer Vision and Pattern Recognition (CVPR), 2023
Aneeshan Sain
A. Bhunia
Subhadeep Koley
Pinaki Nath Chowdhury
Soumitri Chattopadhyay
Tao Xiang
Yi-Zhe Song
288
25
0
24 Mar 2023
CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation
Computer Vision and Pattern Recognition (CVPR), 2023
Seokju Cho
Heeseong Shin
Sung‐Jin Hong
Anurag Arnab
Paul Hongsuck Seo
Seung Wook Kim
VLM
355
181
0
21 Mar 2023
Large AI Models in Health Informatics: Applications, Challenges, and the Future
IEEE Journal of Biomedical and Health Informatics (IEEE JBHI), 2023
Jianing Qiu
Lin Li
Jiankai Sun
Jiachuan Peng
Peilun Shi
...
Bo Xiao
Wu Yuan
Ningli Wang
Dong Xu
Benny Lo
AI4MH
LM&MA
280
184
0
21 Mar 2023
Convolutions, Transformers, and their Ensembles for the Segmentation of Organs at Risk in Radiation Treatment of Cervical Cancer
Vangelis Kostoulas
Peter A. N. Bosman
Tanja Alderliesten
UQCV
101
2
0
20 Mar 2023
Cross-Modal Causal Intervention for Medical Report Generation
IEEE Transactions on Image Processing (IEEE TIP), 2023
Weixing Chen
Yang-Yang Liu
Ce Wang
Jiarui Zhu
Shen Zhao
Guanbin Li
Cheng-Lin Liu
328
7
0
16 Mar 2023
CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Wenxiao Wang
Wei Chen
Qibo Qiu
Long Chen
Boxi Wu
Binbin Lin
Xiaofei He
Wei Liu
215
93
0
13 Mar 2023
HyT-NAS: Hybrid Transformers Neural Architecture Search for Edge Devices
Lotfi Abdelkrim Mecharbat
Hadjer Benmeziane
Hamza Ouarnoughi
Smail Niar
ViT
126
6
0
08 Mar 2023
FFT-based Dynamic Token Mixer for Vision
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yuki Tatsunami
Masato Taki
303
53
0
07 Mar 2023
Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks
Computer Vision and Pattern Recognition (CVPR), 2023
Jierun Chen
Shiu-hong Kao
Hao He
Weipeng Zhuo
Song Wen
Chul-Ho Lee
Shueng-Han Gary Chan
OOD
344
1,486
0
07 Mar 2023
Pyramid Pixel Context Adaption Network for Medical Image Classification with Supervised Contrastive Learning
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Xiaoqin Zhang
Zunjie Xiao
Xiao Wu
Jiansheng Fang
Junyong Shen
Yan Hu
Jiang-Dong Liu
275
27
0
03 Mar 2023
Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models
Neural Information Processing Systems (NeurIPS), 2023
Naman D. Singh
Francesco Croce
Matthias Hein
OOD
366
93
0
03 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
Paria Mehrani
John K. Tsotsos
228
37
0
02 Mar 2023
Image as Set of Points
International Conference on Learning Representations (ICLR), 2023
Xu Ma
Yuqian Zhou
Huan Wang
Can Qin
Bin Sun
Chang Liu
Yun Fu
VLM
186
65
0
02 Mar 2023
OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System
Chao Xue
Wen Liu
Shunxing Xie
Zhenfang Wang
Jiaxing Li
...
Shi-Yong Chen
Yibing Zhan
Jing Zhang
Chaoyue Wang
Dacheng Tao
232
2
0
01 Mar 2023
Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
Computer Vision and Pattern Recognition (CVPR), 2023
Guozhen Zhang
Yuhan Zhu
Hongya Wang
Youxin Chen
Gangshan Wu
Limin Wang
204
135
0
01 Mar 2023
Teaching CLIP to Count to Ten
IEEE International Conference on Computer Vision (ICCV), 2023
Roni Paiss
Ariel Ephrat
Omer Tov
Shiran Zada
Inbar Mosseri
Michal Irani
Tali Dekel
VLM
CLIP
472
160
0
23 Feb 2023
Deep Active Learning in the Presence of Label Noise: A Survey
Moseli Mots'oehli
Kyungim Baek
NoLa
VLM
282
5
0
22 Feb 2023
Device Tuning for Multi-Task Large Model
Penghao Jiang
Xuanchen Hou
Y. Zhou
92
0
0
21 Feb 2023
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Jiang Liu
Hui Ding
Zhaowei Cai
Yuting Zhang
R. Satzoda
Vijay Mahadevan
R. Manmatha
ObjD
307
180
0
14 Feb 2023
Symbolic Discovery of Optimization Algorithms
Neural Information Processing Systems (NeurIPS), 2023
Xiangning Chen
Chen Liang
Da Huang
Esteban Real
Kaiyuan Wang
...
Xuanyi Dong
Thang Luong
Cho-Jui Hsieh
Yifeng Lu
Quoc V. Le
774
513
0
13 Feb 2023
A Unified View of Long-Sequence Models towards Modeling Million-Scale Dependencies
Hongyu Hè
Marko Kabić
273
2
0
13 Feb 2023
Quantum Neuron Selection: Finding High Performing Subnetworks With Quantum Algorithms
Tim Whitaker
177
3
0
12 Feb 2023
Team Triple-Check at Factify 2: Parameter-Efficient Large Foundation Models with Feature Representations for Multi-Modal Fact Verification
Wei-Wei Du
Hongfa Wu
Wei-Yao Wang
Chao-Han Huck Yang
170
9
0
12 Feb 2023
Knowledge Distillation in Vision Transformers: A Critical Review
Gousia Habib
Tausifa Jan Saleem
Brejesh Lall
288
23
0
04 Feb 2023
DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition
IEEE Transactions on Multimedia (IEEE TMM), 2023
Jiayu Jiao
Yuyao Tang
Kun-Li Channing Lin
Yipeng Gao
Jinhua Ma
Yaowei Wang
Wei-Shi Zheng
MedIm
ViT
246
249
0
03 Feb 2023
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
International Conference on Machine Learning (ICML), 2023
Max Ryabinin
Tim Dettmers
Michael Diskin
Alexander Borzunov
MoE
356
55
0
27 Jan 2023
Progressive Meta-Pooling Learning for Lightweight Image Classification Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Peijie Dong
Xin-Yi Niu
Zhiliang Tian
Lujun Li
Xiaodong Wang
Zimian Wei
H. Pan
Dongsheng Li
VLM
170
8
0
24 Jan 2023
A Structural Approach to the Design of Domain Specific Neural Network Architectures
Gerrit Nolte
145
0
0
23 Jan 2023
Efficient Activation Function Optimization through Surrogate Modeling
Neural Information Processing Systems (NeurIPS), 2023
G. Bingham
Risto Miikkulainen
374
7
0
13 Jan 2023
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
Computer Vision and Pattern Recognition (CVPR), 2023
Sanghyun Woo
Shoubhik Debnath
Ronghang Hu
Xinlei Chen
Zhuang Liu
In So Kweon
Saining Xie
SyDa
424
1,289
0
02 Jan 2023
Exploring Vision Transformers as Diffusion Learners
He Cao
Jianan Wang
Tianhe Ren
Xianbiao Qi
Yihao Chen
Xingtai Lv
Guang Dai
164
11
0
28 Dec 2022
Representation Separation for Semantic Segmentation with Vision Transformers
Yuanduo Hong
Huihui Pan
Weichao Sun
Xinghu Yu
Huijun Gao
ViT
179
5
0
28 Dec 2022
MixupE: Understanding and Improving Mixup from Directional Derivative Perspective
Conference on Uncertainty in Artificial Intelligence (UAI), 2022
Yingtian Zou
Vikas Verma
Sarthak Mittal
Wai Hoh Tang
Hieu H. Pham
Juho Kannala
Yoshua Bengio
Arno Solin
Kenji Kawaguchi
UQCV
560
10
0
27 Dec 2022
A Close Look at Spatial Modeling: From Attention to Convolution
Xu Ma
Huan Wang
Can Qin
Kunpeng Li
Xing Zhao
Jie Fu
Yun Fu
ViT
3DPC
158
13
0
23 Dec 2022
What Makes for Good Tokenizers in Vision Transformer?
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Shengju Qian
Yi Zhu
Wenbo Li
Mu Li
Jiaya Jia
ViT
219
17
0
21 Dec 2022
Universal Object Detection with Large Vision Model
International Journal of Computer Vision (IJCV), 2022
Feng-Huei Lin
Wenze Hu
Yaowei Wang
Yonghong Tian
Guangming Lu
Fanglin Chen
Yong-mei Xu
Xiaoyu Wang
VLM
ObjD
281
9
0
19 Dec 2022
Rethinking Vision Transformers for MobileNet Size and Speed
IEEE International Conference on Computer Vision (ICCV), 2022
Yanyu Li
Ju Hu
Yang Wen
Georgios Evangelidis
Kamyar Salahi
Yanzhi Wang
Sergey Tulyakov
Jian Ren
ViT
368
260
0
15 Dec 2022
Comparing the Decision-Making Mechanisms by Transformers and CNNs via Explanation Methods
Computer Vision and Pattern Recognition (CVPR), 2022
Ming-Xiu Jiang
Saeed Khorram
Li Fuxin
FAtt
464
15
0
13 Dec 2022
What do Vision Transformers Learn? A Visual Exploration
Amin Ghiasi
Hamid Kazemi
Eitan Borgnia
Steven Reich
Manli Shu
Micah Goldblum
A. Wilson
Tom Goldstein
ViT
259
79
0
13 Dec 2022
OAMixer: Object-aware Mixing Layer for Vision Transformers
H. Kang
Sangwoo Mo
Jinwoo Shin
VLM
260
5
0
13 Dec 2022
Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning
Computer Vision and Pattern Recognition (CVPR), 2022
Jishnu Mukhoti
Tsung-Yu Lin
Omid Poursaeed
Rui Wang
Ashish Shah
Juil Sock
Ser-Nam Lim
VLM
263
117
0
09 Dec 2022
Deep Incubation: Training Large Models by Divide-and-Conquering
IEEE International Conference on Computer Vision (ICCV), 2022
Zanlin Ni
Yulin Wang
Jiangwei Yu
Haojun Jiang
Yu Cao
Gao Huang
VLM
239
13
0
08 Dec 2022
MixBoost: Improving the Robustness of Deep Neural Networks by Boosting Data Augmentation
Zhendong Liu
Wenyu Jiang
Min Guo
Chongjun Wang
AAML
253
1
0
08 Dec 2022
Lightweight Structure-Aware Attention for Visual Understanding
International Journal of Computer Vision (IJCV), 2022
Heeseung Kwon
F. M. Castro
M. Marín-Jiménez
N. Guil
Alahari Karteek
197
3
0
29 Nov 2022
Minimal Width for Universal Property of Deep RNN
Journal of Machine Learning Research (JMLR), 2022
Changhoon Song
Geonho Hwang
Jun ho Lee
Myung-joo Kang
178
13
0
25 Nov 2022
Fast-iTPN: Integrally Pre-Trained Transformer Pyramid Network with Token Migration
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Yunjie Tian
Lingxi Xie
Jihao Qiu
Jianbin Jiao
Yaowei Wang
Qi Tian
Qixiang Ye
ViT
196
19
0
23 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Qibin Hou
Cheng Lu
Ming-Ming Cheng
Jiashi Feng
ViT
230
214
0
22 Nov 2022
Vision Transformer with Super Token Sampling
Huaibo Huang
Xiaoqiang Zhou
Jie Cao
Xiao-Yu Zhang
Tieniu Tan
ViT
281
96
0
21 Nov 2022