CoAtNet: Marrying Convolution and Attention for All Data Sizes

Neural Information Processing Systems (NeurIPS), 2021
9 June 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
    ViT

Papers citing "CoAtNet: Marrying Convolution and Attention for All Data Sizes"

50 / 510 papers shown
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
IEEE International Conference on Computer Vision (ICCV), 2023
Pavan Kumar Anasosalu Vasu
J. Gabriel
Jeff J. Zhu
Oncel Tuzel
Anurag Ranjan
ViT
336
288
0
24 Mar 2023
Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck
Computer Vision and Pattern Recognition (CVPR), 2023
Jongheon Jeong
Sihyun Yu
Hankook Lee
Jinwoo Shin
AAML
176
1
0
24 Mar 2023
Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR
Computer Vision and Pattern Recognition (CVPR), 2023
Aneeshan Sain
A. Bhunia
Subhadeep Koley
Pinaki Nath Chowdhury
Soumitri Chattopadhyay
Tao Xiang
Yi-Zhe Song
288
25
0
24 Mar 2023
CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation
Computer Vision and Pattern Recognition (CVPR), 2023
Seokju Cho
Heeseong Shin
Sung‐Jin Hong
Anurag Arnab
Paul Hongsuck Seo
Seung Wook Kim
VLM
355
181
0
21 Mar 2023
Large AI Models in Health Informatics: Applications, Challenges, and the Future
IEEE Journal of Biomedical and Health Informatics (IEEE JBHI), 2023
Jianing Qiu
Lin Li
Jiankai Sun
Jiachuan Peng
Peilun Shi
...
Bo Xiao
Wu Yuan
Ningli Wang
Dong Xu
Benny Lo
AI4MH
LM&MA
280
184
0
21 Mar 2023
Convolutions, Transformers, and their Ensembles for the Segmentation of Organs at Risk in Radiation Treatment of Cervical Cancer
Vangelis Kostoulas
Peter A. N. Bosman
Tanja Alderliesten
UQCV
101
2
0
20 Mar 2023
Cross-Modal Causal Intervention for Medical Report Generation
IEEE Transactions on Image Processing (IEEE TIP), 2023
Weixing Chen
Yang-Yang Liu
Ce Wang
Jiarui Zhu
Shen Zhao
Guanbin Li
Cheng-Lin Liu
328
7
0
16 Mar 2023
CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Wenxiao Wang
Wei Chen
Qibo Qiu
Long Chen
Boxi Wu
Binbin Lin
Xiaofei He
Wei Liu
215
93
0
13 Mar 2023
HyT-NAS: Hybrid Transformers Neural Architecture Search for Edge Devices
Lotfi Abdelkrim Mecharbat
Hadjer Benmeziane
Hamza Ouarnoughi
Smail Niar
ViT
126
6
0
08 Mar 2023
FFT-based Dynamic Token Mixer for Vision
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yuki Tatsunami
Masato Taki
303
53
0
07 Mar 2023
Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks
Computer Vision and Pattern Recognition (CVPR), 2023
Jierun Chen
Shiu-hong Kao
Hao He
Weipeng Zhuo
Song Wen
Chul-Ho Lee
Shueng-Han Gary Chan
OOD
344
1,486
0
07 Mar 2023
Pyramid Pixel Context Adaption Network for Medical Image Classification with Supervised Contrastive Learning
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Xiaoqin Zhang
Zunjie Xiao
Xiao Wu
Jiansheng Fang
Junyong Shen
Yan Hu
Jiang-Dong Liu
275
27
0
03 Mar 2023
Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models
Neural Information Processing Systems (NeurIPS), 2023
Naman D. Singh
Francesco Croce
Matthias Hein
OOD
366
93
0
03 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
Paria Mehrani
John K. Tsotsos
228
37
0
02 Mar 2023
Image as Set of Points
International Conference on Learning Representations (ICLR), 2023
Xu Ma
Yuqian Zhou
Huan Wang
Can Qin
Bin Sun
Chang Liu
Yun Fu
VLM
186
65
0
02 Mar 2023
OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System
Chao Xue
Wen Liu
Shunxing Xie
Zhenfang Wang
Jiaxing Li
...
Shi-Yong Chen
Yibing Zhan
Jing Zhang
Chaoyue Wang
Dacheng Tao
232
2
0
01 Mar 2023
Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
Computer Vision and Pattern Recognition (CVPR), 2023
Guozhen Zhang
Yuhan Zhu
Hongya Wang
Youxin Chen
Gangshan Wu
Limin Wang
204
135
0
01 Mar 2023
Teaching CLIP to Count to Ten
IEEE International Conference on Computer Vision (ICCV), 2023
Roni Paiss
Ariel Ephrat
Omer Tov
Shiran Zada
Inbar Mosseri
Michal Irani
Tali Dekel
VLM
CLIP
472
160
0
23 Feb 2023
Deep Active Learning in the Presence of Label Noise: A Survey
Moseli Motsóehli
Kyungim Baek
NoLa
VLM
282
5
0
22 Feb 2023
Device Tuning for Multi-Task Large Model
Penghao Jiang
Xuanchen Hou
Y. Zhou
92
0
0
21 Feb 2023
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Jiang Liu
Hui Ding
Zhaowei Cai
Yuting Zhang
R. Satzoda
Vijay Mahadevan
R. Manmatha
ObjD
307
180
0
14 Feb 2023
Symbolic Discovery of Optimization Algorithms
Neural Information Processing Systems (NeurIPS), 2023
Xiangning Chen
Chen Liang
Da Huang
Esteban Real
Kaiyuan Wang
...
Xuanyi Dong
Thang Luong
Cho-Jui Hsieh
Yifeng Lu
Quoc V. Le
774
513
0
13 Feb 2023
A Unified View of Long-Sequence Models towards Modeling Million-Scale Dependencies
Hongyu Hè
Marko Kabić
273
2
0
13 Feb 2023
Quantum Neuron Selection: Finding High Performing Subnetworks With Quantum Algorithms
Tim Whitaker
177
3
0
12 Feb 2023
Team Triple-Check at Factify 2: Parameter-Efficient Large Foundation Models with Feature Representations for Multi-Modal Fact Verification
Wei-Wei Du
Hongfa Wu
Wei-Yao Wang
Chao-Han Huck Yang
170
9
0
12 Feb 2023
Knowledge Distillation in Vision Transformers: A Critical Review
Gousia Habib
Tausifa Jan Saleem
Brejesh Lall
288
23
0
04 Feb 2023
DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition
IEEE Transactions on Multimedia (IEEE TMM), 2023
Jiayu Jiao
Yuyao Tang
Kun-Li Channing Lin
Yipeng Gao
Jinhua Ma
Yaowei Wang
Wei-Shi Zheng
MedIm
ViT
246
249
0
03 Feb 2023
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
International Conference on Machine Learning (ICML), 2023
Max Ryabinin
Tim Dettmers
Michael Diskin
Alexander Borzunov
MoE
356
55
0
27 Jan 2023
Progressive Meta-Pooling Learning for Lightweight Image Classification Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Peijie Dong
Xin-Yi Niu
Zhiliang Tian
Lujun Li
Xiaodong Wang
Zimian Wei
H. Pan
Dongsheng Li
VLM
170
8
0
24 Jan 2023
A Structural Approach to the Design of Domain Specific Neural Network Architectures
Gerrit Nolte
145
0
0
23 Jan 2023
Efficient Activation Function Optimization through Surrogate Modeling
Neural Information Processing Systems (NeurIPS), 2023
G. Bingham
Risto Miikkulainen
374
7
0
13 Jan 2023
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
Computer Vision and Pattern Recognition (CVPR), 2023
Sanghyun Woo
Shoubhik Debnath
Ronghang Hu
Xinlei Chen
Zhuang Liu
In So Kweon
Saining Xie
SyDa
424
1,289
0
02 Jan 2023
Exploring Vision Transformers as Diffusion Learners
He Cao
Jianan Wang
Tianhe Ren
Xianbiao Qi
Yihao Chen
Xingtai Lv
Guang Dai
164
11
0
28 Dec 2022
Representation Separation for Semantic Segmentation with Vision Transformers
Yuanduo Hong
Huihui Pan
Weichao Sun
Xinghu Yu
Huijun Gao
ViT
179
5
0
28 Dec 2022
MixupE: Understanding and Improving Mixup from Directional Derivative Perspective
Conference on Uncertainty in Artificial Intelligence (UAI), 2022
Yingtian Zou
Vikas Verma
Sarthak Mittal
Wai Hoh Tang
Hieu H. Pham
Juho Kannala
Yoshua Bengio
Arno Solin
Kenji Kawaguchi
UQCV
560
10
0
27 Dec 2022
A Close Look at Spatial Modeling: From Attention to Convolution
Xu Ma
Huan Wang
Can Qin
Kunpeng Li
Xing Zhao
Jie Fu
Yun Fu
ViT
3DPC
158
13
0
23 Dec 2022
What Makes for Good Tokenizers in Vision Transformer?
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Shengju Qian
Yi Zhu
Wenbo Li
Mu Li
Jiaya Jia
ViT
219
17
0
21 Dec 2022
Universal Object Detection with Large Vision Model
International Journal of Computer Vision (IJCV), 2022
Feng-Huei Lin
Wenze Hu
Yaowei Wang
Yonghong Tian
Guangming Lu
Fanglin Chen
Yong-mei Xu
Xiaoyu Wang
VLM
ObjD
281
9
0
19 Dec 2022
Rethinking Vision Transformers for MobileNet Size and Speed
IEEE International Conference on Computer Vision (ICCV), 2022
Yanyu Li
Ju Hu
Yang Wen
Georgios Evangelidis
Kamyar Salahi
Yanzhi Wang
Sergey Tulyakov
Jian Ren
ViT
368
260
0
15 Dec 2022
Comparing the Decision-Making Mechanisms by Transformers and CNNs via Explanation Methods
Computer Vision and Pattern Recognition (CVPR), 2022
Ming-Xiu Jiang
Saeed Khorram
Li Fuxin
FAtt
464
15
0
13 Dec 2022
What do Vision Transformers Learn? A Visual Exploration
Amin Ghiasi
Hamid Kazemi
Eitan Borgnia
Steven Reich
Manli Shu
Micah Goldblum
A. Wilson
Tom Goldstein
ViT
259
79
0
13 Dec 2022
OAMixer: Object-aware Mixing Layer for Vision Transformers
H. Kang
Sangwoo Mo
Jinwoo Shin
VLM
260
5
0
13 Dec 2022
Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning
Computer Vision and Pattern Recognition (CVPR), 2022
Jishnu Mukhoti
Tsung-Yu Lin
Omid Poursaeed
Rui Wang
Ashish Shah
Juil Sock
Ser-Nam Lim
VLM
263
117
0
09 Dec 2022
Deep Incubation: Training Large Models by Divide-and-Conquering
IEEE International Conference on Computer Vision (ICCV), 2022
Zanlin Ni
Yulin Wang
Jiangwei Yu
Haojun Jiang
Yu Cao
Gao Huang
VLM
239
13
0
08 Dec 2022
MixBoost: Improving the Robustness of Deep Neural Networks by Boosting Data Augmentation
Zhendong Liu
Wenyu Jiang
Min Guo
Chongjun Wang
AAML
253
1
0
08 Dec 2022
Lightweight Structure-Aware Attention for Visual Understanding
International Journal of Computer Vision (IJCV), 2022
Heeseung Kwon
F. M. Castro
M. Marín-Jiménez
N. Guil
Alahari Karteek
197
3
0
29 Nov 2022
Minimal Width for Universal Property of Deep RNN
Journal of Machine Learning Research (JMLR), 2022
Changhoon Song
Geonho Hwang
Jun ho Lee
Myung-joo Kang
178
13
0
25 Nov 2022
Fast-iTPN: Integrally Pre-Trained Transformer Pyramid Network with Token Migration
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Yunjie Tian
Lingxi Xie
Jihao Qiu
Jianbin Jiao
Yaowei Wang
Qi Tian
Qixiang Ye
ViT
196
19
0
23 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Qibin Hou
Cheng Lu
Ming-Ming Cheng
Jiashi Feng
ViT
230
214
0
22 Nov 2022
Vision Transformer with Super Token Sampling
Huaibo Huang
Xiaoqiang Zhou
Jie Cao
Xiao-Yu Zhang
Tieniu Tan
ViT
281
96
0
21 Nov 2022