CoAtNet: Marrying Convolution and Attention for All Data Sizes
Neural Information Processing Systems (NeurIPS), 2021
9 June 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
    ViT

Papers citing "CoAtNet: Marrying Convolution and Attention for All Data Sizes"

50 / 510 papers shown
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
International Conference on Machine Learning (ICML), 2024
Lianghui Zhu
Bencheng Liao
Qian Zhang
Xinlong Wang
Wenyu Liu
Xinggang Wang
Mamba
478
1,354
0
17 Jan 2024
SPFormer: Enhancing Vision Transformer with Superpixel Representation
Jieru Mei
Liang-Chieh Chen
Yaoyao Liu
Cihang Xie
ViT, MDE
276
13
0
05 Jan 2024
A Cost-Efficient FPGA Implementation of Tiny Transformer Model using Neural ODE
Ikumi Okubo
Keisuke Sugiura
Hiroki Matsutani
240
2
0
05 Jan 2024
A Two-stream Hybrid CNN-Transformer Network for Skeleton-based Human Interaction Recognition
Chinese Conference on Pattern Recognition and Computer Vision (CPRCV), 2023
Ruoqi Yin
Jianqin Yin
ViT
195
8
0
31 Dec 2023
Heterogeneous Encoders Scaling In The Transformer For Neural Machine Translation
J. Hu
Roberto Cavicchioli
Giulia Berardinelli
Alessandro Capotondi
196
3
0
26 Dec 2023
Transformer-Based Multi-Object Smoothing with Decoupled Data Association and Smoothing
Juliano Pinto
Georg Hess
Yuxuan Xia
H. Wymeersch
Lennart Svensson
VOT
202
6
0
22 Dec 2023
Delving Deeper Into Astromorphic Transformers
Md. Zesun Ahmed Mia
Malyaban Bal
Abhronil Sengupta
493
2
0
18 Dec 2023
ADF & TransApp: A Transformer-Based Framework for Appliance Detection Using Smart Meter Consumption Series
Adrien Petralia
Philippe Charpentier
Themis Palpanas
AI4TS
227
6
0
17 Dec 2023
Factorization Vision Transformer: Modeling Long Range Dependency with Local Window Cost
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Haolin Qin
Daquan Zhou
Tingfa Xu
Ziyang Bian
Jianan Li
218
14
0
14 Dec 2023
A Novel Image Classification Framework Based on Variational Quantum Algorithms
Quantum Information Processing (QIP), 2023
Yixiong Chen
259
10
0
13 Dec 2023
MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation
Abdullah Rashwan
Jiageng Zhang
A. Taalimi
Fan Yang
Xingyi Zhou
Chaochao Yan
Liang-Chieh Chen
Yeqing Li
ViT
318
8
0
11 Dec 2023
Activating Frequency and ViT for 3D Point Cloud Quality Assessment without Reference
Oussama Messai
Abdelouahid Bentamou
Abbass Zein-Eddine
Yann Gavet
3DPC
171
5
0
10 Dec 2023
Rejuvenating image-GPT as Strong Visual Representation Learners
International Conference on Machine Learning (ICML), 2023
Sucheng Ren
Zeyu Wang
Hongru Zhu
Junfei Xiao
Yaoyao Liu
Cihang Xie
VLM
283
11
0
04 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
798
1
0
01 Dec 2023
Cell Maps Representation For Lung Adenocarcinoma Growth Patterns Classification In Whole Slide Images
IEEE International Symposium on Biomedical Imaging (ISBI), 2023
Arwa Al-Rubaian
G. N. Gunesli
W. Althakfi
A. Azam
Nasir M. Rajpoot
S. Raza
MedIm
182
1
0
27 Nov 2023
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
Computer Vision and Pattern Recognition (CVPR), 2023
Xiaohan Ding
Yiyuan Zhang
Yixiao Ge
Sijie Zhao
Lin Song
Xiangyu Yue
Ying Shan
VLM, AI4TS, SSL
258
224
0
27 Nov 2023
Deep Tensor Network
Yifan Zhang
367
0
0
18 Nov 2023
SynthEnsemble: A Fusion of CNN, Vision Transformer, and Hybrid Models for Multi-Label Chest X-Ray Classification
S. M. N. Ashraf
Md. Adyelullahil Mamun
Hasnat Md. Abdullah
Rabiul Alam
ViT, MedIm
260
15
0
13 Nov 2023
Dual input stream transformer for vertical drift correction in eye-tracking reading data
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Thomas M. Mercier
Marcin Budka
Martin R. Vasilev
Julie A. Kirkby
Bernhard Angele
T. Slattery
195
5
0
10 Nov 2023
Vision Big Bird: Random Sparsification for Full Attention
Zhemin Zhang
Xun Gong
ViT
163
1
0
10 Nov 2023
GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Xuwei Xu
Sen Wang
Yudong Chen
Yanping Zheng
Zhewei Wei
Jiajun Liu
ViT
302
20
0
06 Nov 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
549
88
0
30 Oct 2023
Gramian Attention Heads are Strong yet Efficient Vision Learners
IEEE International Conference on Computer Vision (ICCV), 2023
Jongbin Ryu
Dongyoon Han
J. Lim
223
3
0
25 Oct 2023
Handling Data Heterogeneity via Architectural Design for Federated Visual Recognition
Neural Information Processing Systems (NeurIPS), 2023
Sara Pieri
Jose Renato Restom
Samuel Horvath
Hisham Cholakkal
FedML
161
9
0
23 Oct 2023
A Car Model Identification System for Streamlining the Automobile Sales Process
Said Togru
Marco Moldovan
207
0
0
19 Oct 2023
Distilling Efficient Vision Transformers from CNNs for Semantic Segmentation
Pattern Recognition (Pattern Recogn.), 2023
Xueye Zheng
Yunhao Luo
Pengyuan Zhou
Lin Wang
218
29
0
11 Oct 2023
No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling
Applied Informatics (AI), 2023
Xuwei Xu
Changlin Li
Yudong Chen
Xiaojun Chang
Jiajun Liu
Sen Wang
ViT
228
9
0
09 Oct 2023
Plug n' Play: Channel Shuffle Module for Enhancing Tiny Vision Transformers
International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2023
Xuwei Xu
Sen Wang
Yudong Chen
Jiajun Liu
ViT
242
1
0
09 Oct 2023
Entropic Score metric: Decoupling Topology and Size in Training-free NAS
Niccolò Cavagnero
Luc Robbiano
Francesca Pistilli
Barbara Caputo
Giuseppe Averta
181
3
0
06 Oct 2023
GET: Group Event Transformer for Event-Based Vision
IEEE International Conference on Computer Vision (ICCV), 2023
Yansong Peng
Yueyi Zhang
Zhiwei Xiong
Xiaoyan Sun
Feng Wu
195
73
0
04 Oct 2023
Algebras of actions in an agent's representations of the world
Artificial Intelligence (AIJ), 2023
Alexander Dean
Eduardo Alonso
Esther Mondragón
291
0
0
02 Oct 2023
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML, MoMe
296
87
0
27 Sep 2023
APIS: A paired CT-MRI dataset for ischemic stroke segmentation challenge
Scientific Reports (Sci Rep), 2023
Santiago Gómez
Daniela S. Mantilla
G. Garzón
Edgar Rangel
Andres Ortiz
Franklin Sierra-Jerez
Fabio Martínez
125
13
0
26 Sep 2023
Multi-Dimensional Hyena for Spatial Inductive Bias
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Itamar Zimerman
Lior Wolf
ViT
245
5
0
24 Sep 2023
Asca: less audio data is more insightful
Xiang Li
Jing Chen
Chao Li
Hongwu Lv
113
0
0
23 Sep 2023
Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models
Neural Information Processing Systems (NeurIPS), 2023
Qiong Wu
Wei Yu
Weihao Ye
Shubin Huang
Xiaoshuai Sun
Rongrong Ji
VLM
259
10
0
04 Sep 2023
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
Shiji Song
Li Erran Li
Gao Huang
ViT
250
41
0
04 Sep 2023
ExMobileViT: Lightweight Classifier Extension for Mobile Vision Transformer
Gyeongdong Yang
Yungwook Kwon
Hyunjin Kim
ViT
95
2
0
04 Sep 2023
Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction
ACM Multimedia (ACM MM), 2023
Gehui Li
Jinyuan Liu
Long Ma
Zhiying Jiang
Xin-Yue Fan
Risheng Liu
281
12
0
02 Sep 2023
Computation-efficient Deep Learning for Computer Vision: A Survey
Yulin Wang
Yizeng Han
Chaofei Wang
Shiji Song
Qi Tian
Gao Huang
VLM
303
32
0
27 Aug 2023
Semi-Supervised Semantic Segmentation via Marginal Contextual Information
Moshe Kimhi
Shai Kimhi
Evgenii Zheltonozhskii
Or Litany
Chaim Baskin
304
18
0
26 Aug 2023
How Much Temporal Long-Term Context is Needed for Action Segmentation?
IEEE International Conference on Computer Vision (ICCV), 2023
Emad Bahrami Rad
Gianpiero Francesca
Juergen Gall
ViT
235
45
0
22 Aug 2023
Global Features are All You Need for Image Retrieval and Reranking
IEEE International Conference on Computer Vision (ICCV), 2023
Shihao Shao
Kai-Hung Chen
Arjun Karpur
Qinghua Cui
A. Araújo
Bingyi Cao
204
61
0
14 Aug 2023
Seed Feature Maps-based CNN Models for LEO Satellite Remote Sensing Services
Zhichao Lu
Chuntao Ding
Shangguang Wang
Ran Cheng
Felix Juefei Xu
Vishnu Boddeti
VLM
118
7
0
12 Aug 2023
Temporally-Adaptive Models for Efficient Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Yingya Zhang
Ziwei Liu
Marcelo H. Ang
205
16
0
10 Aug 2023
Spatial Gated Multi-Layer Perceptron for Land Use and Land Cover Mapping
IEEE Geoscience and Remote Sensing Letters (GRSL), 2023
Ali Jamali
Swalpa Kumar Roy
Danfeng Hong
P. M. Atkinson
Pedram Ghamisi
110
15
0
09 Aug 2023
Distributionally Robust Classification on a Data Budget
Ben Feuer
Ameya Joshi
Minh Pham
Chinmay Hegde
OOD
249
2
0
07 Aug 2023
Frequency Disentangled Features in Neural Image Compression
International Conference on Information Photonics (ICIP), 2023
Ali Zafari
Atefeh Khoshkhahtinat
P. Mehta
Mohammad Saeed Ebrahimi Saadabadi
Mohammad Akyash
Nasser M. Nasrabadi
223
17
0
04 Aug 2023
A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards
Joshua Harrison
Ehsan Toreini
M. Mehrnezhad
AAML
140
25
0
02 Aug 2023
PVG: Progressive Vision Graph for Vision Recognition
ACM Multimedia (ACM MM), 2023
Jiafu Wu
Jian Li
Jiangning Zhang
Boshen Zhang
M. Chi
Yabiao Wang
Chengjie Wang
ViT
319
20
0
01 Aug 2023