CoAtNet: Marrying Convolution and Attention for All Data Sizes

Neural Information Processing Systems (NeurIPS), 2021

9 June 2021

Mingxing Tan

Papers citing "CoAtNet: Marrying Convolution and Attention for All Data Sizes"

50 / 510 papers shown

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

International Conference on Machine Learning (ICML), 2024

485

1,378

17 Jan 2024

SPFormer: Enhancing Vision Transformer with Superpixel Representation

Jieru Mei

Liang-Chieh Chen

Yaoyao Liu

Cihang Xie

ViT MDE

286

05 Jan 2024

A Cost-Efficient FPGA Implementation of Tiny Transformer Model using Neural ODE

Ikumi Okubo

Keisuke Sugiura

Hiroki Matsutani

240

05 Jan 2024

A Two-stream Hybrid CNN-Transformer Network for Skeleton-based Human Interaction Recognition

Chinese Conference on Pattern Recognition and Computer Vision (CPRCV), 2023

Ruoqi Yin

Jianqin Yin

ViT

196

31 Dec 2023

Heterogeneous Encoders Scaling In The Transformer For Neural Machine Translation

197

26 Dec 2023

Transformer-Based Multi-Object Smoothing with Decoupled Data Association and Smoothing

Lennart Svensson

203

22 Dec 2023

Delving Deeper Into Astromorphic Transformers

Md. Zesun Ahmed Mia

Malyaban Bal

Abhronil Sengupta

494

18 Dec 2023

ADF & TransApp: A Transformer-Based Framework for Appliance Detection Using Smart Meter Consumption Series

237

17 Dec 2023

Factorization Vision Transformer: Modeling Long Range Dependency with Local Window Cost

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023

221

14 Dec 2023

A Novel Image Classification Framework Based on Variational Quantum Algorithms

Quantum Information Processing (QIP), 2023

Yixiong Chen

261

13 Dec 2023

MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation

321

11 Dec 2023

Activating Frequency and ViT for 3D Point Cloud Quality Assessment without Reference

172

10 Dec 2023

Rejuvenating image-GPT as Strong Visual Representation Learners

International Conference on Machine Learning (ICML), 2023

Cihang Xie

283

04 Dec 2023

SCHEME: Scalable Channel Mixer for Vision Transformers

Deepak Sridhar

Yunsheng Li

Nuno Vasconcelos

808

01 Dec 2023

Cell Maps Representation For Lung Adenocarcinoma Growth Patterns Classification In Whole Slide Images

IEEE International Symposium on Biomedical Imaging (ISBI), 2023

182

27 Nov 2023

UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

Computer Vision and Pattern Recognition (CVPR), 2023

Sijie Zhao

Ying Shan

262

229

27 Nov 2023

Deep Tensor Network

Yifan Zhang

375

18 Nov 2023

SynthEnsemble: A Fusion of CNN, Vision Transformer, and Hybrid Models for Multi-Label Chest X-Ray Classification

S. M. N. Ashraf

Md. Adyelullahil Mamun

Hasnat Md. Abdullah

Rabiul Alam

ViT MedIm

263

13 Nov 2023

Dual input stream transformer for vertical drift correction in eye-tracking reading data

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

199

10 Nov 2023

Vision Big Bird: Random Sparsification for Full Attention

Zhemin Zhang

Xun Gong

ViT

163

10 Nov 2023

GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

307

06 Nov 2023

TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023

Chuan Wu

Yizhou Yu

ViT

552

30 Oct 2023

Gramian Attention Heads are Strong yet Efficient Vision Learners

IEEE International Conference on Computer Vision (ICCV), 2023

Jongbin Ryu

Dongyoon Han

J. Lim

229

25 Oct 2023

Handling Data Heterogeneity via Architectural Design for Federated Visual Recognition

Neural Information Processing Systems (NeurIPS), 2023

Hisham Cholakkal

164

23 Oct 2023

A Car Model Identification System for Streamlining the Automobile Sales Process

Said Togru

Marco Moldovan

207

19 Oct 2023

Distilling Efficient Vision Transformers from CNNs for Semantic Segmentation

Pattern Recognition (Pattern Recogn.), 2023

Xueye Zheng

Yunhao Luo

Pengyuan Zhou

Lin Wang

220

11 Oct 2023

No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling

Applied Informatics (AI), 2023

Xiaojun Chang

229

09 Oct 2023

Plug n' Play: Channel Shuffle Module for Enhancing Tiny Vision Transformers

International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2023

244

09 Oct 2023

Entropic Score metric: Decoupling Topology and Size in Training-free NAS

181

06 Oct 2023

GET: Group Event Transformer for Event-Based Vision

IEEE International Conference on Computer Vision (ICCV), 2023

Yueyi Zhang

195

04 Oct 2023

Algebras of actions in an agent's representations of the world

Artificial Intelligence (AIJ), 2023

Alexander Dean

Eduardo Alonso

Esther Mondragón

291

02 Oct 2023

Deep Model Fusion: A Survey

Liang Ding

Li Shen

302

27 Sep 2023

APIS: A paired CT-MRI dataset for ischemic stroke segmentation challenge

Scientific Reports (Sci Rep), 2023

Santiago Gómez

Daniela S. Mantilla

G. Garzón

Edgar Rangel

Andres Ortiz

Franklin Sierra-Jerez

Fabio Martínez

126

26 Sep 2023

Multi-Dimensional Hyena for Spatial Inductive Bias

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

Itamar Zimerman

Lior Wolf

ViT

250

24 Sep 2023

Asca: less audio data is more insightful

Xiang Li

Jing Chen

Chao Li

Hongwu Lv

114

23 Sep 2023

Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models

Neural Information Processing Systems (NeurIPS), 2023

Qiong Wu

259

04 Sep 2023

DAT++: Spatially Dynamic Vision Transformer with Deformable Attention

Gao Huang

250

04 Sep 2023

ExMobileViT: Lightweight Classifier Extension for Mobile Vision Transformer

04 Sep 2023

Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction

ACM Multimedia (ACM MM), 2023

282

02 Sep 2023

Computation-efficient Deep Learning for Computer Vision: A Survey

Yulin Wang

Gao Huang

303

27 Aug 2023

Semi-Supervised Semantic Segmentation via Marginal Contextual Information

Moshe Kimhi

Shai Kimhi

Evgenii Zheltonozhskii

Or Litany

Chaim Baskin

304

26 Aug 2023

How Much Temporal Long-Term Context is Needed for Action Segmentation?

IEEE International Conference on Computer Vision (ICCV), 2023

Emad Bahrami Rad

Gianpiero Francesca

Juergen Gall

ViT

240

22 Aug 2023

Global Features are All You Need for Image Retrieval and Reranking

IEEE International Conference on Computer Vision (ICCV), 2023

205

14 Aug 2023

Seed Feature Maps-based CNN Models for LEO Satellite Remote Sensing Services

Ran Cheng

118

12 Aug 2023

Temporally-Adaptive Models for Efficient Video Understanding

Ziwei Liu

205

10 Aug 2023

Spatial Gated Multi-Layer Perceptron for Land Use and Land Cover Mapping

IEEE Geoscience and Remote Sensing Letters (GRSL), 2023

110

09 Aug 2023

Distributionally Robust Classification on a Data Budget

249

07 Aug 2023

Frequency Disentangled Features in Neural Image Compression

International Conference on Information Photonics (ICIP), 2023

Ali Zafari

Atefeh Khoshkhahtinat

P. Mehta

Mohammad Saeed Ebrahimi Saadabadi

Mohammad Akyash

Nasser M. Nasrabadi

226

04 Aug 2023

A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards

143

02 Aug 2023

PVG: Progressive Vision Graph for Vision Recognition

ACM Multimedia (ACM MM), 2023

Jiangning Zhang

Yabiao Wang

Chengjie Wang

ViT

320

01 Aug 2023