CoAtNet: Marrying Convolution and Attention for All Data Sizes

Neural Information Processing Systems (NeurIPS), 2021

9 June 2021

Mingxing Tan

Papers citing "CoAtNet: Marrying Convolution and Attention for All Data Sizes"

50 / 510 papers shown

You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model

Computer Vision and Pattern Recognition (CVPR), 2022

Yaqing Wang

Caiwen Ding

Dongkuan Xu

221

21 Nov 2022

Vision Transformers in Medical Imaging: A Review

258

18 Nov 2022

Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information

Computer Vision and Pattern Recognition (CVPR), 2022

Weijie Su

Gao Huang

Yu Qiao

Xiaogang Wang

Jie Zhou

Jifeng Dai

245

17 Nov 2022

AligNeRF: High-Fidelity Neural Radiance Fields via Alignment-Aware Training

Computer Vision and Pattern Recognition (CVPR), 2022

215

17 Nov 2022

EVA: Exploring the Limits of Masked Visual Representation Learning at Scale

Computer Vision and Pattern Recognition (CVPR), 2022

621

907

14 Nov 2022

ParCNetV2: Oversized Kernel with Enhanced Attention

IEEE International Conference on Computer Vision (ICCV), 2022

279

14 Nov 2022

BiViT: Extremely Compressed Binary Vision Transformer

IEEE International Conference on Computer Vision (ICCV), 2022

Bohan Zhuang

266

14 Nov 2022

A Comprehensive Survey of Transformers for Computer Vision

158

11 Nov 2022

InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

Computer Vision and Pattern Recognition (CVPR), 2022

...

Yu Qiao

556

971

10 Nov 2022

Demystify Transformers & Convolutions in Modern Image Deep Networks

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

...

289

10 Nov 2022

MogaNet: Multi-order Gated Aggregation Network

International Conference on Learning Representations (ICLR), 2022

285

125

07 Nov 2022

SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers

ACM Multimedia Asia (MA), 2022

Alessandro Arezzo

Stefano Berretti

ViT

122

04 Nov 2022

Boosting Binary Neural Networks via Dynamic Thresholds Learning

Xueyang Zhang

257

04 Nov 2022

Exploring Effects of Computational Parameter Changes to Image Recognition Systems

221

01 Nov 2022

Accelerating Certified Robustness Training via Knowledge Transfer

Neural Information Processing Systems (NeurIPS), 2022

Pratik Vaishnavi

Kevin Eykholt

Amir Rahmati

202

25 Oct 2022

The Curious Case of Benign Memorization

International Conference on Learning Representations (ICLR), 2022

360

25 Oct 2022

DialogConv: A Lightweight Fully Convolutional Network for Multi-view Response Selection

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

149

25 Oct 2022

Synthetic Data Supervised Salient Object Detection

ACM Multimedia (ACM MM), 2022

Chenglizhao Chen

179

25 Oct 2022

MetaFormer Baselines for Vision

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Weihao Yu

248

277

24 Oct 2022

Drastically Reducing the Number of Trainable Parameters in Deep CNNs by Inter-layer Kernel-sharing

Alireza Azadbakht

Saeed Reza Kheradpisheh

Ismail Khalfaoui-Hassani

T. Masquelier

162

23 Oct 2022

Similarity of Neural Architectures using Adversarial Attack Transferability

European Conference on Computer Vision (ECCV), 2022

546

20 Oct 2022

A Survey of Computer Vision Technologies In Urban and Controlled-environment Agriculture

ACM Computing Surveys (ACM CSUR), 2022

Jiayun Luo

Boyang Albert Li

Cyril Leung

377

20 Oct 2022

Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning

Neural Information Processing Systems (NeurIPS), 2022

359

335

17 Oct 2022

SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds

European Conference on Computer Vision (ECCV), 2022

Mingxing Tan

255

155

13 Oct 2022

Vision Transformers provably learn spatial structure

Neural Information Processing Systems (NeurIPS), 2022

226

102

13 Oct 2022

Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities

Journal of Machine Learning Research (JMLR), 2022

Brian Bartoldson

B. Kailkhura

Davis W. Blalock

317

13 Oct 2022

Fast-ParC: Capturing Position Aware Global Feature for ConvNets and ViTs

226

08 Oct 2022

Visualize Before You Write: Imagination-Guided Open-Ended Text Generation

Findings (Findings), 2022

324

07 Oct 2022

The Lie Derivative for Measuring Learned Equivariance

International Conference on Learning Representations (ICLR), 2022

296

06 Oct 2022

MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models

International Conference on Learning Representations (ICLR), 2022

Siyuan Qiao

325

04 Oct 2022

Towards Flexible Inductive Bias via Progressive Reparameterization Scheduling

139

04 Oct 2022

Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Xiao Luo

259

03 Oct 2022

Attention Distillation: self-supervised vision transformer students need more guidance

British Machine Vision Conference (BMVC), 2022

162

03 Oct 2022

An In-depth Study of Stochastic Backpropagation

Neural Information Processing Systems (NeurIPS), 2022

Hao Chen

163

30 Sep 2022

E-Branchformer: Branchformer with Enhanced merging for speech recognition

Spoken Language Technology Workshop (SLT), 2022

Kwangyoun Kim

408

160

30 Sep 2022

MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input Features

S. Wadekar

Abhishek Chaurasia

ViT

313

143

30 Sep 2022

Exploring the Relationship between Architecture and Adversarially Robust Generalization

Computer Vision and Pattern Recognition (CVPR), 2022

Xianglong Liu

232

28 Sep 2022

Attention is All They Need: Exploring the Media Archaeology of the Computer Vision Research Paper

268

22 Sep 2022

Mega: Moving Average Equipped Gated Attention

International Conference on Learning Representations (ICLR), 2022

Graham Neubig

Luke Zettlemoyer

339

219

21 Sep 2022

Axially Expanded Windows for Local-Global Interaction in Vision Transformers

Zhemin Zhang

Xun Gong

ViT

146

19 Sep 2022

VINet: Visual and Inertial-based Terrain Classification and Adaptive Navigation over Unknown Terrain

IEEE International Conference on Robotics and Automation (ICRA), 2022

204

16 Sep 2022

Neural Networks Reduction via Lumping

International Conference of the Italian Association for Artificial Intelligence (AIxIA), 2022

226

15 Sep 2022

Joint Debiased Representation and Image Clustering Learning with Self-Supervision

147

14 Sep 2022

Revisiting Neural Scaling Laws in Language and Vision

Neural Information Processing Systems (NeurIPS), 2022

Ibrahim Alabdulmohsin

Behnam Neyshabur

Xiaohua Zhai

498

144

13 Sep 2022

Socially Enhanced Situation Awareness from Microblogs using Artificial Intelligence: A Survey

ACM Computing Surveys (ACM CSUR), 2022

Rabindra Lamsal

Aaron Harwood

M. Read

271

13 Sep 2022

Communication-Efficient and Privacy-Preserving Feature-based Federated Transfer Learning

Global Communications Conference (GLOBECOM), 2022

Feng Wang

M. C. Gursoy

Senem Velipasalar

262

12 Sep 2022

Statistical Foundation Behind Machine Learning and Its Impact on Computer Vision

Lei Zhang

H. Shum

VLM SSL

144

06 Sep 2022

AutoPET Challenge: Combining nn-Unet with Swin UNETR Augmented by Maximum Intensity Projection Classifier

...

103

02 Sep 2022

MAFormer: A Transformer Network with Multi-scale Attention Fusion for Visual Recognition

Neurocomputing (Neurocomputing), 2022

Errui Ding

162

31 Aug 2022

MRL: Learning to Mix with Attention and Convolutions

Shlok Mohta

Hisahiro Suganuma

Yoshiki Tanaka

236

30 Aug 2022