ResearchTrend.AI
CoAtNet: Marrying Convolution and Attention for All Data Sizes

9 June 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
    ViT

Papers citing "CoAtNet: Marrying Convolution and Attention for All Data Sizes"

50 / 482 papers shown
Dreaming is All You Need
Mingze Ni
Wei Liu
33
0
0
03 Sep 2024
A Preliminary Exploration Towards General Image Restoration
Xiangtao Kong
Jinjin Gu
Yihao Liu
Wenlong Zhang
Xiangyu Chen
Yu Qiao
Chao Dong
DiffM
38
2
0
27 Aug 2024
LoG-VMamba: Local-Global Vision Mamba for Medical Image Segmentation
Trung Dang
Huy Hoang Nguyen
A. Tiulpin
Mamba
27
3
0
26 Aug 2024
Accuracy Improvement of Cell Image Segmentation Using Feedback Former
Hinako Mitsuoka
Kazuhiro Hotta
ViT
MedIm
26
0
0
23 Aug 2024
Sapiens: Foundation for Human Vision Models
Rawal Khirodkar
Timur M. Bagautdinov
Julieta Martinez
Su Zhaoen
Austin James
Peter Selednik
Stuart Anderson
Shunsuke Saito
VLM
36
63
0
22 Aug 2024
HcNet: Image Modeling with Heat Conduction Equation
Zhemin Zhang
Xun Gong
DiffM
3DV
35
0
0
12 Aug 2024
Enhancing 3D Transformer Segmentation Model for Medical Image with Token-level Representation Learning
Xinrong Hu
Dewen Zeng
Yawen Wu
Xueyang Li
Yiyu Shi
ViT
MedIm
39
0
0
12 Aug 2024
DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention
Wei Wang
Jixing He
Xin Wang
32
0
0
30 Jul 2024
Lite-SAM Is Actually What You Need for Segment Everything
Jianhai Fu
Yuanjie Yu
Ningchuan Li
Yi Zhang
Qichao Chen
Jianping Xiong
Jun Yin
Zhiyu Xiang
VLM
34
4
0
12 Jul 2024
iiANET: Inception Inspired Attention Hybrid Network for efficient Long-Range Dependency
Haruna Yunusa
Qin Shiyin
Abdulrahman Hamman Adama Chukkol
Isah Bello
A. Lawan
39
4
0
10 Jul 2024
HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification
Omar S. El-Assiouti
Ghada Hamed
Dina Khattab
H. M. Ebied
27
1
0
10 Jul 2024
Exploring Camera Encoder Designs for Autonomous Driving Perception
Barath Lakshmanan
Joshua Chen
Shiyi Lan
Maying Shen
Zhiding Yu
Jose M. Alvarez
37
0
0
09 Jul 2024
CTRL-F: Pairing Convolution with Transformer for Image Classification via Multi-Level Feature Cross-Attention and Representation Learning Fusion
Hosam S. El-Assiouti
Hadeer El-Saadawy
M. Al-Berry
M. Tolba
ViT
47
0
0
09 Jul 2024
RepNeXt: A Fast Multi-Scale CNN using Structural Reparameterization
Mingshu Zhao
Yi Luo
Yong Ouyang
32
2
0
23 Jun 2024
Semantic Graph Consistency: Going Beyond Patches for Regularizing Self-Supervised Vision Transformers
Chaitanya Devaguptapu
Sumukh K. Aithal
Shrinivas Ramasubramanian
Moyuru Yamada
Manohar Kaul
ViT
29
0
0
18 Jun 2024
Multi-Dimensional Pruning: Joint Channel, Layer and Block Pruning with Latency Constraint
Xinglong Sun
Barath Lakshmanan
Maying Shen
Shiyi Lan
Jingde Chen
Jose Alvarez
VLM
36
3
0
17 Jun 2024
Enhancing Domain Adaptation through Prompt Gradient Alignment
Hoang Phan
Lam C. Tran
Quyen Tran
Trung Le
52
0
0
13 Jun 2024
AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer
Yitao Xu
Tong Zhang
Sabine Süsstrunk
ViT
36
0
0
12 Jun 2024
Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection
Wenxiao Wang
Weiming Zhuang
Lingjuan Lyu
32
0
0
11 Jun 2024
ReduceFormer: Attention with Tensor Reduction by Summation
John Yang
Le An
Su Inn Park
26
0
0
11 Jun 2024
Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
Dongyoon Hwang
ByungKun Lee
Hojoon Lee
Hyunseung Kim
Jaegul Choo
42
0
0
10 Jun 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
46
6
0
06 Jun 2024
Convolutional Neural Networks and Vision Transformers for Fashion MNIST Classification: A Literature Review
Sonia Bbouzidi
Ghazala Hcini
Imen Jdey
Fadoua Drira
18
4
0
05 Jun 2024
GrootVL: Tree Topology is All You Need in State Space Model
Yicheng Xiao
Lin Song
Shaoli Huang
Jiangshan Wang
Siyu Song
Yixiao Ge
Xiu Li
Ying Shan
Mamba
36
10
0
04 Jun 2024
Image Captioning via Dynamic Path Customization
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Xiaopeng Hong
Yongjian Wu
Rongrong Ji
27
0
0
01 Jun 2024
Are queries and keys always relevant? A case study on Transformer wave functions
Riccardo Rende
Luciano Loris Viteritti
24
5
0
29 May 2024
ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
Bencheng Liao
Xinggang Wang
Lianghui Zhu
Qian Zhang
Chang Huang
45
4
0
28 May 2024
Building Vision Models upon Heat Conduction
Zhaozhi Wang
Yue Liu
Yunfan Liu
Hongtian Yu
Yaowei Wang
QiXiang Ye
ViT
VLM
50
0
0
26 May 2024
Smooth Pseudo-Labeling
Nikolaos Karaliolios
Hervé Le Borgne
Florian Chabot
34
0
0
23 May 2024
Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model
Yuheng Shi
Minjing Dong
Chang Xu
Mamba
35
32
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
67
41
0
23 May 2024
Infinite-Dimensional Feature Interaction
Chenhui Xu
Fuxun Yu
Maoliang Li
Zihao Zheng
Zirui Xu
Jinjun Xiong
Xiang Chen
34
1
0
22 May 2024
OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models
Zhaojian Yu
Yinghao Wu
Zhuotao Deng
Yansong Tang
Xiao-Ping Zhang
44
2
0
21 May 2024
MVBIND: Self-Supervised Music Recommendation For Videos Via Embedding Space Binding
Jiajie Teng
Huiyu Duan
Yucheng Zhu
Sijing Wu
Guangtao Zhai
29
2
0
15 May 2024
Feature-based Federated Transfer Learning: Communication Efficiency, Robustness and Privacy
Feng Wang
M. C. Gursoy
Senem Velipasalar
24
0
0
15 May 2024
MambaOut: Do We Really Need Mamba for Vision?
Weihao Yu
Xinchao Wang
Mamba
39
47
0
13 May 2024
Information-driven Affordance Discovery for Efficient Robotic Manipulation
Pietro Mazzaglia
Taco Cohen
Daniel Dijkman
35
2
0
06 May 2024
UniGen: Unified Modeling of Initial Agent States and Trajectories for Generating Autonomous Driving Scenarios
R. Mahjourian
Rongbing Mu
Valerii Likhosherstov
Paul Mougin
Xiukun Huang
Joao Messias
Shimon Whiteson
24
7
0
06 May 2024
Fusing Depthwise and Pointwise Convolutions for Efficient Inference on GPUs
Fareed Qararyah
M. Azhar
Mohammad Ali Maleki
Pedro Trancoso
21
1
0
30 Apr 2024
SmartMem: Layout Transformation Elimination and Adaptation for Efficient DNN Execution on Mobile
Wei Niu
Md. Musfiqur Rahman Sanim
Zhihao Shu
Jiexiong Guan
Xipeng Shen
Miao Yin
Gagan Agrawal
Bin Ren
30
6
0
21 Apr 2024
Using Few-Shot Learning to Classify Primary Lung Cancer and Other Malignancy with Lung Metastasis in Cytological Imaging via Endobronchial Ultrasound Procedures
Ching-Kai Lin
Di-Chun Wei
Yun-Chien Cheng
27
0
0
09 Apr 2024
Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu
Marco Galindo
Hongxia Xie
Lai-Kuan Wong
Hong-Han Shuai
Yung-Hui Li
Wen-Huang Cheng
50
48
0
08 Apr 2024
Learning Correlation Structures for Vision Transformers
Manjin Kim
Paul Hongsuck Seo
Cordelia Schmid
Minsu Cho
ViT
24
7
0
05 Apr 2024
Semantic Augmentation in Images using Language
Sahiti Yerramilli
Jayant Sravan Tamarapalli
Tanmay Girish Kulkarni
Jonathan M Francis
Eric Nyberg
VLM
DiffM
24
0
0
02 Apr 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
Jieneng Chen
Qihang Yu
Xiaohui Shen
Alan L. Yuille
Liang-Chieh Chen
3DV
VLM
28
24
0
02 Apr 2024
Structured Initialization for Attention in Vision Transformers
Jianqiao Zheng
Xueqian Li
Simon Lucey
ViT
21
0
0
01 Apr 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
41
7
0
28 Mar 2024
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
Donghyun Kim
Byeongho Heo
Dongyoon Han
40
13
0
28 Mar 2024
Tiny Models are the Computational Saver for Large Models
Qingyuan Wang
B. Cardiff
Antoine Frappé
Benoît Larras
Deepu John
29
2
0
26 Mar 2024
Neural Clustering based Visual Representation Learning
Guikun Chen
Xia Li
Yi Yang
Wenguan Wang
SSL
27
8
0
26 Mar 2024