CoAtNet: Marrying Convolution and Attention for All Data Sizes

9 June 2021
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
    ViT

Papers citing "CoAtNet: Marrying Convolution and Attention for All Data Sizes"

50 / 482 papers shown
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML
MoMe
28
51
0
27 Sep 2023
Multi-Dimensional Hyena for Spatial Inductive Bias
Itamar Zimerman
Lior Wolf
ViT
25
4
0
24 Sep 2023
Asca: less audio data is more insightful
Xiang Li
Jing Chen
Chao Li
Hongwu Lv
12
0
0
23 Sep 2023
Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models
Qiong Wu
Wei Yu
Yiyi Zhou
Shubin Huang
Xiaoshuai Sun
R. Ji
VLM
24
7
0
04 Sep 2023
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
Shiji Song
Li Erran Li
Gao Huang
ViT
19
24
0
04 Sep 2023
ExMobileViT: Lightweight Classifier Extension for Mobile Vision Transformer
Gyeongdong Yang
Yungwook Kwon
Hyunjin Kim
ViT
16
1
0
04 Sep 2023
Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction
Gehui Li
Jinyuan Liu
Long Ma
Zhiying Jiang
Xin-Yue Fan
Risheng Liu
18
6
0
02 Sep 2023
Computation-efficient Deep Learning for Computer Vision: A Survey
Yulin Wang
Yizeng Han
Chaofei Wang
Shiji Song
Qi Tian
Gao Huang
VLM
26
20
0
27 Aug 2023
Semi-Supervised Semantic Segmentation via Marginal Contextual Information
Moshe Kimhi
Shai Kimhi
Evgenii Zheltonozhskii
Or Litany
Chaim Baskin
24
10
0
26 Aug 2023
How Much Temporal Long-Term Context is Needed for Action Segmentation?
Emad Bahrami Rad
Gianpiero Francesca
Juergen Gall
ViT
11
24
0
22 Aug 2023
Global Features are All You Need for Image Retrieval and Reranking
Shihao Shao
Kai-Hung Chen
Arjun Karpur
Qinghua Cui
A. Araújo
Bingyi Cao
10
38
0
14 Aug 2023
Seed Feature Maps-based CNN Models for LEO Satellite Remote Sensing Services
Zhichao Lu
Chuntao Ding
Shangguang Wang
Ran Cheng
Felix Juefei Xu
Vishnu Naresh Boddeti
VLM
15
4
0
12 Aug 2023
Temporally-Adaptive Models for Efficient Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Yingya Zhang
Ziwei Liu
Marcelo H. Ang
28
9
0
10 Aug 2023
Spatial Gated Multi-Layer Perceptron for Land Use and Land Cover Mapping
Ali Jamali
S. K. Roy
Danfeng Hong
P. M. Atkinson
Pedram Ghamisi
17
11
0
09 Aug 2023
Distributionally Robust Classification on a Data Budget
Ben Feuer
Ameya Joshi
Minh Pham
C. Hegde
OOD
27
2
0
07 Aug 2023
Frequency Disentangled Features in Neural Image Compression
Ali Zafari
Atefeh Khoshkhahtinat
P. Mehta
Mohammad Saeed Ebrahimi Saadabadi
Mohammad Akyash
Nasser M. Nasrabadi
37
14
0
04 Aug 2023
A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards
Joshua Harrison
Ehsan Toreini
M. Mehrnezhad
AAML
13
18
0
02 Aug 2023
PVG: Progressive Vision Graph for Vision Recognition
Jiafu Wu
Jian Li
Jiangning Zhang
Boshen Zhang
M. Chi
Yabiao Wang
Chengjie Wang
ViT
15
12
0
01 Aug 2023
Sparse Double Descent in Vision Transformers: real or phantom threat?
Victor Quétu
Marta Milovanović
Enzo Tartaglione
16
2
0
26 Jul 2023
Adaptive Frequency Filters As Efficient Global Token Mixers
Zhipeng Huang
Zhizheng Zhang
Cuiling Lan
Zhengjun Zha
Yan Lu
B. Guo
25
36
0
26 Jul 2023
ARC-NLP at Multimodal Hate Speech Event Detection 2023: Multimodal Methods Boosted by Ensemble Learning, Syntactical and Entity Features
Umitcan Sahin
Izzet Emre Kucukkaya
Oguzhan Ozcelik
Cagri Toraman
22
10
0
25 Jul 2023
Robust face anti-spoofing framework with Convolutional Vision Transformer
Yunseung Lee
Youngjun Kwak
Jinho Shin
CVBM
ViT
15
5
0
24 Jul 2023
Bone mineral density estimation from a plain X-ray image by learning decomposition into projections of bone-segmented computed tomography
Yidong Gu
Yoshito Otake
Keisuke Uemura
Mazen Soufi
Masaki Takao
Hugues Talbot
S. Okada
Nobuhiko Sugano
Yoshinobu Sato
OOD
16
10
0
21 Jul 2023
Meta-Transformer: A Unified Framework for Multimodal Learning
Yiyuan Zhang
Kaixiong Gong
Kaipeng Zhang
Hongsheng Li
Yu Qiao
Wanli Ouyang
Xiangyu Yue
19
136
0
20 Jul 2023
RetouchingFFHQ: A Large-scale Dataset for Fine-grained Face Retouching Detection
Qichao Ying
Jiaxin Liu
Sheng Li
Haisheng Xu
Zhenxing Qian
Xinpeng Zhang
CVBM
22
7
0
20 Jul 2023
RepViT: Revisiting Mobile CNN From ViT Perspective
Ao Wang
Hui Chen
Zijia Lin
Hengjun Pu
Guiguang Ding
27
173
0
18 Jul 2023
Scale-Aware Modulation Meet Transformer
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
20
66
0
17 Jul 2023
Vision Language Transformers: A Survey
Clayton Fields
C. Kennington
VLM
15
5
0
06 Jul 2023
EgoCOL: Egocentric Camera pose estimation for Open-world 3D object Localization @Ego4D challenge 2023
Cristhian Forigua
María Escobar
Jordi Pont-Tuset
Kevis-Kokitsi Maninis
Pablo Arbelaez
EgoV
14
1
0
29 Jun 2023
Unfolding Framework with Prior of Convolution-Transformer Mixture and Uncertainty Estimation for Video Snapshot Compressive Imaging
Siming Zheng
Xin Yuan
ViT
MedIm
8
5
0
20 Jun 2023
Reviving Shift Equivariance in Vision Transformers
Peijian Ding
Davit Soselia
Thomas Armstrong
Jiahao Su
Furong Huang
17
6
0
13 Jun 2023
2-D SSM: A General Spatial Layer for Visual Transformers
Ethan Baron
Itamar Zimerman
Lior Wolf
23
14
0
11 Jun 2023
FasterViT: Fast Vision Transformers with Hierarchical Attention
Ali Hatamizadeh
Greg Heinrich
Hongxu Yin
Andrew Tao
J. Álvarez
Jan Kautz
Pavlo Molchanov
ViT
17
66
0
09 Jun 2023
Multi-Architecture Multi-Expert Diffusion Models
Yunsung Lee
Jin-Young Kim
Hyojun Go
Myeongho Jeong
Shinhyeok Oh
Seungtaek Choi
DiffM
26
29
0
08 Jun 2023
DeltaNN: Assessing the Impact of Computational Environment Parameters on the Performance of Image Recognition Models
Nikolaos Louloudakis
Perry Gibson
José Cano
A. Rajan
9
8
0
05 Jun 2023
GAT-GAN : A Graph-Attention-based Time-Series Generative Adversarial Network
Srikrishna Iyer
Teck-Hou Teng
AI4TS
11
1
0
03 Jun 2023
Brainformers: Trading Simplicity for Efficiency
Yan-Quan Zhou
Nan Du
Yanping Huang
Daiyi Peng
Chang Lan
...
Zhifeng Chen
Quoc V. Le
Claire Cui
J.H.J. Laundon
J. Dean
MoE
8
23
0
29 May 2023
Manifold Regularization for Memory-Efficient Training of Deep Neural Networks
Shadi Sartipi
Edgar A. Bernal
22
0
0
26 May 2023
BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks
Kai Zhang
Jun Yu
Eashan Adhikarla
Rong-Er Zhou
Zhilin Yan
...
Xun Chen
Yong Chen
Quanzheng Li
Hongfang Liu
Lichao Sun
LM&MA
MedIm
24
154
0
26 May 2023
Improving Position Encoding of Transformers for Multivariate Time Series Classification
Navid Mohammadi Foumani
Chang Wei Tan
Geoffrey I. Webb
Mahsa Salehi
AI4TS
22
72
0
26 May 2023
ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers
J. Yao
Xinggang Wang
Shusheng Yang
Baoyuan Wang
ViT
27
57
0
24 May 2023
Dual Path Transformer with Partition Attention
Zhengkai Jiang
Liang Liu
Jiangning Zhang
Yabiao Wang
Mingang Chen
Chengjie Wang
ViT
34
2
0
24 May 2023
Evolution: A Unified Formula for Feature Operators from a High-level Perspective
Zhicheng Cai
11
0
0
23 May 2023
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design
Ibrahim M. Alabdulmohsin
Xiaohua Zhai
Alexander Kolesnikov
Lucas Beyer
VLM
27
57
0
22 May 2023
Low-Earth Satellite Orbit Determination Using Deep Convolutional Networks with Satellite Imagery
Rohit Khorana
14
1
0
20 May 2023
GELU Activation Function in Deep Learning: A Comprehensive Mathematical Analysis and Performance
Minhyeok Lee
13
29
0
20 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
16
114
0
18 May 2023
Mimetic Initialization of Self-Attention Layers
Asher Trockman
J. Zico Kolter
28
30
0
16 May 2023
MaxViT-UNet: Multi-Axis Attention for Medical Image Segmentation
Abdul Rehman Khan
Asifullah Khan
ViT
MedIm
34
14
0
15 May 2023
OneCAD: One Classifier for All image Datasets using multimodal learning
S. Wadekar
Eugenio Culurciello
32
0
0
11 May 2023