Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2210.01820
Cited By
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
4 October 2022
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models"
50 / 53 papers shown
Title
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
57
0
0
03 Apr 2025
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
28
0
0
12 Nov 2024
AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
Anil Kag
Huseyin Coskun
Jierun Chen
Junli Cao
Willi Menapace
Aliaksandr Siarohin
Sergey Tulyakov
Jian Ren
40
2
0
07 Nov 2024
DCT-HistoTransformer: Efficient Lightweight Vision Transformer with DCT Integration for histopathological image analysis
Mahtab Ranjbar
Mehdi Mohebbi
Mahdi Cherakhloo
Bijan Vosoughi. Vahdat
MedIm
18
0
0
24 Oct 2024
MoH: Multi-Head Attention as Mixture-of-Head Attention
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
29
13
0
15 Oct 2024
GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model
Yongjie Fu
Yunlong Li
Xuan Di
VGen
27
2
0
28 Aug 2024
AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs
Yunling Zheng
Zeyi Xu
Fanghui Xue
Biao Yang
Jiancheng Lyu
Shuai Zhang
Y. Qi
Jack Xin
33
0
0
16 Jul 2024
PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer
Pierre-David Létourneau
Manish Kumar Singh
Hsin-Pai Cheng
Shizhong Han
Yunxiao Shi
Dalton Jones
M. H. Langston
Hong Cai
Fatih Porikli
21
0
0
16 Jul 2024
HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification
Omar S. El-Assiouti
Ghada Hamed
Dina Khattab
H. M. Ebied
24
1
0
10 Jul 2024
Exploring Camera Encoder Designs for Autonomous Driving Perception
Barath Lakshmanan
Joshua Chen
Shiyi Lan
Maying Shen
Zhiding Yu
Jose M. Alvarez
31
0
0
09 Jul 2024
CTRL-F: Pairing Convolution with Transformer for Image Classification via Multi-Level Feature Cross-Attention and Representation Learning Fusion
Hosam S. El-Assiouti
Hadeer El-Saadawy
M. Al-Berry
M. Tolba
ViT
39
0
0
09 Jul 2024
Multi-Dimensional Pruning: Joint Channel, Layer and Block Pruning with Latency Constraint
Xinglong Sun
Barath Lakshmanan
Maying Shen
Shiyi Lan
Jingde Chen
Jose Alvarez
VLM
25
3
0
17 Jun 2024
Fusion of regional and sparse attention in Vision Transformers
Nabil Ibtehaz
Ning Yan
Masood S. Mortazavi
Daisuke Kihara
ViT
21
0
0
13 Jun 2024
Semantic Equitable Clustering: A Simple, Fast and Effective Strategy for Vision Transformer
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
30
3
0
22 May 2024
Vision Transformer with Sparse Scan Prior
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
ViT
36
4
0
22 May 2024
MambaOut: Do We Really Need Mamba for Vision?
Weihao Yu
Xinchao Wang
Mamba
31
46
0
13 May 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
Jienneg Chen
Qihang Yu
Xiaohui Shen
Alan L. Yuille
Liang-Chieh Chen
3DV
VLM
28
24
0
02 Apr 2024
ACC-ViT : Atrous Convolution's Comeback in Vision Transformers
Nabil Ibtehaz
Ning Yan
Masood S. Mortazavi
Daisuke Kihara
ViT
19
3
0
07 Mar 2024
SPFormer: Enhancing Vision Transformer with Superpixel Representation
Jieru Mei
Liang-Chieh Chen
Alan L. Yuille
Cihang Xie
ViT
MDE
19
4
0
05 Jan 2024
Point Deformable Network with Enhanced Normal Embedding for Point Cloud Analysis
Xingyilang Yin
Xi Yang
Liangchen Liu
Nannan Wang
Xinbo Gao
3DPC
18
3
0
20 Dec 2023
MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation
Abdullah Rashwan
Jiageng Zhang
A. Taalimi
Fan Yang
Xingyi Zhou
Chaochao Yan
Liang-Chieh Chen
Yeqing Li
ViT
17
5
0
11 Dec 2023
ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model
Fukun Yin
Xin Chen
C. Zhang
Biao Jiang
Zibo Zhao
Jiayuan Fan
Gang Yu
Taihao Li
Tao Chen
10
19
0
29 Nov 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Yizhou Yu
Chuan Wu
Yizhou Yu
ViT
25
32
0
30 Oct 2023
BLIP-Adapter: Parameter-Efficient Transfer Learning for Mobile Screenshot Captioning
Ching-Yu Chiang
I-Hua Chang
Shih-Wei Liao
33
1
0
26 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
24
65
0
20 Sep 2023
RevColV2: Exploring Disentangled Representations in Masked Image Modeling
Qi Han
Yuxuan Cai
Xiangyu Zhang
19
7
0
02 Sep 2023
Adaptive Frequency Filters As Efficient Global Token Mixers
Zhipeng Huang
Zhizheng Zhang
Cuiling Lan
Zhengjun Zha
Yan Lu
B. Guo
19
36
0
26 Jul 2023
Fourier-Mixed Window Attention: Accelerating Informer for Long Sequence Time-Series Forecasting
Nhat Tran
Jack Xin
AI4TS
23
3
0
02 Jul 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
23
27
0
01 Jun 2023
Dual Path Transformer with Partition Attention
Zhengkai Jiang
Liang Liu
Jiangning Zhang
Yabiao Wang
Mingang Chen
Chengjie Wang
ViT
25
2
0
24 May 2023
Rethinking Local Perception in Lightweight Vision Transformer
Qi Fan
Huaibo Huang
Jiyang Guan
Ran He
ViT
13
25
0
31 Mar 2023
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
Pavan Kumar Anasosalu Vasu
J. Gabriel
Jeff J. Zhu
Oncel Tuzel
Anurag Ranjan
ViT
20
146
0
24 Mar 2023
EVA-02: A Visual Representation for Neon Genesis
Yuxin Fang
Quan-Sen Sun
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
ViT
CLIP
26
252
0
20 Mar 2023
Rethinking Mobile Block for Efficient Attention-based Models
Jiangning Zhang
Xiangtai Li
Jian Li
Liang Liu
Zhucun Xue
Boshen Zhang
Zhe Jiang
Tianxin Huang
Yabiao Wang
Chengjie Wang
MQ
44
78
0
03 Jan 2023
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Mingg-Ming Cheng
Jiashi Feng
ViT
12
128
0
22 Nov 2022
Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information
Weijie Su
Xizhou Zhu
Chenxin Tao
Lewei Lu
Bin Li
Gao Huang
Yu Qiao
Xiaogang Wang
Jie Zhou
Jifeng Dai
20
31
0
17 Nov 2022
DETRs with Hybrid Matching
Ding Jia
Yuhui Yuan
Hao He
Xiao-pei Wu
Haojun Yu
Weihong Lin
Lei-huan Sun
Chao Zhang
Hanhua Hu
13
179
0
26 Jul 2022
Vision Transformers: From Semantic Segmentation to Dense Prediction
Li Zhang
Jiachen Lu
Sixiao Zheng
Xinxuan Zhao
Xiatian Zhu
Yanwei Fu
Tao Xiang
Jianfeng Feng
Philip H. S. Torr
ViT
13
7
0
19 Jul 2022
TubeFormer-DeepLab: Video Mask Transformer
Dahun Kim
Jun Xie
Huiyu Wang
Siyuan Qiao
Qihang Yu
Hong-Seok Kim
Hartwig Adam
In So Kweon
Liang-Chieh Chen
ViT
MedIm
75
40
0
30 May 2022
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
Sachin Mehta
Mohammad Rastegari
ViT
184
1,148
0
05 Oct 2021
Mobile-Former: Bridging MobileNet and Transformer
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Xiaoyi Dong
Lu Yuan
Zicheng Liu
ViT
167
462
0
12 Aug 2021
CMT: Convolutional Neural Networks Meet Vision Transformers
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Chunjing Xu
Yunhe Wang
Chang Xu
ViT
323
500
0
13 Jul 2021
Visformer: The Vision-friendly Transformer
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
106
206
0
26 Apr 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,538
0
24 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
260
165
0
17 Feb 2021
High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock
Soham De
Samuel L. Smith
Karen Simonyan
VLM
220
450
0
11 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
262
955
0
27 Jan 2021
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,214
0
17 Apr 2017
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
261
10,106
0
16 Nov 2016
Xception: Deep Learning with Depthwise Separable Convolutions
François Chollet
MDE
BDL
PINN
193
14,190
0
07 Oct 2016
1
2
Next