2303.15446
Cited By
SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
27 March 2023
Abdelrahman M. Shaker
Muhammad Maaz
H. Rasheed
Salman Khan
Ming Yang
F. Khan
ViT
Papers citing
"SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications"
45 / 45 papers shown
Title
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
W. Xu
Shibiao Xu
ViT
60
0
0
06 May 2025
An Adaptive Data-Resilient Multi-Modal Framework for Hierarchical Multi-Label Book Genre Identification
Utsav Nareti
S. Chattopadhyay
Prolay Mallick
Suraj Kumar
Ayush Vikas Daga
Chandranath Adak
Adarsh Wase
Arjab Roy
18
0
0
05 May 2025
The Fourth Monocular Depth Estimation Challenge
Anton Obukhov
Matteo Poggi
Fabio Tosi
Ripudaman Singh Arora
Jaime Spencer
...
Tuan-Anh Yang
Minh-Quang Nguyen
T. Tran
Albert Luginov
Muhammad Shahzad
MDE
55
0
0
24 Apr 2025
LSNet: See Large, Focus Small
Ao Wang
Hui Chen
Zijia Lin
J. Han
Guiguang Ding
37
0
0
29 Mar 2025
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model
Abdelrahman M. Shaker
Muhammad Maaz
Chenhui Gou
Hamid Rezatofighi
Salman Khan
F. Khan
70
0
0
27 Mar 2025
Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food Recognition
Shun Zou
Yi Zou
Mingya Zhang
Shipeng Luo
Zhihao Chen
Guangwei Gao
ViT
43
0
0
15 Mar 2025
Partial Convolution Meets Visual Attention
Haiduo Huang
Fuwei Yang
D. Li
Ji Liu
Lu Tian
Jinzhang Peng
Pengju Ren
E. Barsoum
3DH
112
0
0
05 Mar 2025
Neural Attention: A Novel Mechanism for Enhanced Expressive Power in Transformer Models
Andrew DiGiugno
Ausif Mahmood
33
0
0
24 Feb 2025
iFormer: Integrating ConvNet and Transformer for Mobile Application
Chuanyang Zheng
ViT
67
0
0
26 Jan 2025
Rethinking Encoder-Decoder Flow Through Shared Structures
Frederik Laboyrie
M. K. Yucel
Albert Saà-Garriga
AI4CE
40
0
0
24 Jan 2025
SurgRIPE challenge: Benchmark of Surgical Robot Instrument Pose Estimation
Haozheng Xu
Alistair Weld
Chi Xu
Alfie Roddan
João Cartucho
...
Lucy Fothergill
Dominic Jones
Pietro Valdastri
Duygu Sarikaya
Stamatia Giannarou
27
1
0
06 Jan 2025
A Separable Self-attention Inspired by the State Space Model for Computer Vision
Juntao Zhang
Shaogeng Liu
Kun Bian
You Zhou
Pei Zhang
Jianning Liu
Jun Zhou
Bingyan Liu
Mamba
45
0
0
03 Jan 2025
RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations
Mingshu Zhao
Yi Luo
Yong Ouyang
31
0
0
27 Dec 2024
TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba
Xiaowen Ma
Zhenliang Ni
Xinghao Chen
Mamba
73
2
0
26 Nov 2024
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
110
3
0
22 Nov 2024
Compositional Segmentation of Cardiac Images Leveraging Metadata
Abbas Khan
Muhammad Asad
Martin Benning
C. Roney
Gregory Slabaugh
26
0
0
30 Oct 2024
Lodge++: High-quality and Long Dance Generation with Vivid Choreography Patterns
Ronghui Li
Hongwen Zhang
Yachao Zhang
Yuxiang Zhang
Youliang Zhang
Jie Guo
Yan Zhang
Xiu Li
Yebin Liu
30
6
0
27 Oct 2024
PETAH: Parameter Efficient Task Adaptation for Hybrid Transformers in a resource-limited Context
Maximilian Augustin
Syed Shakib Sarwar
Mostafa Elhoushi
Sai Qian Zhang
Yuecheng Li
B. D. Salvo
20
0
0
23 Oct 2024
HRVMamba: High-Resolution Visual State Space Model for Dense Prediction
Hao Zhang
Yongqiang Ma
Wenqi Shao
Ping Luo
Nanning Zheng
Kaipeng Zhang
Mamba
28
1
0
04 Oct 2024
NimbleD: Enhancing Self-supervised Monocular Depth Estimation with Pseudo-labels and Large-scale Video Pre-training
Albert Luginov
Muhammad Shahzad
SSL
MDE
29
1
0
26 Aug 2024
CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications
Tianfang Zhang
Lei Li
Yang Zhou
Wentao Liu
Chen Qian
Xiangyang Ji
ViT
28
9
0
07 Aug 2024
GroupMamba: Efficient Group-Based Visual State Space Model
Abdelrahman M. Shaker
Syed Talal Wasim
Salman Khan
Juergen Gall
Fahad Shahbaz Khan
Mamba
51
0
0
18 Jul 2024
PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer
Pierre-David Létourneau
Manish Kumar Singh
Hsin-Pai Cheng
Shizhong Han
Yunxiao Shi
Dalton Jones
M. H. Langston
Hong Cai
Fatih Porikli
32
0
0
16 Jul 2024
RepNeXt: A Fast Multi-Scale CNN using Structural Reparameterization
Mingshu Zhao
Yi Luo
Yong Ouyang
30
2
0
23 Jun 2024
Decoupling Forgery Semantics for Generalizable Deepfake Detection
Wei Ye
Xinan He
Feng Ding
30
8
0
14 Jun 2024
ToSA: Token Selective Attention for Efficient Vision Transformers
Manish Kumar Singh
R. Yasarla
Hong Cai
Mingu Lee
Fatih Porikli
44
0
0
13 Jun 2024
Convolution and Attention-Free Mamba-based Cardiac Image Segmentation
Abbas Khan
Muhammad Asad
Martin Benning
C. Roney
Gregory Slabaugh
Mamba
22
2
0
09 Jun 2024
Automatic Channel Pruning for Multi-Head Attention
Eunho Lee
Youngbae Hwang
ViT
32
1
0
31 May 2024
CUE-Net: Violence Detection Video Analytics with Spatial Cropping, Enhanced UniformerV2 and Modified Efficient Additive Attention
Damith Chamalke Senadeera
Xiaoyun Yang
Dimitrios Kollias
Gregory G. Slabaugh
27
0
0
27 Apr 2024
HSViT: Horizontally Scalable Vision Transformer
Chenhao Xu
Chang-Tsun Li
Chee Peng Lim
Douglas Creighton
ViT
21
1
0
08 Apr 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
41
7
0
28 Mar 2024
PEM: Prototype-based Efficient MaskFormer for Image Segmentation
Niccolò Cavagnero
Gabriele Rosi
Claudia Cuttano
Francesca Pistilli
Marco Ciccone
Giuseppe Averta
Fabio Cermelli
38
21
0
29 Feb 2024
Crop and Couple: cardiac image segmentation using interlinked specialist networks
Abbas Khan
Muhammad Asad
Martin Benning
C. Roney
Gregory Slabaugh
27
3
0
14 Feb 2024
SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design
Seokju Yun
Youngmin Ro
ViT
34
29
0
29 Jan 2024
Single-sample versus case-control sampling scheme for Positive Unlabeled data: the story of two scenarios
Jan Mielniczuk
Adam Wawrzeńczyk
10
2
0
04 Dec 2023
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
Pavan Kumar Anasosalu Vasu
Hadi Pouransari
Fartash Faghri
Raviteja Vemulapalli
Oncel Tuzel
CLIP
VLM
11
43
0
28 Nov 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
35
3
0
18 Aug 2023
RepViT: Revisiting Mobile CNN From ViT Perspective
Ao Wang
Hui Chen
Zijia Lin
Hengjun Pu
Guiguang Ding
27
169
0
18 Jul 2023
Spike-driven Transformer
Man Yao
Jiakui Hu
Zhaokun Zhou
Liuliang Yuan
Yonghong Tian
Boxing Xu
Guoqi Li
21
109
0
04 Jul 2023
Efficient Large-Scale Visual Representation Learning And Evaluation
Eden Dolev
A. Awad
Denisa Roberts
Zahra Ebrahimzadeh
Marcin Mejran
Vaibhav Malpani
Mahir Yavuz
30
0
0
22 May 2023
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
Sachin Mehta
Mohammad Rastegari
ViT
189
1,148
0
05 Oct 2021
Mobile-Former: Bridging MobileNet and Transformer
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Xiaoyi Dong
Lu Yuan
Zicheng Liu
ViT
172
462
0
12 Aug 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,554
0
04 May 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,538
0
24 Feb 2021
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,214
0
17 Apr 2017