Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2202.09741
Cited By
Visual Attention Network
20 February 2022
Meng-Hao Guo
Chengrou Lu
Zheng-Ning Liu
Ming-Ming Cheng
Shiyong Hu
ViT
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Visual Attention Network"
46 / 46 papers shown
Title
MASF-YOLO: An Improved YOLOv11 Network for Small Object Detection on Drone View
Liugang Lu
Dabin He
Congxiang Liu
Zhixiang Deng
44
0
0
25 Apr 2025
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
Meng Lou
Yizhou Yu
110
1
0
27 Feb 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen
Xinbin Yuan
Ruiqi Wu
Jiabao Wang
Qibin Hou
Mingg-Ming Cheng
Ming-Ming Cheng
ObjD
141
51
0
21 Feb 2025
Heuristical Comparison of Vision Transformers Against Convolutional Neural Networks for Semantic Segmentation on Remote Sensing Imagery
Ashim Dahal
Saydul Akbar Murad
Nick Rahimi
ViT
29
1
0
14 Nov 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
28
0
0
12 Nov 2024
LGFN: Lightweight Light Field Image Super-Resolution using Local Convolution Modulation and Global Attention Feature Extraction
Zhongxin Yu
Liang Chen
Zhiyun Zeng
Kunping Yang
Shaofei Luo
Shaorui Chen
Cheng Zhong
SupR
20
0
0
26 Sep 2024
Gradients of Functions of Large Matrices
Nicholas Krämer
Pablo Moreno-Muñoz
Hrittik Roy
Søren Hauberg
27
0
0
27 May 2024
Vision Transformer with Sparse Scan Prior
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
ViT
36
4
0
22 May 2024
AudioRepInceptionNeXt: A lightweight single-stream architecture for efficient audio recognition
Kin Wai Lau
Yasar Abbas Ur Rehman
L. Po
28
1
0
21 Apr 2024
Efficient Modulation for Vision Networks
Xu Ma
Xiyang Dai
Jianwei Yang
Bin Xiao
Yinpeng Chen
Yun Fu
Lu Yuan
33
17
0
29 Mar 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
41
7
0
28 Mar 2024
Skin Cancer Segmentation and Classification Using Vision Transformer for Automatic Analysis in Dermatoscopy-based Non-invasive Digital System
Galib Muhammad Shahriar Himel
Md. Masudul Islam
Kh Abdullah Al-Aff
Shams Ibne Karim
Md. Kabir Uddin Sikder
MedIm
13
23
0
09 Jan 2024
SVQ: Sparse Vector Quantization for Spatiotemporal Forecasting
Chao Chen
Tian Zhou
Yanjun Zhao
Hui Liu
Liang Sun
Rong Jin
25
0
0
06 Dec 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Yizhou Yu
Chuan Wu
Yizhou Yu
ViT
31
35
0
30 Oct 2023
MuraNet: Multi-task Floor Plan Recognition with Relation Attention
Lingxiao Huang
Jung-Hsuan Wu
Chiching Wei
Wilson Li
16
2
0
01 Sep 2023
Real-time Automatic M-mode Echocardiography Measurement with Panel Attention from Local-to-Global Pixels
Ching-Hsun Tseng
S. Chien
Po-Shen Wang
Shin-Jye Lee
Wei-Huan Hu
Bin Pu
Xiaojun Zeng
19
1
0
15 Aug 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
25
27
0
01 Jun 2023
StageInteractor: Query-based Object Detector with Cross-stage Interaction
Yao Teng
Haisong Liu
Sheng Guo
Limin Wang
ObjD
24
8
0
11 Apr 2023
InceptionNeXt: When Inception Meets ConvNeXt
Weihao Yu
Pan Zhou
Shuicheng Yan
Xinchao Wang
34
117
0
29 Mar 2023
Equiangular Basis Vectors
Yang Shen
Xuhao Sun
Xiuying Wei
28
7
0
21 Mar 2023
Long Range Pooling for 3D Large-Scale Scene Understanding
Xiang-Li Li
Meng-Hao Guo
Tai-Jiang Mu
Ralph Robert Martin
Shiyong Hu
3DV
3DPC
6
2
0
17 Jan 2023
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
Keyu Tian
Yi-Xin Jiang
Qishuai Diao
Chen Lin
Liwei Wang
Zehuan Yuan
11
98
0
09 Jan 2023
LOANet: A Lightweight Network Using Object Attention for Extracting Buildings and Roads from UAV Aerial Remote Sensing Images
Xiaoxiang Han
Yiman Liu
Gang Liu
Yuanjie Lin
Qiaohong Liu
22
11
0
16 Dec 2022
MetaFormer Baselines for Vision
Weihao Yu
Chenyang Si
Pan Zhou
Mi Luo
Yichen Zhou
Jiashi Feng
Shuicheng Yan
Xinchao Wang
MoE
12
155
0
24 Oct 2022
ZITS++: Image Inpainting by Improving the Incremental Transformer on Structural Priors
Chenjie Cao
Qiaole Dong
Yanwei Fu
25
30
0
12 Oct 2022
LGC-Net: A Lightweight Gyroscope Calibration Network for Efficient Attitude Estimation
Yaohua Liu
Wei Liang
Jinqiang Cui
10
7
0
19 Sep 2022
Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning
Cheng Tan
Zhangyang Gao
Lirong Wu
Yongjie Xu
Jun-Xiong Xia
Siyuan Li
Stan Z. Li
25
102
0
24 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
28
32
0
19 Jun 2022
Efficient Progressive High Dynamic Range Image Restoration via Attention and Alignment Network
G. Yu
Jin Zhang
Zhe Ma
Hongbin Wang
18
7
0
20 Apr 2022
Towards An End-to-End Framework for Flow-Guided Video Inpainting
Z. Li
Cheng Lu
Jia Qin
Chunle Guo
Mingg-Ming Cheng
41
149
0
06 Apr 2022
Focal Modulation Networks
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
22
261
0
22 Mar 2022
DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
Shilong Liu
Feng Li
Hao Zhang
X. Yang
Xianbiao Qi
Hang Su
Jun Zhu
Lei Zhang
ViT
138
703
0
28 Jan 2022
SWAT: Spatial Structure Within and Among Tokens
Kumara Kahatapitiya
Michael S. Ryoo
12
6
0
26 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Yinghui Li
Li Tao
Dun Liang
Haitao Zheng
77
96
0
07 Nov 2021
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng
Meng-Hao Guo
Hongxu Chen
Xia Li
Ke Wei
Zhouchen Lin
51
134
0
09 Sep 2021
CMT: Convolutional Neural Networks Meet Vision Transformers
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Chunjing Xu
Yunhe Wang
Chang Xu
ViT
331
500
0
13 Jul 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,554
0
04 May 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
282
1,490
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,538
0
24 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
260
178
0
17 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
267
955
0
27 Jan 2021
Deep High-Resolution Representation Learning for Visual Recognition
Jingdong Wang
Ke Sun
Tianheng Cheng
Borui Jiang
Chaorui Deng
...
Yadong Mu
Mingkui Tan
Xinggang Wang
Wenyu Liu
Bin Xiao
190
3,480
0
20 Aug 2019
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,214
0
17 Apr 2017
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
261
10,106
0
16 Nov 2016
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
249
1,817
0
18 Aug 2016
1