ResearchTrend.AI
Visual Attention Network


20 February 2022
Meng-Hao Guo
Cheng-Ze Lu
Zheng-Ning Liu
Ming-Ming Cheng
Shi-Min Hu
    ViT
    VLM

Papers citing "Visual Attention Network"

46 / 46 papers shown
Title
MASF-YOLO: An Improved YOLOv11 Network for Small Object Detection on Drone View
Liugang Lu
Dabin He
Congxiang Liu
Zhixiang Deng
44
0
0
25 Apr 2025
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
Meng Lou
Yizhou Yu
110
1
0
27 Feb 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen
Xinbin Yuan
Ruiqi Wu
Jiabao Wang
Qibin Hou
Ming-Ming Cheng
ObjD
141
51
0
21 Feb 2025
Heuristical Comparison of Vision Transformers Against Convolutional Neural Networks for Semantic Segmentation on Remote Sensing Imagery
Ashim Dahal
Saydul Akbar Murad
Nick Rahimi
ViT
29
1
0
14 Nov 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
28
0
0
12 Nov 2024
LGFN: Lightweight Light Field Image Super-Resolution using Local Convolution Modulation and Global Attention Feature Extraction
Zhongxin Yu
Liang Chen
Zhiyun Zeng
Kunping Yang
Shaofei Luo
Shaorui Chen
Cheng Zhong
SupR
20
0
0
26 Sep 2024
Gradients of Functions of Large Matrices
Nicholas Krämer
Pablo Moreno-Muñoz
Hrittik Roy
Søren Hauberg
27
0
0
27 May 2024
Vision Transformer with Sparse Scan Prior
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
ViT
36
4
0
22 May 2024
AudioRepInceptionNeXt: A lightweight single-stream architecture for efficient audio recognition
Kin Wai Lau
Yasar Abbas Ur Rehman
L. Po
28
1
0
21 Apr 2024
Efficient Modulation for Vision Networks
Xu Ma
Xiyang Dai
Jianwei Yang
Bin Xiao
Yinpeng Chen
Yun Fu
Lu Yuan
33
17
0
29 Mar 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
41
7
0
28 Mar 2024
Skin Cancer Segmentation and Classification Using Vision Transformer for Automatic Analysis in Dermatoscopy-based Non-invasive Digital System
Galib Muhammad Shahriar Himel
Md. Masudul Islam
Kh Abdullah Al-Aff
Shams Ibne Karim
Md. Kabir Uddin Sikder
MedIm
13
23
0
09 Jan 2024
SVQ: Sparse Vector Quantization for Spatiotemporal Forecasting
Chao Chen
Tian Zhou
Yanjun Zhao
Hui Liu
Liang Sun
Rong Jin
25
0
0
06 Dec 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
31
35
0
30 Oct 2023
MuraNet: Multi-task Floor Plan Recognition with Relation Attention
Lingxiao Huang
Jung-Hsuan Wu
Chiching Wei
Wilson Li
16
2
0
01 Sep 2023
Real-time Automatic M-mode Echocardiography Measurement with Panel Attention from Local-to-Global Pixels
Ching-Hsun Tseng
S. Chien
Po-Shen Wang
Shin-Jye Lee
Wei-Huan Hu
Bin Pu
Xiaojun Zeng
19
1
0
15 Aug 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
25
27
0
01 Jun 2023
StageInteractor: Query-based Object Detector with Cross-stage Interaction
Yao Teng
Haisong Liu
Sheng Guo
Limin Wang
ObjD
24
8
0
11 Apr 2023
InceptionNeXt: When Inception Meets ConvNeXt
Weihao Yu
Pan Zhou
Shuicheng Yan
Xinchao Wang
34
117
0
29 Mar 2023
Equiangular Basis Vectors
Yang Shen
Xuhao Sun
Xiuying Wei
28
7
0
21 Mar 2023
Long Range Pooling for 3D Large-Scale Scene Understanding
Xiang-Li Li
Meng-Hao Guo
Tai-Jiang Mu
Ralph Robert Martin
Shi-Min Hu
3DV
3DPC
6
2
0
17 Jan 2023
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
Keyu Tian
Yi-Xin Jiang
Qishuai Diao
Chen Lin
Liwei Wang
Zehuan Yuan
11
98
0
09 Jan 2023
LOANet: A Lightweight Network Using Object Attention for Extracting Buildings and Roads from UAV Aerial Remote Sensing Images
Xiaoxiang Han
Yiman Liu
Gang Liu
Yuanjie Lin
Qiaohong Liu
22
11
0
16 Dec 2022
MetaFormer Baselines for Vision
Weihao Yu
Chenyang Si
Pan Zhou
Mi Luo
Yichen Zhou
Jiashi Feng
Shuicheng Yan
Xinchao Wang
MoE
12
155
0
24 Oct 2022
ZITS++: Image Inpainting by Improving the Incremental Transformer on Structural Priors
Chenjie Cao
Qiaole Dong
Yanwei Fu
25
30
0
12 Oct 2022
LGC-Net: A Lightweight Gyroscope Calibration Network for Efficient Attitude Estimation
Yaohua Liu
Wei Liang
Jinqiang Cui
10
7
0
19 Sep 2022
Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning
Cheng Tan
Zhangyang Gao
Lirong Wu
Yongjie Xu
Jun-Xiong Xia
Siyuan Li
Stan Z. Li
25
102
0
24 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
28
32
0
19 Jun 2022
Efficient Progressive High Dynamic Range Image Restoration via Attention and Alignment Network
G. Yu
Jin Zhang
Zhe Ma
Hongbin Wang
18
7
0
20 Apr 2022
Towards An End-to-End Framework for Flow-Guided Video Inpainting
Z. Li
Cheng Lu
Jia Qin
Chunle Guo
Ming-Ming Cheng
41
149
0
06 Apr 2022
Focal Modulation Networks
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
22
261
0
22 Mar 2022
DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
Shilong Liu
Feng Li
Hao Zhang
X. Yang
Xianbiao Qi
Hang Su
Jun Zhu
Lei Zhang
ViT
138
703
0
28 Jan 2022
SWAT: Spatial Structure Within and Among Tokens
Kumara Kahatapitiya
Michael S. Ryoo
12
6
0
26 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Yinghui Li
Li Tao
Dun Liang
Haitao Zheng
77
96
0
07 Nov 2021
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng
Meng-Hao Guo
Hongxu Chen
Xia Li
Ke Wei
Zhouchen Lin
51
134
0
09 Sep 2021
CMT: Convolutional Neural Networks Meet Vision Transformers
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Chunjing Xu
Yunhe Wang
Chang Xu
ViT
331
500
0
13 Jul 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,554
0
04 May 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
282
1,490
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,538
0
24 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
260
178
0
17 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
267
955
0
27 Jan 2021
Deep High-Resolution Representation Learning for Visual Recognition
Jingdong Wang
Ke Sun
Tianheng Cheng
Borui Jiang
Chaorui Deng
...
Yadong Mu
Mingkui Tan
Xinggang Wang
Wenyu Liu
Bin Xiao
190
3,480
0
20 Aug 2019
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,214
0
17 Apr 2017
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
261
10,106
0
16 Nov 2016
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
249
1,817
0
18 Aug 2016