1906.05909
Cited By
Stand-Alone Self-Attention in Vision Models
13 June 2019
Prajit Ramachandran
Niki Parmar
Ashish Vaswani
Irwan Bello
Anselm Levskaya
Jonathon Shlens
VLM
SLR
ViT
Papers citing
"Stand-Alone Self-Attention in Vision Models"
50 / 217 papers shown
Person Recognition at Altitude and Range: Fusion of Face, Body Shape and Gait
Feng Liu
Nicholas Chimitt
Lanqing Guo
Jitesh Jain
Aditya Kane
...
Arun Ross
Humphrey Shi
Zhangyang Wang
A. Jain
Xiaoming Liu
CVBM
27
1
0
07 May 2025
Back to Fundamentals: Low-Level Visual Features Guided Progressive Token Pruning
Yuanbing Ouyang
Yizhuo Liang
Qingpeng Li
Xinfei Guo
Yiming Luo
Di Wu
Hao Wang
Yushan Pan
ViT
VLM
73
0
0
25 Apr 2025
Group-based Distinctive Image Captioning with Memory Difference Encoding and Attention
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
42
0
0
03 Apr 2025
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
Meng Lou
Yizhou Yu
115
1
0
27 Feb 2025
DeepInteraction++: Multi-Modality Interaction for Autonomous Driving
Zeyu Yang
Nan Song
Wei Li
Xiatian Zhu
L. Zhang
Philip H. S. Torr
77
4
0
24 Feb 2025
Rethinking Attention Module Design for Point Cloud Analysis
Chengzhi Wu
Kaige Wang
Zeyun Zhong
Hao Fu
Junwei Zheng
Jiaming Zhang
Julius Pfrommer
Jürgen Beyerer
3DPC
46
1
0
27 Jul 2024
SwinSF: Image Reconstruction from Spatial-Temporal Spike Streams
Liangyan Jiang
Chuang Zhu
Yanxu Chen
50
2
0
22 Jul 2024
HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution
Xiang Zhang
Yulun Zhang
Fisher Yu
37
15
0
08 Jul 2024
The Balanced-Pairwise-Affinities Feature Transform
Daniel Shalam
Simon Korman
35
0
0
25 Jun 2024
SYM3D: Learning Symmetric Triplanes for Better 3D-Awareness of GANs
Jing Yang
Kyle Fogarty
Fangcheng Zhong
Cengiz Oztireli
47
1
0
10 Jun 2024
Steerable Transformers
Soumyabrata Kundu
Risi Kondor
ViT
LLMSV
30
1
0
24 May 2024
Compression-Realized Deep Structural Network for Video Quality Enhancement
Hanchi Sun
Xiaohong Liu
Xinyang Jiang
Yifei Shen
Dongsheng Li
Xiongkuo Min
Guangtao Zhai
32
1
0
10 May 2024
When Medical Imaging Met Self-Attention: A Love Story That Didn't Quite Work Out
Tristan Piater
Niklas Penzel
Gideon Stein
Joachim Denzler
39
2
0
18 Apr 2024
CascadedGaze: Efficiency in Global Context Extraction for Image Restoration
Amirhosein Ghasemabadi
Muhammad Kamran Janjua
Mohammad Salameh
Chunhua Zhou
Fengyu Sun
Di Niu
32
11
0
26 Jan 2024
Fast Registration of Photorealistic Avatars for VR Facial Animation
Chaitanya Patel
Shaojie Bai
Tenia Wang
Jason M. Saragih
S. Wei
23
0
0
19 Jan 2024
Integrating Human Vision Perception in Vision Transformers for Classifying Waste Items
Akshat Shrivastava
Tapan K. Gandhi
24
1
0
19 Dec 2023
Delving Deeper Into Astromorphic Transformers
Md. Zesun Ahmed Mia
Malyaban Bal
Abhronil Sengupta
34
1
0
18 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
33
0
0
01 Dec 2023
Improving Robustness for Vision Transformer with a Simple Dynamic Scanning Augmentation
Shashank Kotyan
Danilo Vasconcellos Vargas
ViT
22
2
0
01 Nov 2023
GTA: A Geometry-Aware Attention Mechanism for Multi-View Transformers
Takeru Miyato
Bernhard Jaeger
Max Welling
Andreas Geiger
ViT
36
14
0
16 Oct 2023
Real-time Automatic M-mode Echocardiography Measurement with Panel Attention from Local-to-Global Pixels
Ching-Hsun Tseng
S. Chien
Po-Shen Wang
Shin-Jye Lee
Wei-Huan Hu
Bin Pu
Xiaojun Zeng
19
1
0
15 Aug 2023
Seed Kernel Counting using Domain Randomization and Object Tracking Neural Networks
Venkat Margapuri
Prapti Thapaliya
Michael L. Neilsen
35
0
0
10 Aug 2023
EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marçal Rusiñol
8
6
0
11 May 2023
Early Detection of Alzheimer's Disease using Bottleneck Transformers
Arunima Jaiswal
Ananya Sadana
MedIm
18
2
0
01 May 2023
AutoFocusFormer: Image Segmentation off the Grid
Chen Ziwen
K. Patnaik
Shuangfei Zhai
Alvin Wan
Zhile Ren
A. Schwing
Alex Colburn
Li Fuxin
17
9
0
24 Apr 2023
How Will It Drape Like? Capturing Fabric Mechanics from Depth Images
Carlos Rodriguez-Pardo
Melania Prieto-Martin
Dan Casas
Elena Garces
28
12
0
13 Apr 2023
InceptionNeXt: When Inception Meets ConvNeXt
Weihao Yu
Pan Zhou
Shuicheng Yan
Xinchao Wang
48
117
0
29 Mar 2023
Machine Learning for Brain Disorders: Transformers and Visual Transformers
Robin Courant
Maika Edberg
Nicolas Dufour
Vicky Kalogeiton
MedIm
ViT
27
1
0
21 Mar 2023
AdPE: Adversarial Positional Embeddings for Pretraining Vision Transformers via MAE+
Xiao Wang
Ying Wang
Ziwei Xuan
Guo-Jun Qi
ViT
42
3
0
14 Mar 2023
Efficient and Explicit Modelling of Image Hierarchies for Image Restoration
Yawei Li
Yuchen Fan
Xiaoyu Xiang
D. Demandolx
Rakesh Ranjan
Radu Timofte
Luc Van Gool
21
173
0
01 Mar 2023
Pixel Difference Convolutional Network for RGB-D Semantic Segmentation
Jun Yang
Lizhi Bai
Yaoru Sun
Chunqi Tian
Maoyu Mao
Guorun Wang
SSeg
18
16
0
23 Feb 2023
STB-VMM: Swin Transformer Based Video Motion Magnification
Ricard Lado-Roigé
M. A. Pérez
18
13
0
20 Feb 2023
Hyneter: Hybrid Network Transformer for Object Detection
Dong Chen
Duoqian Miao
Xuepeng Zhao
ViT
27
3
0
18 Feb 2023
Convolution-enhanced Evolving Attention Networks
Yujing Wang
Yaming Yang
Zhuowan Li
Jiangang Bai
Mingliang Zhang
Xiangtai Li
J. Yu
Ce Zhang
Gao Huang
Yu Tong
ViT
19
6
0
16 Dec 2022
Gaussian Radar Transformer for Semantic Segmentation in Noisy Radar Data
Matthias Zeller
Jens Behley
Michael Heidingsfeld
C. Stachniss
24
23
0
07 Dec 2022
Lightweight Structure-Aware Attention for Visual Understanding
Heeseung Kwon
F. M. Castro
M. Marín-Jiménez
N. Guil
Alahari Karteek
28
2
0
29 Nov 2022
Semantic-Aware Local-Global Vision Transformer
Jiatong Zhang
Zengwei Yao
Fanglin Chen
Guangming Lu
Wenjie Pei
ViT
23
0
0
27 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Ming-Ming Cheng
Jiashi Feng
ViT
31
129
0
22 Nov 2022
Patch-level Gaze Distribution Prediction for Gaze Following
Qiaomu Miao
Minh Hoai
Dimitris Samaras
11
16
0
20 Nov 2022
Multi-Camera Multi-Object Tracking on the Move via Single-Stage Global Association Approach
Pha Nguyen
Kha Gia Quach
C. Duong
S. L. Phung
Ngan Le
Khoa Luu
44
12
0
17 Nov 2022
Prompt Tuning for Parameter-efficient Medical Image Segmentation
Marc Fischer
Alexander Bartler
Bin Yang
SSeg
14
18
0
16 Nov 2022
FedTP: Federated Learning by Transformer Personalization
Hongxia Li
Zhongyi Cai
Jingya Wang
Jiangnan Tang
Weiping Ding
Chin-Teng Lin
Ye-ling Shi
FedML
32
59
0
03 Nov 2022
Valuing Vicinity: Memory attention framework for context-based semantic segmentation in histopathology
Oliver Ester
Fabian Horst
C. Seibold
J. Keyl
Saskia Ting
...
P. Ivanyi
Viktor Grünwald
J. Bräsen
Jan Egger
Jens Kleesiek
19
7
0
21 Oct 2022
Reconstructed Student-Teacher and Discriminative Networks for Anomaly Detection
Shinji Yamada
Satoshi Kamiya
Kazuhiro Hotta
23
29
0
14 Oct 2022
SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds
Pei Sun
Mingxing Tan
Weiyue Wang
Chenxi Liu
Fei Xia
Zhaoqi Leng
Drago Anguelov
ViT
21
114
0
13 Oct 2022
ConvTransSeg: A Multi-resolution Convolution-Transformer Network for Medical Image Segmentation
Zhendi Gong
Andrew P. French
Guoping Qiu
Xin Chen
ViT
MedIm
32
6
0
13 Oct 2022
DCANet: Differential Convolution Attention Network for RGB-D Semantic Segmentation
Lizhi Bai
Jun Yang
Chunqi Tian
Yaoru Sun
Maoyu Mao
Yanjun Xu
Weirong Xu
8
9
0
13 Oct 2022
Attention-Based Generative Neural Image Compression on Solar Dynamics Observatory
Ali Zafari
Atefeh Khoshkhahtinat
P. Mehta
Nasser M. Nasrabadi
B. Thompson
D. D. Silva
M. Kirk
13
9
0
12 Oct 2022
Centralized Feature Pyramid for Object Detection
Yu Quan
Dong Zhang
Liyan Zhang
Jinhui Tang
ObjD
26
147
0
05 Oct 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
33
58
0
04 Oct 2022