arXiv:2107.14222
Rethinking and Improving Relative Position Encoding for Vision Transformer
29 July 2021
Kan Wu
Houwen Peng
Minghao Chen
Jianlong Fu
Hongyang Chao
ViT
Papers citing
"Rethinking and Improving Relative Position Encoding for Vision Transformer"
50 / 163 papers shown
Title
Person Recognition at Altitude and Range: Fusion of Face, Body Shape and Gait
Feng Liu
Nicholas Chimitt
Lanqing Guo
Jitesh Jain
Aditya Kane
...
Arun Ross
Humphrey Shi
Zhangyang Wang
A. Jain
Xiaoming Liu
CVBM
22
0
0
07 May 2025
LOOPE: Learnable Optimal Patch Order in Positional Embeddings for Vision Transformers
M. Chowdhury
Md Rifat Ur Rahman
Akil Ahmad Taki
25
0
0
19 Apr 2025
Air Quality Prediction with A Meteorology-Guided Modality-Decoupled Spatio-Temporal Network
Hang Yin
Yan Zhang
Jian Xu
Jian-Long Chang
Y. Li
Cheng-Lin Liu
34
0
0
14 Apr 2025
Learning Object Focused Attention
Vivek Trivedy
A. Almalki
Longin Jan Latecki
31
0
0
10 Apr 2025
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
62
0
0
03 Apr 2025
Spectral-Adaptive Modulation Networks for Visual Perception
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Paul Hongsuck Seo
Dong Hwan Kim
34
0
0
31 Mar 2025
Stack Transformer Based Spatial-Temporal Attention Model for Dynamic Multi-Culture Sign Language Recognition
Koki Hirooka
Abu Saleh Musa Miah
Tatsuya Murakami
Yuto Akiba
Yong Seok Hwang
Jungpil Shin
SLR
54
0
0
21 Mar 2025
UniNet: A Unified Multi-granular Traffic Modeling Framework for Network Security
Binghui Wu
D. Divakaran
M. Gurusamy
57
0
0
06 Mar 2025
Partial Convolution Meets Visual Attention
Haiduo Huang
Fuwei Yang
D. Li
Ji Liu
Lu Tian
Jinzhang Peng
Pengju Ren
E. Barsoum
3DH
121
0
0
05 Mar 2025
Constrained Generative Modeling with Manually Bridged Diffusion Models
Saeid Naderiparizi
Xiaoxuan Liang
Berend Zwartsenberg
Frank D. Wood
DiffM
60
0
0
27 Feb 2025
Lightweight yet Efficient: An External Attentive Graph Convolutional Network with Positional Prompts for Sequential Recommendation
Jinyu Zhang
Chao Li
Zhongying Zhao
62
0
0
21 Feb 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Yunshan Zhong
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Fei Chao
Zhanpeng Zeng
Rongrong Ji
MQ
89
0
0
31 Dec 2024
Harmformer: Harmonic Networks Meet Transformers for Continuous Roto-Translation Equivariance
Tomáš Karella
Adam Harmanec
J. Kotera
Jan Blažek
F. Šroubek
21
1
0
06 Nov 2024
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Pingyi Chen
Zhongyi Shui
Chenglu Zhu
Lin Yang
MedIm
32
4
0
18 Oct 2024
Toward Robust Real-World Audio Deepfake Detection: Closing the Explainability Gap
Georgia Channing
Juil Sock
Ronald Clark
Philip H. S. Torr
Christian Schroeder de Witt
30
2
0
09 Oct 2024
Tackling the Abstraction and Reasoning Corpus with Vision Transformers: the Importance of 2D Representation, Positions, and Objects
Wenhao Li
Yudong Xu
Scott Sanner
Elias Boutros Khalil
ViT
29
3
0
08 Oct 2024
3D-LSPTM: An Automatic Framework with 3D-Large-Scale Pretrained Model for Laryngeal Cancer Detection Using Laryngoscopic Videos
Meiyu Qiu
Y. Li
Wenjun Huang
Haoyun Zhang
Weiping Zheng
Wenbin Lei
Xiaomao Fan
18
0
0
02 Sep 2024
RI-MAE: Rotation-Invariant Masked AutoEncoders for Self-Supervised Point Cloud Representation Learning
Kunming Su
Qiuxia Wu
Panpan Cai
Xiaogang Zhu
Xuequan Lu
Zhiyong Wang
Kun Hu
3DPC
27
2
0
31 Aug 2024
Hierarchical Network Fusion for Multi-Modal Electron Micrograph Representation Learning with Foundational Large Language Models
Sakhinana Sagar Srinivas
Geethan Sannidhi
Venkataramana Runkana
30
0
0
24 Aug 2024
Positional Prompt Tuning for Efficient 3D Representation Learning
Shaochen Zhang
Zekun Qi
Runpei Dong
Xiuxiu Bai
Xing Wei
37
4
0
21 Aug 2024
MCPDepth: Omnidirectional Depth Estimation via Stereo Matching from Multi-Cylindrical Panoramas
Feng Qiao
Zhexiao Xiong
Xinge Zhu
Yuexin Ma
Qiumeng He
Nathan Jacobs
MDE
16
1
0
03 Aug 2024
Rethinking Attention Module Design for Point Cloud Analysis
Chengzhi Wu
Kaige Wang
Zeyun Zhong
Hao Fu
Junwei Zheng
Jiaming Zhang
Julius Pfrommer
Jürgen Beyerer
3DPC
44
1
0
27 Jul 2024
Transformer-based Single-Cell Language Model: A Survey
Wei Lan
Guohang He
Mingyang Liu
Qingfeng Chen
Junyue Cao
Wei Peng
MedIm
LRM
20
7
0
18 Jul 2024
Translatotron-V(ison): An End-to-End Model for In-Image Machine Translation
Zhibin Lan
Liqiang Niu
Fandong Meng
Jie Zhou
Min Zhang
Jinsong Su
VLM
23
5
0
03 Jul 2024
PNeRV: A Polynomial Neural Representation for Videos
Sonam Gupta
S. Tomar
Grigorios G. Chrysos
Sukhendu Das
A. N. Rajagopalan
38
0
0
27 Jun 2024
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
34
2
0
22 May 2024
From CNNs to Transformers in Multimodal Human Action Recognition: A Survey
Muhammad Bilal Shaikh
Syed Mohammed Shamsul Islam
Douglas Chai
Naveed Akhtar
30
9
0
22 May 2024
Pseudo Channel: Time Embedding for Motor Imagery Decoding
Zhengqing Miao
Meirong Zhao
16
1
0
21 May 2024
Semantically Consistent Video Inpainting with Conditional Diffusion Models
Dylan Green
William Harvey
Saeid Naderiparizi
Matthew Niedoba
Yunpeng Liu
...
Vasileios Lioutas
Setareh Dabiri
Adam Scibior
Berend Zwartsenberg
Frank D. Wood
DiffM
23
1
0
30 Apr 2024
Utilizing Large Language Models for Information Extraction from Real Estate Transactions
Yu Zhao
Haoxiang Gao
AILaw
40
9
0
28 Apr 2024
NeurIT: Pushing the Limit of Neural Inertial Tracking for Indoor Robotic IoT
Xinzhe Zheng
Sijie Ji
Yipeng Pan
Kaiwen Zhang
Chenshu Wu
19
1
0
13 Apr 2024
OmniSat: Self-Supervised Modality Fusion for Earth Observation
Guillaume Astruc
Nicolas Gonthier
Clement Mallet
Loic Landrieu
28
24
0
12 Apr 2024
HSViT: Horizontally Scalable Vision Transformer
Chenhao Xu
Chang-Tsun Li
Chee Peng Lim
Douglas Creighton
ViT
24
1
0
08 Apr 2024
Equipping Sketch Patches with Context-Aware Positional Encoding for Graphic Sketch Representation
Sicong Zang
Zhijun Fang
34
0
0
26 Mar 2024
KeyPoint Relative Position Encoding for Face Recognition
Minchul Kim
Yiyang Su
Feng Liu
Anil Jain
Xiaoming Liu
CVBM
32
7
0
21 Mar 2024
Rotary Position Embedding for Vision Transformer
Byeongho Heo
Song Park
Dongyoon Han
Sangdoo Yun
29
33
0
20 Mar 2024
Quantum Mixed-State Self-Attention Network
Fu Chen
Qinglin Zhao
Li Feng
Chuangtao Chen
Yangbin Lin
Jianhong Lin
34
5
0
05 Mar 2024
Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology
Wenhao Tang
Fengtao Zhou
Shengyue Huang
Xiang Zhu
Yi Zhang
Bo Liu
27
20
0
27 Feb 2024
Swin3D++: Effective Multi-Source Pretraining for 3D Indoor Scene Understanding
Yu-Qi Yang
Yufeng Guo
Yang Liu
3DPC
35
2
0
22 Feb 2024
Locality-Sensitive Hashing-Based Efficient Point Transformer with Applications in High-Energy Physics
Siqi Miao
Zhiyuan Lu
Mia Liu
Javier Duarte
Pan Li
34
4
0
19 Feb 2024
Exploring the Synergies of Hybrid CNNs and ViTs Architectures for Computer Vision: A survey
Haruna Yunusa
Shiyin Qin
Abdulrahman Hamman Adama Chukkol
Abdulganiyu Abdu Yusuf
Isah Bello
A. Lawan
ViT
17
12
0
05 Feb 2024
Towards Visual Syntactical Understanding
Sayeed Shafayet Chowdhury
Soumyadeep Chandra
Kaushik Roy
NAI
14
0
0
30 Jan 2024
MsSVT++: Mixed-scale Sparse Voxel Transformer with Center Voting for 3D Object Detection
Jianan Li
Shaocong Dong
Lihe Ding
Tingfa Xu
3DPC
19
7
0
22 Jan 2024
SymTC: A Symbiotic Transformer-CNN Net for Instance Segmentation of Lumbar Spine MRI
Jiasong Chen
Linchen Qian
Linhai Ma
Timur Urakov
Weiyong Gu
Liang Liang
MedIm
29
4
0
17 Jan 2024
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Hongye Jin
Xiaotian Han
Jingfeng Yang
Zhimeng Jiang
Zirui Liu
Chia-Yuan Chang
Huiyuan Chen
Xia Hu
20
100
0
02 Jan 2024
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
18
0
0
01 Dec 2023
Categorical Traffic Transformer: Interpretable and Diverse Behavior Prediction with Tokenized Latent
Yuxiao Chen
Sander Tonkens
Marco Pavone
25
9
0
30 Nov 2023
Typhoon Intensity Prediction with Vision Transformer
Huanxin Chen
Pengshuai Yin
Huichou Huang
Qingyao Wu
Ruirui Liu
Xiatian Zhu
17
0
0
28 Nov 2023
Predicting Gradient is Better: Exploring Self-Supervised Learning for SAR ATR with a Joint-Embedding Predictive Architecture
Wei-Jang Li
Yang Wei
Tianpeng Liu
Yuenan Hou
Yuxuan Li
Zhen Liu
Yongxiang Liu
Li Liu
19
17
0
26 Nov 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
25
4
0
21 Nov 2023