ResearchTrend.AI

Similarity-Guided Layer-Adaptive Vision Transformer for UAV Tracking

9 March 2025
Chaocan Xue
Bineng Zhong
Qihua Liang
Yaozong Zheng
Ning Li
Yuanliang Xue
Shuxiang Song
Abstract

Vision transformers (ViTs) have emerged as a popular backbone for visual tracking. However, complete ViT architectures are too cumbersome to deploy for unmanned aerial vehicle (UAV) tracking, which places a premium on efficiency. In this study, we find that many layers within lightweight ViT-based trackers tend to learn relatively redundant and repetitive target representations. Based on this observation, we propose a similarity-guided layer adaptation approach to optimize the structure of ViTs. Our approach dynamically disables a large number of representation-similar layers and selectively retains only a single optimal layer among them, aiming to achieve a better accuracy-speed trade-off. By incorporating this approach into existing ViTs, we tailor previously complete ViT architectures into an efficient similarity-guided layer-adaptive framework, namely SGLATrack, for real-time UAV tracking. Extensive experiments on six tracking benchmarks verify the effectiveness of the proposed approach and show that our SGLATrack achieves state-of-the-art real-time speed while maintaining competitive tracking precision. Code and models are available at this https URL.
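The core idea described above, skipping layers whose representations closely resemble one another and keeping a single representative, can be sketched in a few lines. The snippet below is a minimal illustrative sketch, not the authors' implementation: it assumes each layer's target representation has been flattened into a plain vector, and `threshold` is a hypothetical similarity cutoff, not a value from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def select_layers(layer_feats, threshold=0.95):
    """Greedily keep a layer only when its representation differs
    enough (cosine < threshold) from the last kept layer; layers
    too similar to an already-kept one are disabled."""
    kept = [0]  # the first layer is always retained
    for i in range(1, len(layer_feats)):
        if cosine(layer_feats[kept[-1]], layer_feats[i]) < threshold:
            kept.append(i)
    return kept
```

For example, if layers 0 and 1 produce near-identical vectors and layers 2 and 3 produce another near-identical pair, `select_layers` keeps only layers 0 and 2. The actual method additionally chooses which layer within each redundant group is optimal; this greedy keep-the-first rule is a simplification.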

@article{xue2025_2503.06625,
  title={Similarity-Guided Layer-Adaptive Vision Transformer for UAV Tracking},
  author={Chaocan Xue and Bineng Zhong and Qihua Liang and Yaozong Zheng and Ning Li and Yuanliang Xue and Shuxiang Song},
  journal={arXiv preprint arXiv:2503.06625},
  year={2025}
}