2307.03373
All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment
7 July 2023
Chunhui Zhang
Xin Sun
Li Liu
Yiqian Yang
Qiong Liu
Xiaoping Zhou
Yanfeng Wang
Papers citing
"All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment"
16 / 16 papers shown
COST: Contrastive One-Stage Transformer for Vision-Language Small Object Tracking
Chunhui Zhang
Li Liu
Jialin Gao
Xin Sun
Hao Wen
Xi Zhou
Shiming Ge
Y. Wang
28
0
0
02 Apr 2025
Towards General Multimodal Visual Tracking
Andong Lu
Mai Wen
Jinhu Wang
Yuanzhi Guo
Chenglong Li
Jin Tang
Bin Luo
33
0
0
14 Mar 2025
Enhancing Vision-Language Tracking by Effectively Converting Textual Cues into Visual Cues
X. Feng
D. Zhang
Shuyan Hu
X. Li
M. Wu
Jie Zhang
Xiaojing Chen
K. Huang
28
0
0
27 Dec 2024
MambaTrack: Exploiting Dual-Enhancement for Night UAV Tracking
Chunhui Zhang
Li Liu
Hao-Kai Wen
Xi Zhou
Y. Wang
Mamba
84
2
0
24 Nov 2024
Underwater Camouflaged Object Tracking Meets Vision-Language SAM2
Chunhui Zhang
Li Liu
Guanjie Huang
Hao-Kai Wen
Xi Zhou
Shiming Ge
Y. Wang
20
0
0
25 Sep 2024
Autogenic Language Embedding for Coherent Point Tracking
Zikai Song
Ying Tang
Run Luo
Lintao Ma
Junqing Yu
Yi-Ping Phoebe Chen
Wei Yang
29
3
0
30 Jul 2024
WebUOT-1M: Advancing Deep Underwater Object Tracking with A Million-Scale Benchmark
Chunhui Zhang
Li Liu
Guanjie Huang
Hao-Kai Wen
Xi Zhou
Yanfeng Wang
32
1
0
30 May 2024
MEDBind: Unifying Language and Multimodal Medical Data Embeddings
Yuan Gao
Sangwook Kim
David E Austin
Chris McIntosh
18
2
0
19 Mar 2024
Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking
Jiawei Ge
Xiangmei Chen
Jiuxin Cao
Xueling Zhu
Bo Liu
VLM
10
2
0
28 Nov 2023
Single-Model and Any-Modality for Video Object Tracking
Zongwei Wu
Jilai Zheng
Xiangxuan Ren
Florin-Alexandru Vasluianu
Chao Ma
D. Paudel
Luc Van Gool
Radu Timofte
38
22
0
27 Nov 2023
WebUAV-3M: A Benchmark for Unveiling the Power of Million-Scale Deep UAV Tracking
Chunhui Zhang
Guanjie Huang
Li Liu
Shan Huang
Yinan Yang
Xiang Wan
Shiming Ge
Dacheng Tao
16
12
0
19 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
255
5,353
0
11 Nov 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
229
499
0
22 Apr 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,538
0
24 Feb 2021
Siamese Box Adaptive Network for Visual Tracking
Zedu Chen
Bineng Zhong
Guorong Li
Shengping Zhang
Rongrong Ji
81
580
0
15 Mar 2020
TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild
Matthias Muller
Adel Bibi
Silvio Giancola
Salman Al-Subaihi
Bernard Ghanem
192
676
0
28 Mar 2018