Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark

Computer Vision and Pattern Recognition (CVPR), 2021
31 March 2021
Tianlin Li
Xiujun Shu
Zhipeng Zhang
Bo Jiang
Yaowei Wang
Yonghong Tian
Feng Wu

Papers citing "Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark"

50 / 89 papers shown
UniSOT: A Unified Framework for Multi-Modality Single Object Tracking
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Yinchao Ma
Yuyang Tang
Wenfei Yang
Tianzhu Zhang
Xu Zhou
Feng Wu
220
1
0
03 Nov 2025
PlanarTrack: A high-quality and challenging benchmark for large-scale planar object tracking
Computer Vision and Image Understanding (CVIU), 2025
Yifan Jiao
Xinran Liu
Xiaoqiong Liu
Xiaohui Yuan
Heng Fan
Libo Zhang
89
0
0
27 Oct 2025
SAM 2++: Tracking Anything at Any Granularity
J. Zhang
C. Liang
Yichun Yang
Chenkai Zeng
Yutao Cui
Xinwen Zhang
Xin Zhou
Kai Ma
Gangshan Wu
Limin Wang
218
0
0
21 Oct 2025
From Seeing to Predicting: A Vision-Language Framework for Trajectory Forecasting and Controlled Video Generation
Fan Yang
Z. Chen
Yousong Zhu
Xin Li
Jinqiao Wang
VGen
98
0
0
01 Oct 2025
Omni Survey for Multimodality Analysis in Visual Object Tracking
Zhangyong Tang
Tianyang Xu
Xuefeng Zhu
Hui Li
Shaochuan Zhao
Tao Zhou
Chunyang Cheng
Xiaojun Wu
Josef Kittler
176
2
0
18 Aug 2025
Multi-State Tracker: Enhancing Efficient Object Tracking via Multi-State Specialization and Interaction
Shilei Wang
Gong Cheng
Pujian Lai
Dong Gao
Junwei Han
78
0
0
15 Aug 2025
SOI is the Root of All Evil: Quantifying and Breaking Similar Object Interference in Single Object Tracking
Yipei Wang
Shiyu Hu
Shukun Jia
Panxi Xu
Hongfei Ma
Yiping Ma
Jing Zhang
Xiaobo Lu
Xin Zhao
129
0
0
13 Aug 2025
ReasoningTrack: Chain-of-Thought Reasoning for Long-term Vision-Language Tracking
Xiao Wang
Liye Jin
Xufeng Lou
Shiao Wang
Lan Chen
Bo Jiang
Zhipeng Zhang
LRM
116
2
0
07 Aug 2025
Decoupled Spatio-Temporal Consistency Learning for Self-Supervised Tracking
AAAI Conference on Artificial Intelligence (AAAI), 2025
Yaozong Zheng
Bineng Zhong
Qihua Liang
Ning Li
Shuxiang Song
250
23
0
29 Jul 2025
Towards Universal Modal Tracking with Online Dense Temporal Token Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Yaozong Zheng
Bineng Zhong
Qihua Liang
Shengping Zhang
Guorong Li
Xianxian Li
Rongrong Ji
169
20
0
27 Jul 2025
Region-based Cluster Discrimination for Visual Representation Learning
Yin Xie
Kaicheng Yang
Xiang An
Kun Wu
Yongle Zhao
...
Yumeng Wang
Ziyong Feng
Roy Miles
Ismail Elezi
Jiankang Deng
ObjD, VLM
188
2
0
26 Jul 2025
ATCTrack: Aligning Target-Context Cues with Dynamic Target States for Robust Vision-Language Tracking
X. Feng
Shuyan Hu
X. Li
D. Zhang
M. Wu
Jie Zhang
Xiaosha Chen
K. Huang
178
3
0
26 Jul 2025
Explicit Context Reasoning with Supervision for Visual Tracking
Fansheng Zeng
Bineng Zhong
Haiying Xia
Yufei Tan
Xiantao Hu
Liangtao Shi
Shuxiang Song
153
2
0
22 Jul 2025
MVTD: A Benchmark Dataset for Maritime Visual Object Tracking
A. B. Bakht
Muhayy ud Din
Sajid Javed
Irfan Hussain
180
0
0
03 Jun 2025
Progressive Scaling Visual Object Tracking
Jack Hong
Shilin Yan
Zehao Xiao
Jiayin Cai
Xiaolong Jiang
Yao Hu
Henghui Ding
311
1
0
26 May 2025
CSTrack: Enhancing RGB-X Tracking via Compact Spatiotemporal Features
X. Feng
D. Zhang
Shuyan Hu
X. Li
M. Wu
Jie Zhang
Xiaosha Chen
Kexin Huang
189
4
0
26 May 2025
Efficient Motion Prompt Learning for Robust Visual Tracking
Jie Zhao
Xin Chen
Yongsheng Yuan
Michael Felsberg
Dong Wang
Huchuan Lu
174
1
0
22 May 2025
Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach
Shiao Wang
Xiao Wang
Liye Jin
Bo Jiang
Lin Zhu
Lan Chen
Yonghong Tian
Bin Luo
262
1
0
19 May 2025
Diff-MM: Exploring Pre-trained Text-to-Image Generation Model for Unified Multi-modal Object Tracking
Shiyu Xuan
Zechao Li
Jinhui Tang
305
1
0
19 May 2025
COST: Contrastive One-Stage Transformer for Vision-Language Small Object Tracking
Information Fusion (Inf. Fusion), 2025
Chunhui Zhang
Li Liu
Jialin Gao
Xin Sun
Hao Wen
Xi Zhou
Shiming Ge
Yucheng Wang
276
4
0
02 Apr 2025
SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking
Computer Vision and Pattern Recognition (CVPR), 2025
Wenrui Cai
Qingjie Liu
Longji Xu
MoE
366
3
0
24 Mar 2025
Towards General Multimodal Visual Tracking
Andong Lu
Mai Wen
Jinhu Wang
Yuanzhi Guo
Chenglong Li
Jin Tang
Bin Luo
190
1
0
14 Mar 2025
MITracker: Multi-View Integration for Visual Object Tracking
Computer Vision and Pattern Recognition (CVPR), 2025
Mengjie Xu
Yitao Zhu
Haotian Jiang
Jiaming Li
Zhenrong Shen
...
Haolin Huang
Xinyu Wang
Qing Yang
H. Zhang
Qian Wang
268
2
0
27 Feb 2025
Enhancing Vision-Language Tracking by Effectively Converting Textual Cues into Visual Cues
X. Feng
D. Zhang
Shuyan Hu
X. Li
M. Wu
Jie Zhang
Xiaojing Chen
K. Huang
253
5
0
27 Dec 2024
Exploring Enhanced Contextual Information for Video-Level Object Tracking
AAAI Conference on Artificial Intelligence (AAAI), 2024
Ben Kang
Xin Chen
Simiao Lai
Yang Liu
Y. Liu
Dong Wang
Mamba
312
26
0
15 Dec 2024
GSOT3D: Towards Generic 3D Single Object Tracking in the Wild
Yifan Jiao
Yunhao Li
Junhua Ding
Q. Yang
Song Fu
Heng Fan
Libo Zhang
3DPC
245
0
0
03 Dec 2024
How Texts Help? A Fine-grained Evaluation to Reveal the Role of Language in Vision-Language Tracking
Xuchen Li
Shiyu Hu
Xiaokun Feng
Dailing Zhang
Meiqi Wu
Jing Zhang
Kaiqi Huang
373
9
0
23 Nov 2024
NT-VOT211: A Large-Scale Benchmark for Night-time Visual Object Tracking
Asian Conference on Computer Vision (ACCV), 2024
Yu Liu
Arif Mahmood
Muhammad Haris Khan
216
5
0
27 Oct 2024
Depth Attention for Robust RGB Tracking
Asian Conference on Computer Vision (ACCV), 2024
Yu Liu
Arif Mahmood
Muhammad Haris Khan
VOS, MDE
305
1
0
27 Oct 2024
DINTR: Tracking via Diffusion-based Interpolation
Neural Information Processing Systems (NeurIPS), 2024
Pha Nguyen
Ngan Le
J. Cothren
Alper Yilmaz
Khoa Luu
DiffM
317
3
0
14 Oct 2024
SNN-PAR: Energy Efficient Pedestrian Attribute Recognition via Spiking Neural Networks
Haiyang Wang
Qian Zhu
Mowen She
Yabo Li
Haoyu Song
Minghe Xu
Xiao Wang
ViT
174
1
0
10 Oct 2024
DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM
Xuchen Li
Shiyu Hu
Xiaokun Feng
Dailing Zhang
Meiqi Wu
Jing Zhang
Kaiqi Huang
263
13
0
03 Oct 2024
Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark
Xuchen Li
Shiyu Hu
Xiaokun Feng
Dailing Zhang
Meiqi Wu
Jing Zhang
Kaiqi Huang
VLM, MLLM
173
14
0
13 Sep 2024
Improving Network Interpretability via Explanation Consistency Evaluation
IEEE Transactions on Multimedia (IEEE TMM), 2024
Hefeng Wu
Hao Jiang
Keze Wang
Ziyi Tang
Xianghuan He
Liang Lin
FAtt, AAML
293
3
0
08 Aug 2024
Autogenic Language Embedding for Coherent Point Tracking
Zikai Song
Ying Tang
Run Luo
Lintao Ma
Junqing Yu
Yi-Ping Phoebe Chen
Wei Yang
260
10
0
30 Jul 2024
ActionVOS: Actions as Prompts for Video Object Segmentation
Liangyang Ouyang
Ruicong Liu
Yifei Huang
Ryosuke Furuta
Yoichi Sato
VOS
203
8
0
10 Jul 2024
WebUOT-1M: Advancing Deep Underwater Object Tracking with A Million-Scale Benchmark
Chunhui Zhang
Li Liu
Guanjie Huang
Hao Wen
Xi Zhou
Yanfeng Wang
326
22
0
30 May 2024
Awesome Multi-modal Object Tracking
Chunhui Zhang
Li Liu
Hao Wen
Xi Zhou
Yanfeng Wang
VOT
230
9
0
23 May 2024
DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM
Xuchen Li
Xiaokun Feng
Shiyu Hu
Meiqi Wu
Dailing Zhang
Jing Zhang
Kaiqi Huang
VLM
206
31
0
20 May 2024
Spatio-Temporal Side Tuning Pre-trained Foundation Models for Video-based Pedestrian Attribute Recognition
Tianlin Li
Qian Zhu
Jiandong Jin
Jun Zhu
Futian Wang
Bowei Jiang
Yaowei Wang
Yonghong Tian
ViT
280
9
0
27 Apr 2024
MLS-Track: Multilevel Semantic Interaction in RMOT
Zeliang Ma
Yang Song
Zhe Cui
Zhicheng Zhao
Fei Su
Delong Liu
Jingyu Wang
207
8
0
18 Apr 2024
LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks
International Conference on Learning Representations (ICLR), 2024
Jianlang Chen
Xuhong Ren
Qing Guo
Felix Juefei Xu
Di Lin
Wei Feng
Lei Ma
Jianjun Zhao
220
6
0
09 Apr 2024
RTracker: Recoverable Tracking via PN Tree Structured Memory
Yuqing Huang
Xin Li
Zikun Zhou
Yaowei Wang
Zhenyu He
Ming-Hsuan Yang
275
16
0
28 Mar 2024
Exploring Dynamic Transformer for Efficient Object Tracking
Jiawen Zhu
Xin Chen
Haiwen Diao
Shuai Li
Jun-Yan He
Chenyang Li
Bin Luo
Dong Wang
Huchuan Lu
376
12
0
26 Mar 2024
Multi-attention Associate Prediction Network for Visual Tracking
Xinglong Sun
Haijiang Sun
Shan Jiang
Jiacheng Wang
Xilai Wei
Zhonghe Hu
276
6
0
25 Mar 2024
Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers
Computer Vision and Pattern Recognition (CVPR), 2024
Jinxia Xie
Bineng Zhong
Zhiyi Mo
Shengping Zhang
Liangtao Shi
Shuxiang Song
Rongrong Ji
312
107
0
15 Mar 2024
OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning
Computer Vision and Pattern Recognition (CVPR), 2024
Lingyi Hong
Shilin Yan
Renrui Zhang
Wanyun Li
Xinyu Zhou
...
Kaixun Jiang
Yiting Chen
Jinglun Li
Zhaoyu Chen
Wenqiang Zhang
VLM
218
115
0
14 Mar 2024
Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance
European Conference on Computer Vision (ECCV), 2024
Liting Lin
Heng Fan
Zhipeng Zhang
Yaowei Wang
Yong-mei Xu
Haibin Ling
328
86
0
08 Mar 2024
VastTrack: Vast Category Visual Object Tracking
Liang Peng
Junyuan Gao
Hengrong Du
Weihong Li
Shaohua Dong
Zhipeng Zhang
Heng Fan
Libo Zhang
VLM
306
20
0
06 Mar 2024
Correlation-Embedded Transformer Tracking: A Single-Branch Framework
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Fei Xie
Wankou Yang
Chunyu Wang
Lei Chu
Yue Cao
Chao Ma
Wenjun Zeng
314
17
0
23 Jan 2024