Enhancing Self-Supervised Fine-Grained Video Object Tracking with Dynamic Memory Prediction

30 April 2025
Zihan Zhou
Changrui Dai
Aibo Song
Xiaolin Fang
    VOS
Abstract

Successful video analysis relies on accurately recognizing pixels across frames, and frame reconstruction methods based on video correspondence learning are popular because of their efficiency. Existing frame reconstruction methods, while efficient, overlook the value of directly involving multiple reference frames in reconstruction and decision-making, especially in complex situations such as occlusion or fast movement. In this paper, we introduce a Dynamic Memory Prediction (DMP) framework that uses multiple reference frames directly to enhance frame reconstruction. Its core component is a Reference Frame Memory Engine that dynamically selects frames based on object pixel features to improve tracking accuracy. In addition, a Bidirectional Target Prediction Network uses multiple reference frames to improve the robustness of the model. In experiments, our algorithm outperforms state-of-the-art self-supervised techniques on two fine-grained video object tracking tasks: object segmentation and keypoint tracking.
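
The abstract does not give implementation details, so the following is only a generic sketch of the two ideas it names: choosing reference frames from a memory by feature similarity, and reconstructing a target frame's labels from several reference frames through a softmax affinity. It is not the authors' DMP code. The function names (`select_reference_frames`, `propagate_labels`), the mean-feature cosine selection rule, and the `temperature` value are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def select_reference_frames(query_feat, memory_feats, k):
    """Return indices of the k memory frames most similar to the query.

    Hypothetical rule: cosine similarity between mean pixel features.
    query_feat: (N, C) array; memory_feats: list of (N, C) arrays.
    """
    q = query_feat.mean(axis=0)
    q = q / np.linalg.norm(q)
    scores = []
    for feat in memory_feats:
        m = feat.mean(axis=0)
        scores.append(float(q @ (m / np.linalg.norm(m))))
    top = np.argsort(scores)[::-1][:k]
    return sorted(top.tolist())  # keep temporal order


def propagate_labels(query_feat, ref_feats, ref_labels, temperature=0.07):
    """Reconstruct per-pixel labels for the query frame from several references.

    query_feat: (N, C); ref_feats: list of (N, C); ref_labels: list of (N, K).
    Each query pixel attends over all pixels of all reference frames at once.
    """
    keys = np.concatenate(ref_feats, axis=0)      # (M*N, C)
    values = np.concatenate(ref_labels, axis=0)   # (M*N, K)
    affinity = softmax(query_feat @ keys.T / temperature, axis=1)  # (N, M*N)
    return affinity @ values                      # (N, K)
```

Because every row of the affinity matrix sums to one, one-hot reference labels yield a per-pixel class distribution for the target frame. Concatenating the reference frames before the softmax lets a pixel that is occluded in one reference still find its match in another.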

@article{zhou2025_2504.21692,
  title={Enhancing Self-Supervised Fine-Grained Video Object Tracking with Dynamic Memory Prediction},
  author={Zihan Zhou and Changrui Dai and Aibo Song and Xiaolin Fang},
  journal={arXiv preprint arXiv:2504.21692},
  year={2025}
}