Few-Shot Referring Video Single- and Multi-Object Segmentation via Cross-Modal Affinity with Instance Sequence Matching

18 April 2025
Heng Liu
Guanghui Li
Mingqi Gao
Xiantong Zhen
Feng Zheng
Yang Wang
    VOS
Abstract

Referring video object segmentation (RVOS) aims to segment objects in videos guided by natural language descriptions. We propose FS-RVOS, a Transformer-based model with two key components: a cross-modal affinity module and an instance sequence matching strategy, which extends FS-RVOS to multi-object segmentation (FS-RVMOS). Experiments show FS-RVOS and FS-RVMOS outperform state-of-the-art methods across diverse benchmarks, demonstrating superior robustness and accuracy.

@article{liu2025_2504.13710,
  title={Few-Shot Referring Video Single- and Multi-Object Segmentation via Cross-Modal Affinity with Instance Sequence Matching},
  author={Heng Liu and Guanghui Li and Mingqi Gao and Xiantong Zhen and Feng Zheng and Yang Wang},
  journal={arXiv preprint arXiv:2504.13710},
  year={2025}
}