
Domain Adaptation of VLM for Soccer Video Understanding

20 May 2025
Tiancheng Jiang
Henry Wang
Md Sirajus Salekin
Parmida Atighehchian
Shinan Zhang
Main: 8 pages · Bibliography: 3 pages · 6 figures · 3 tables
Abstract

Vision Language Models (VLMs) have demonstrated strong performance in multi-modal tasks by effectively aligning visual and textual representations. However, most video-understanding VLM research has been domain-agnostic, leaving their transfer-learning capability to specialized domains under-explored. In this work, we address this gap by exploring the adaptability of open-source VLMs to specific domains, focusing on soccer as an initial case study. Our approach uses large-scale soccer datasets and an LLM to create instruction-following data, which we then use to iteratively fine-tune a general-domain VLM in a curriculum learning fashion (first teaching the model key soccer concepts, then moving on to question-answering tasks). The final adapted model, trained on a curated dataset of 20k video clips, exhibits significant improvement on soccer-specific tasks compared to the base model, with a 37.5% relative improvement on the visual question-answering task and an accuracy improvement from 11.8% to 63.5% on the downstream soccer action classification task.
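
The curriculum adaptation described in the abstract amounts to fine-tuning the same VLM in successive stages on progressively harder instruction data (domain concepts first, question answering second). The sketch below is a minimal illustration of that two-stage structure in PyTorch, not the authors' implementation: the dataset wrapper, hyperparameters, and the assumption of an HF-style forward pass that returns a .loss are all placeholders.

import torch
from torch.utils.data import DataLoader, Dataset

class InstructionDataset(Dataset):
    """Wraps pre-tokenized examples (dicts of tensors) for one curriculum stage."""
    def __init__(self, examples):
        self.examples = examples
    def __len__(self):
        return len(self.examples)
    def __getitem__(self, idx):
        return self.examples[idx]

def finetune_stage(model, dataset, epochs=1, lr=1e-5, device="cuda"):
    """Run one curriculum stage and return the updated model."""
    loader = DataLoader(dataset, batch_size=4, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train().to(device)
    for _ in range(epochs):
        for batch in loader:
            batch = {k: v.to(device) for k, v in batch.items()}
            loss = model(**batch).loss  # assumes an HF-style forward that returns .loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return model

# Stage 1: soccer-concept instruction data; Stage 2: visual question-answering data.
# `concept_examples` / `qa_examples` stand in for the LLM-generated instruction data.
# model = ...  # an open-source video VLM loaded through its own API
# model = finetune_stage(model, InstructionDataset(concept_examples), lr=1e-5)
# model = finetune_stage(model, InstructionDataset(qa_examples), lr=5e-6)

Running the second stage from the checkpoint produced by the first is what gives the curriculum its "concepts before questions" ordering; in practice the two stages would differ mainly in the data they consume.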

@article{jiang2025_2505.13860,
  title={Domain Adaptation of VLM for Soccer Video Understanding},
  author={Tiancheng Jiang and Henry Wang and Md Sirajus Salekin and Parmida Atighehchian and Shinan Zhang},
  journal={arXiv preprint arXiv:2505.13860},
  year={2025}
}