Domain Adaptation of VLM for Soccer Video Understanding

20 May 2025
Tiancheng Jiang
Henry Wang
Md Sirajus Salekin
Parmida Atighehchian
Shinan Zhang
    VLM

Papers citing "Domain Adaptation of VLM for Soccer Video Understanding"

30 / 30 papers shown
Multilingual Vision-Language Pre-training for the Remote Sensing Domain
João Daniel Silva
João Magalhães
D. Tuia
Bruno Martins
CLIP VLM
74
2
0
30 Oct 2024
MatchTime: Towards Automatic Soccer Game Commentary Generation
Jiayuan Rao
Haoning Wu
Chang-rui Liu
Yanfeng Wang
Weidi Xie
88
8
0
26 Jun 2024
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding
Xinyu Fang
Kangrui Mao
Haodong Duan
Xiangyu Zhao
Yining Li
Dahua Lin
Kai Chen
VLM
110
83
0
20 Jun 2024
An Introduction to Vision-Language Modeling
Florian Bordes
Richard Yuanzhe Pang
Anurag Ajay
Alexander C. Li
Adrien Bardes
...
Karen Ullrich
Aishwarya Agrawal
Kate Saenko
Asli Celikyilmaz
Vikas Chandra
VLM
114
95
0
27 May 2024
SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset
Sushant Gautam
Mehdi Houshmand Sarkhoosh
Jan Held
Cise Midoglu
A. Cioppa
Silvio Giancola
Vajira Thambawita
Michael A. Riegler
Pål Halvorsen
Mubarak Shah
75
6
0
12 May 2024
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens
Kirolos Ataallah
Xiaoqian Shen
Eslam Abdelrahman
Essam Sleiman
Deyao Zhu
Jian Ding
Mohamed Elhoseiny
VLM
99
79
0
04 Apr 2024
SportQA: A Benchmark for Sports Understanding in Large Language Models
Haotian Xia
Zhengbang Yang
Yuqing Wang
Rhys Tracy
Yun Zhao
Dongdong Huang
Zezhi Chen
Yan Zhu
Yuan-fang Wang
Weining Shen
78
10
0
24 Feb 2024
Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex and Professional Sports
Haopeng Li
Andong Deng
Qiuhong Ke
Jun Liu
Hossein Rahmani
Yulan Guo
Mohammed Bennamoun
Chen Chen
184
17
0
03 Jan 2024
VILA: On Pre-training for Visual Language Models
Ji Lin
Hongxu Yin
Ming-Yu Liu
Yao Lu
Pavlo Molchanov
Andrew Tao
Huizi Mao
Jan Kautz
Mohammad Shoeybi
Song Han
MLLM VLM
128
429
0
12 Dec 2023
How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation
Zhongyi Han
Guanglin Zhou
Rundong He
Jindong Wang
Tailin Wu
Yilong Yin
Salman Khan
Lina Yao
Tongliang Liu
Kun Zhang
VLM OOD
56
19
0
12 Dec 2023
VTimeLLM: Empower LLM to Grasp Video Moments
Bin Huang
Xin Wang
Hong Chen
Zihan Song
Wenwu Zhu
MLLM
154
132
0
30 Nov 2023
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Bin Lin
Yang Ye
Bin Zhu
Jiaxi Cui
Munan Ning
Peng Jin
Li-ming Yuan
VLM MLLM
371
711
0
16 Nov 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Xi Chen
...
Ted Xiao
Peng Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro LRM
232
1,293
0
28 Jul 2023
Med-Flamingo: a Multimodal Medical Few-shot Learner
Michael Moor
Qian Huang
Shirley Wu
Michihiro Yasunaga
C. Zakka
Yashodhara Dalmia
E. Reis
Pranav Rajpurkar
J. Leskovec
LM&MA MedIm
91
272
0
27 Jul 2023
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models
Muhammad Maaz
H. Rasheed
Salman Khan
Fahad Shahbaz Khan
MLLM
148
661
0
08 Jun 2023
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Hang Zhang
Xin Li
Lidong Bing
MLLM
209
1,067
0
05 Jun 2023
LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day
Chunyuan Li
Cliff Wong
Sheng Zhang
Naoto Usuyama
Haotian Liu
Jianwei Yang
Tristan Naumann
Hoifung Poon
Jianfeng Gao
LM&MA MedIm
144
801
0
01 Jun 2023
VARS: Video Assistant Referee System for Automated Soccer Decision Making from Multiple Views
Jan Held
A. Cioppa
Silvio Giancola
Abdullah Hamdi
Guohao Li
Marc Van Droogenbroeck
78
32
0
10 Apr 2023
SoccerNet-Caption: Dense Video Captioning for Soccer Broadcasts Commentaries
Hassan Mkhallati
A. Cioppa
Silvio Giancola
Guohao Li
Marc Van Droogenbroeck
77
34
0
10 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM MLLM
447
4,668
0
30 Jan 2023
SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos
A. Cioppa
Silvio Giancola
Adrien Deliège
Le Kang
Xin Zhou
Zhiyu Cheng
Guohao Li
Marc Van Droogenbroeck
90
79
0
14 Apr 2022
Semi-Supervised Training to Improve Player and Ball Detection in Soccer
Renaud Vandeghen
A. Cioppa
Marc Van Droogenbroeck
91
27
0
14 Apr 2022
Feature Combination Meets Attention: Baidu Soccer Embeddings and Transformer based Temporal Detection
Xin Zhou
Le Kang
Zhiyu Cheng
Bo He
Jingyu Xin
85
34
0
28 Jun 2021
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL AI4TS AI4CE ALM AIMat
678
10,631
0
17 Jun 2021
MERLOT: Multimodal Neural Script Knowledge Models
Rowan Zellers
Ximing Lu
Jack Hessel
Youngjae Yu
J. S. Park
Jize Cao
Ali Farhadi
Yejin Choi
VLM LRM
104
384
0
04 Jun 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP VLM
1.1K
30,053
0
26 Feb 2021
SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos
Adrien Deliège
A. Cioppa
Silvio Giancola
M. J. Seikavandi
J. Dueholm
Kamal Nasrollahi
Guohao Li
T. Moeslund
Marc Van Droogenbroeck
93
154
0
26 Nov 2020
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
179
1,972
0
09 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL VLM
305
3,714
0
06 Aug 2019
SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos
Silvio Giancola
Mohieddine Amine
Tarek Dghaily
Guohao Li
AI4TS
123
198
0
12 Apr 2018