MultiVENT 2.0: A Massive Multilingual Benchmark for Event-Centric Video Retrieval

15 October 2024
Reno Kriz
Kate Sanders
David Etter
Kenton W. Murray
Cameron Carpenter
Kelly Van Ochten
Hannah Recknor
Jimena Guallar-Blasco
Alexander Martin
Ronald Colaianni
Nolan King
Eugene Yang
Benjamin Van Durme
Abstract

Efficiently retrieving and synthesizing information from large-scale multimodal collections has become a critical challenge. However, existing video retrieval datasets suffer from scope limitations, primarily focusing on matching descriptive but vague queries with small collections of professionally edited, English-centric videos. To address this gap, we introduce MultiVENT 2.0, a large-scale, multilingual, event-centric video retrieval benchmark featuring a collection of more than 218,000 news videos and 3,906 queries targeting specific world events. These queries target information found in the visual content, audio, embedded text, and text metadata of the videos, requiring systems to leverage all of these sources to succeed at the task. Preliminary results show that state-of-the-art vision-language models struggle significantly with this task, and while alternative approaches show promise, they are still insufficient to address the problem adequately. These findings underscore the need for more robust multimodal retrieval systems, as effective video retrieval is a crucial step towards multimodal content understanding and generation.
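
The abstract states that queries can only be answered by drawing on a video's visual content, audio, embedded (OCR) text, and text metadata. As a minimal, hypothetical illustration (not the paper's method), the sketch below shows one common way a retrieval system could combine such sources: late fusion of per-modality cosine similarities in a shared embedding space. The modality names, fusion weights, and helper functions are assumptions for illustration only.

# Hypothetical sketch: late fusion of per-modality similarity scores for
# event-centric video retrieval. Assumes each video has precomputed
# embeddings for its visual frames, audio transcript, OCR'd embedded text,
# and metadata, all in the same space as the query embedding.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def fused_score(query_emb: np.ndarray,
                video_embs: dict[str, np.ndarray],
                weights: dict[str, float]) -> float:
    """Weighted sum of per-modality cosine similarities.
    Modalities missing from a video contribute nothing to the score."""
    score = 0.0
    for modality, w in weights.items():
        if modality in video_embs:
            score += w * cosine(query_emb, video_embs[modality])
    return score

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 512
    query = rng.standard_normal(dim)

    # Toy collection: two videos with embeddings per modality (illustrative only).
    collection = {
        "video_a": {m: rng.standard_normal(dim)
                    for m in ("visual", "audio", "ocr", "metadata")},
        "video_b": {m: rng.standard_normal(dim) for m in ("visual", "metadata")},
    }
    weights = {"visual": 0.4, "audio": 0.2, "ocr": 0.2, "metadata": 0.2}

    # Rank videos by fused similarity to the query, highest first.
    ranked = sorted(collection,
                    key=lambda vid: fused_score(query, collection[vid], weights),
                    reverse=True)
    print(ranked)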

View on arXiv
@article{kriz2025_2410.11619,
  title={MultiVENT 2.0: A Massive Multilingual Benchmark for Event-Centric Video Retrieval},
  author={Reno Kriz and Kate Sanders and David Etter and Kenton Murray and Cameron Carpenter and Kelly Van Ochten and Hannah Recknor and Jimena Guallar-Blasco and Alexander Martin and Ronald Colaianni and Nolan King and Eugene Yang and Benjamin Van Durme},
  journal={arXiv preprint arXiv:2410.11619},
  year={2025}
}