2307.03153
MultiVENT: Multilingual Videos of Events with Aligned Natural Text
6 July 2023
Kate Sanders
David Etter
Reno Kriz
Benjamin Van Durme
VGen
Papers citing "MultiVENT: Multilingual Videos of Events with Aligned Natural Text"
12 / 12 papers shown
Title
Bonsai: Interpretable Tree-Adaptive Grounded Reasoning
Kate Sanders
Benjamin Van Durme
LRM
34
1
0
04 Apr 2025
WikiVideo: Article Generation from Multiple Videos
Alexander Martin
Reno Kriz
William Walden
Kate Sanders
Hannah Recknor
Eugene Yang
Francis Ferraro
Benjamin Van Durme
DiffM
VGen
42
1
0
01 Apr 2025
MMMORRF: Multimodal Multilingual Modularized Reciprocal Rank Fusion
Saron Samuel
Dan DeGenaro
Jimena Guallar-Blasco
Kate Sanders
Oluwaseun Eisape
...
David Etter
Efsun Kayi
Matthew Wiesner
Kenton W. Murray
Reno Kriz
83
0
0
26 Mar 2025
MultiVENT 2.0: A Massive Multilingual Benchmark for Event-Centric Video Retrieval
Reno Kriz
Kate Sanders
David Etter
Kenton W. Murray
Cameron Carpenter
...
Alexander Martin
Ronald Colaianni
Nolan King
Eugene Yang
Benjamin Van Durme
VGen
22
2
0
15 Oct 2024
MuMA-ToM: Multi-modal Multi-Agent Theory of Mind
Haojun Shi
Suyu Ye
Xinyu Fang
Chuanyang Jin
Leyla Isik
Yen-Ling Kuo
Tianmin Shu
LLMAG
48
7
0
22 Aug 2024
A Survey of Video Datasets for Grounded Event Understanding
Kate Sanders
Benjamin Van Durme
29
4
0
14 Jun 2024
MMToM-QA: Multimodal Theory of Mind Question Answering
Chuanyang Jin
Yutong Wu
Jing Cao
Jiannan Xiang
Yen-Ling Kuo
Zhiting Hu
T. Ullman
Antonio Torralba
Joshua B. Tenenbaum
Tianmin Shu
25
32
0
16 Jan 2024
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval
Andrew Rouditchenko
Yung-Sung Chuang
Nina Shvetsova
Samuel Thomas
Rogerio Feris
Brian Kingsbury
Leonid Karlinsky
David F. Harwath
Hilde Kuehne
James R. Glass
VLM
18
4
0
07 Oct 2022
Ambiguous Images With Human Judgments for Robust Visual Event Classification
Kate Sanders
Reno Kriz
Anqi Liu
Benjamin Van Durme
55
12
0
06 Oct 2022
A CLIP-Hitchhiker's Guide to Long Video Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
CLIP
113
60
0
17 May 2022
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
278
1,939
0
09 Feb 2021
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Mohit Bansal
106
268
0
24 Jan 2020