2311.12919
SPOT! Revisiting Video-Language Models for Event Understanding
21 November 2023
Gengyuan Zhang
Jinhe Bi
Jindong Gu
Yanyu Chen
Volker Tresp
Papers citing
"SPOT! Revisiting Video-Language Models for Event Understanding"
4 / 4 papers shown
ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos
Zhou Yu
Lixiang Zheng
Zhou Zhao
A. Fedoseev
Jianping Fan
Kui Ren
Jun Yu
CoGe
27
13
0
04 May 2023
Test of Time: Instilling Video-Language Models with a Sense of Time
Piyush Bagad
Makarand Tapaswi
Cees G. M. Snoek
70
36
0
05 Jan 2023
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
303
771
0
18 Apr 2021
Counterfactual Samples Synthesizing for Robust Visual Question Answering
Long Chen
Xin Yan
Jun Xiao
Hanwang Zhang
Shiliang Pu
Yueting Zhuang
OOD
AAML
132
287
0
14 Mar 2020