F$^3$Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos

11 April 2025

Abstract

Analyzing Fast, Frequent, and Fine-grained (F$^3$) events presents a significant challenge in video analytics and multi-modal LLMs. Current methods struggle to identify events that satisfy all the F$^3$ criteria with high accuracy due to challenges such as motion blur and subtle visual discrepancies. To advance research in video understanding, we introduce F$^3$Set, a benchmark that consists of video datasets for precise F$^3$ event detection. Datasets in F$^3$Set are characterized by their extensive scale and comprehensive detail, usually encompassing over 1,000 event types with precise timestamps and supporting multi-level granularity. Currently, F$^3$Set contains several sports datasets, and the framework may be extended to other applications as well. We evaluated popular temporal action understanding methods on F$^3$Set, revealing substantial challenges for existing techniques. Additionally, we propose a new method, F$^3$ED, for F$^3$ event detection, which achieves superior performance. The dataset, model, and benchmark code are available at this https URL.

@article{liu2025_2504.08222,
  title={ F$^3$Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos },
  author={ Zhaoyu Liu and Kan Jiang and Murong Ma and Zhe Hou and Yun Lin and Jin Song Dong },
  journal={arXiv preprint arXiv:2504.08222},
  year={ 2025 }
}
