Movies2Scenes: Using Movie Metadata to Learn Scene Representation

22 February 2022

Papers citing "Movies2Scenes: Using Movie Metadata to Learn Scene Representation"

HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding | Shehreen Azad, Vibhav Vineet, Y. S. Rawat | VLM | 61 | 1 | 0 | 11 Mar 2025
Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning | Nikhil Singh, Chih-Wei Wu, Iroro Orife, Mahdi M. Kalayeh | - | 23 | 2 | 0 | 12 Apr 2023
CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations | Mohammadreza Zolfaghari, Yi Zhu, Peter V. Gehler, Thomas Brox | - | 117 | 122 | 0 | 30 Sep 2021
With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations | Debidatta Dwibedi, Y. Aytar, Jonathan Tompson, P. Sermanet, Andrew Zisserman | SSL | 183 | 450 | 0 | 29 Apr 2021
Is Space-Time Attention All You Need for Video Understanding? | Gedas Bertasius, Heng Wang, Lorenzo Torresani | ViT | 278 | 1,939 | 0 | 09 Feb 2021
Contrastive Representation Learning: A Framework and Review | Phúc H. Lê Khắc, Graham Healy, A. Smeaton | SSL, AI4TS | 149 | 670 | 0 | 10 Oct 2020