Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2005.04208
Cited By
Condensed Movies: Story Based Retrieval with Contextual Embeddings
8 May 2020
Max Bain
Arsha Nagrani
A. Brown
Andrew Zisserman
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Condensed Movies: Story Based Retrieval with Contextual Embeddings"
50 / 70 papers shown
Title
STORYANCHORS: Generating Consistent Multi-Scene Story Frames for Long-Form Narratives
Bo Wang
Haoyang Huang
Zhiyin Lu
F. Liu
Guoqing Ma
Jianlong Yuan
Y. Zhang
Nan Duan
VGen
14
0
0
13 May 2025
Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation
Junyu Xie
Tengda Han
Max Bain
Arsha Nagrani
Eshika Khandelwal
Gül Varol
Weidi Xie
Andrew Zisserman
DiffM
VGen
55
0
0
01 Apr 2025
Fair Dynamic Spectrum Access via Fully Decentralized Multi-Agent Reinforcement Learning
Yubo Zhang
Pedro Botelho
Trevor Gordon
Gil Zussman
I. Kadota
50
0
0
31 Mar 2025
V-Trans4Style: Visual Transition Recommendation for Video Production Style Adaptation
P. Guhan
Tsung-Wei Huang
Guan-Ming Su
Subhadra Gopalakrishnan
Dinesh Manocha
VGen
53
0
0
14 Jan 2025
Personalized Video Summarization by Multimodal Video Understanding
Brian Chen
Xiangyuan Zhao
Yingnan Zhu
34
1
0
05 Nov 2024
Pseudo Dataset Generation for Out-of-Domain Multi-Camera View Recommendation
Kuan-Ying Lee
Qian Zhou
K. Nahrstedt
33
0
0
17 Oct 2024
LocoMotion: Learning Motion-Focused Video-Language Representations
Hazel Doughty
Fida Mohammad Thoker
Cees G. M. Snoek
33
2
0
15 Oct 2024
It's Just Another Day: Unique Video Captioning by Discriminative Prompting
Toby Perrett
Tengda Han
Dima Damen
Andrew Zisserman
19
3
0
15 Oct 2024
Movie Trailer Genre Classification Using Multimodal Pretrained Features
Serkan Sulun
Paula Viana
M. Davies
CLIP
16
2
0
11 Oct 2024
What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation
Dingyi Yang
Qin Jin
36
5
0
26 Aug 2024
SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses
Chaolei Tan
Zihang Lin
Junfu Pu
Zhongang Qi
Wei-Yi Pei
Zhi Qu
Yexin Wang
Ying Shan
Wei-Shi Zheng
Jianfang Hu
AI4TS
31
0
0
03 Aug 2024
Tails Tell Tales: Chapter-Wide Manga Transcriptions with Character Names
Ragav Sachdeva
Gyungin Shin
Andrew Zisserman
22
4
0
01 Aug 2024
Learning Video Context as Interleaved Multimodal Sequences
S. Shao
Pengchuan Zhang
Y. Li
Xide Xia
A. Meso
Ziteng Gao
Jinheng Xie
N. Holliman
Mike Zheng Shou
41
5
0
31 Jul 2024
AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description
Junyu Xie
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
25
8
0
22 Jul 2024
Goldfish: Vision-Language Understanding of Arbitrarily Long Videos
Kirolos Ataallah
Xiaoqian Shen
Eslam Abdelrahman
Essam Sleiman
Mingchen Zhuge
Jian Ding
Deyao Zhu
Jürgen Schmidhuber
Mohamed Elhoseiny
VLM
17
17
0
17 Jul 2024
E.T. the Exceptional Trajectories: Text-to-camera-trajectory generation with character awareness
Robin Courant
Nicolas Dufour
Xi Wang
Marc Christie
Vicky Kalogeiton
VGen
36
4
0
01 Jul 2024
Multilingual Synopses of Movie Narratives: A Dataset for Story Understanding
Yidan Sun
Jianfei Yu
Boyang Li
43
0
0
18 Jun 2024
A Survey of Video Datasets for Grounded Event Understanding
Kate Sanders
Benjamin Van Durme
32
4
0
14 Jun 2024
"Previously on ..." From Recaps to Story Summarization
Aditya Kumar Singh
Dhruv Srivastava
Makarand Tapaswi
40
0
0
19 May 2024
CinePile: A Long Video Question Answering Dataset and Benchmark
Ruchit Rawal
Khalid Saifullah
Ronen Basri
David Jacobs
Gowthami Somepalli
Tom Goldstein
38
39
0
14 May 2024
AutoAD III: The Prequel -- Back to the Pixels
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
DiffM
36
20
0
22 Apr 2024
Movie101v2: Improved Movie Narration Benchmark
Zihao Yue
Yepeng Zhang
Ziheng Wang
Qin Jin
VGen
22
1
0
20 Apr 2024
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens
Kirolos Ataallah
Xiaoqian Shen
Eslam Abdelrahman
Essam Sleiman
Deyao Zhu
Jian Ding
Mohamed Elhoseiny
VLM
39
66
0
04 Apr 2024
Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning
Amir Ziai
Aneesh Vartakavi
VLM
VGen
25
0
0
09 Feb 2024
Visual Objectification in Films: Towards a New AI Task for Video Interpretation
Julie Tores
L. Sassatelli
Hui-Yin Wu
Clement Bergman
Lea Andolfi
...
F. Precioso
Thierry Devars
Magali Guaresi
Virginie Julliard
Sarah Lecossais
25
2
0
24 Jan 2024
Video Summarization: Towards Entity-Aware Captions
Hammad A. Ayyubi
Tianqi Liu
Arsha Nagrani
Xudong Lin
Mingda Zhang
Anurag Arnab
Feng Han
Yukun Zhu
Jialu Liu
Shih-Fu Chang
26
0
0
01 Dec 2023
A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval
M. Gwilliam
Michael Cogswell
Meng Ye
Karan Sikka
Abhinav Shrivastava
Ajay Divakaran
3DV
10
1
1
30 Nov 2023
Sound of Story: Multi-modal Storytelling with Audio
Jaeyeon Bae
Seokhoon Jeong
Seokun Kang
Namgi Han
Jae-Yon Lee
Hyounghun Kim
Taehwan Kim
21
2
0
30 Oct 2023
TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
Shuhuai Ren
Sishuo Chen
Shicheng Li
Xu Sun
Lu Hou
ViT
29
28
0
29 Oct 2023
Incorporating Domain Knowledge Graph into Multimodal Movie Genre Classification with Self-Supervised Attention and Contrastive Learning
Jiaqi Li
Guilin Qi
Chuanyi Zhang
Yongrui Chen
Yiming Tan
Chenlong Xia
Ye Tian
25
3
0
12 Oct 2023
AutoAD II: The Sequel -- Who, When, and What in Movie Audio Description
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
DiffM
19
36
0
10 Oct 2023
Revisiting Kernel Temporal Segmentation as an Adaptive Tokenizer for Long-form Video Understanding
Mohamed Afham
Satya Narayan Shukla
Omid Poursaeed
Pengchuan Zhang
Ashish Shah
Sernam Lim
VLM
24
2
0
20 Sep 2023
LanSER: Language-Model Supported Speech Emotion Recognition
Taesik Gong
Joshua Belanich
Krishna Somandepalli
Arsha Nagrani
B. Eoff
Brendan Jou
25
10
0
07 Sep 2023
MM-AU:Towards Multimodal Understanding of Advertisement Videos
Digbalay Bose
Rajat Hebbar
Tiantian Feng
Krishna Somandepalli
Anfeng Xu
Shrikanth Narayanan
25
5
0
27 Aug 2023
Long-range Multimodal Pretraining for Movie Understanding
Dawit Mureja Argaw
Joon-Young Lee
Markus Woodson
In So Kweon
Fabian Caba Heilbron
VLM
25
7
0
18 Aug 2023
PTVD: A Large-Scale Plot-Oriented Multimodal Dataset Based on Television Dramas
Chen Li
Xutan Peng
Teng Wang
Yixiao Ge
Mengyang Liu
Xuyuan Xu
Yexin Wang
Ying Shan
VGen
13
2
0
26 Jun 2023
How you feelin'? Learning Emotions and Mental States in Movie Scenes
D. Srivastava
A. Singh
Makarand Tapaswi
19
10
0
12 Apr 2023
AutoAD: Movie Description in Context
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
16
34
0
29 Mar 2023
Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies
Bei Gan
Xiujun Shu
Ruizhi Qiao
Haoqian Wu
Keyun Chen
Hanjun Li
Bohan Ren
26
5
0
26 Mar 2023
Building Scalable Video Understanding Benchmarks through Sports
Aniket Agarwal
Alex Zhang
Karthik Narasimhan
Igor Gilitschenski
Vishvak Murahari
Yash Kant
19
1
0
17 Jan 2023
TeViS:Translating Text Synopses to Video Storyboards
Xu Gu
Yuchong Sun
Feiyue Ni
Shizhe Chen
Xihua Wang
Ruihua Song
B. Li
Xiang Cao
DiffM
23
4
0
31 Dec 2022
Weakly-Supervised Temporal Article Grounding
Long Chen
Yulei Niu
Brian Chen
Xudong Lin
G. Han
Christopher Thomas
Hammad A. Ayyubi
Heng Ji
Shih-Fu Chang
AI4TS
19
13
0
22 Oct 2022
MovieCLIP: Visual Scene Recognition in Movies
Digbalay Bose
Rajat Hebbar
Krishna Somandepalli
Haoyang Zhang
Yin Cui
K. Cole-McLaughlin
H. Wang
Shrikanth Narayanan
CLIP
6
20
0
20 Oct 2022
Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning
Yuchong Sun
Hongwei Xue
Ruihua Song
Bei Liu
Huan Yang
Jianlong Fu
AI4TS
VLM
16
68
0
12 Oct 2022
Match Cutting: Finding Cuts with Smooth Visual Transitions
Boris Chen
Amir Ziai
Rebecca Tucker
Yuchen Xie
VGen
23
14
0
11 Oct 2022
Multi-modal Video Chapter Generation
Xiao Cao
Zitan Chen
Canyu Le
Lei Meng
VGen
14
3
0
26 Sep 2022
Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward
Yunlong Tang
Siting Xu
Teng Wang
Qin Lin
Qinglin Lu
Feng Zheng
VOS
60
10
0
25 Sep 2022
Self-Contained Entity Discovery from Captioned Videos
M. Ayoughi
P. Mettes
Paul T. Groth
20
2
0
13 Aug 2022
The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing
Dawit Mureja Argaw
Fabian Caba Heilbron
Joon-Young Lee
Markus Woodson
In So Kweon
VGen
37
22
0
20 Jul 2022
A CLIP-Hitchhiker's Guide to Long Video Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
CLIP
115
61
0
17 May 2022
1
2
Next