MovieQA: Understanding Stories in Movies through Question-Answering

9 December 2015
Makarand Tapaswi
Yukun Zhu
Rainer Stiefelhagen
Antonio Torralba
R. Urtasun
Sanja Fidler

Papers citing "MovieQA: Understanding Stories in Movies through Question-Answering"

50 / 166 papers shown
What is More Likely to Happen Next? Video-and-Language Future Event Prediction
Jie Lei
Licheng Yu
Tamara L. Berg
Mohit Bansal
22
72
0
15 Oct 2020
Visual Question Answering on Image Sets
Ankan Bansal
Yuting Zhang
Rama Chellappa
CoGe
8
40
0
27 Aug 2020
AiR: Attention with Reasoning Capability
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
13
36
0
28 Jul 2020
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA
Hyounghun Kim
Zineng Tang
Mohit Bansal
22
31
0
13 May 2020
Condensed Movies: Story Based Retrieval with Contextual Embeddings
Max Bain
Arsha Nagrani
A. Brown
Andrew Zisserman
39
100
0
08 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
41
493
0
01 May 2020
Learning Interactions and Relationships between Movie Characters
Anna Kukleva
Makarand Tapaswi
Ivan Laptev
41
51
0
29 Mar 2020
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
Zhiyuan Fang
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
15
60
0
11 Mar 2020
A$^3$: Accelerating Attention Mechanisms in Neural Networks with Approximation
Tae Jun Ham
Sungjun Jung
Seonghak Kim
Young H. Oh
Yeonhong Park
...
Jung-Hun Park
Sanghee Lee
Kyoung Park
Jae W. Lee
D. Jeong
22
211
0
22 Feb 2020
Text-based Question Answering from Information Retrieval and Deep Neural Network Perspectives: A Survey
Zahra Abbasiyantaeb
S. Momtazi
RALM
22
69
0
16 Feb 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Mohit Bansal
119
275
0
24 Jan 2020
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
A. Schwing
LRM
ReLM
34
9
0
31 Oct 2019
A Graph-Based Framework to Bridge Movies and Synopses
Yu Xiong
Chengyi Zhang
Lingfeng Guo
Hang Zhou
Bolei Zhou
Dahua Lin
27
60
0
24 Oct 2019
KnowIT VQA: Answering Knowledge-Based Questions about Videos
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
22
77
0
23 Oct 2019
CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning
Rohit Girdhar
Deva Ramanan
19
176
0
10 Oct 2019
A Better Way to Attend: Attention with Trees for Video Question Answering
Hongyang Xue
Wenqing Chu
Zhou Zhao
Deng Cai
22
33
0
05 Sep 2019
VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering
Cătălina Cangea
Eugene Belilovsky
Pietro Liò
Aaron Courville
16
16
0
14 Aug 2019
Video Face Clustering with Unknown Number of Clusters
Makarand Tapaswi
M. Law
Sanja Fidler
CVBM
21
60
0
09 Aug 2019
Moviescope: Large-scale Analysis of Movies using Multiple Modalities
Paola Cascante-Bonilla
Kalpathy Sitaraman
Mengjia Luo
Vicente Ordonez
22
39
0
08 Aug 2019
CraftAssist: A Framework for Dialogue-enabled Interactive Agents
Jonathan Gray
Kavya Srinet
Yacine Jernite
Haonan Yu
Zhuoyuan Chen
Demi Guo
Siddharth Goyal
C. L. Zitnick
Arthur Szlam
28
38
0
19 Jul 2019
Learning Representations from Imperfect Time Series Data via Tensor Rank Regularization
Paul Pu Liang
Zhun Liu
Yao-Hung Hubert Tsai
Qibin Zhao
Ruslan Salakhutdinov
Louis-Philippe Morency
AI4TS
30
81
0
01 Jul 2019
Open-Ended Long-Form Video Question Answering via Hierarchical Convolutional Self-Attention Networks
Zhu Zhang
Zhou Zhao
Zhijie Lin
Jingkuan Song
Xiaofei He
BDL
19
14
0
28 Jun 2019
Adversarial Multimodal Network for Movie Question Answering
Zhaoquan Yuan
Siyuan Sun
Lixin Duan
Xiao Wu
Changsheng Xu
24
3
0
24 Jun 2019
Terminology-based Text Embedding for Computing Document Similarities on Technical Content
Hamid Mirisaee
Éric Gaussier
Cédric Lagnier
Agnès Guerraz
13
3
0
05 Jun 2019
Scene Text Visual Question Answering
Ali Furkan Biten
Rubèn Pérez Tito
Andrés Mafla
Lluís Gómez
Marçal Rusiñol
Ernest Valveny
C. V. Jawahar
Dimosthenis Karatzas
30
343
0
31 May 2019
Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback
Hui Wu
Yupeng Gao
Xiaoxiao Guo
Ziad Al-Halah
Steven J. Rennie
Kristen Grauman
Rogerio Feris
EgoV
20
63
0
30 May 2019
Towards Efficient Model Compression via Learned Global Ranking
Ting-Wu Chin
Ruizhou Ding
Cha Zhang
Diana Marculescu
16
170
0
28 Apr 2019
Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources
Yanghua Peng
Hang Zhang
Yifei Ma
Tong He
Zhi-Li Zhang
Sheng Zha
Mu Li
17
23
0
26 Apr 2019
TVQA+: Spatio-Temporal Grounding for Video Question Answering
Jie Lei
Licheng Yu
Tamara L. Berg
Mohit Bansal
31
227
0
25 Apr 2019
Constructing Hierarchical Q&A Datasets for Video Story Understanding
Y. Heo
Kyoung-Woon On
Seong-Ho Choi
Jaeseo Lim
Jinah Kim
Jeh-Kwang Ryu
Byung-Chull Bae
Byoung-Tak Zhang
23
5
0
01 Apr 2019
Episodic Memory Reader: Learning What to Remember for Question Answering from Streaming Data
Moonsu Han
Minki Kang
Hyunwoo Jung
Sung Ju Hwang
RALM
19
19
0
14 Mar 2019
Self-Supervised Learning of Face Representations for Video Face Clustering
Vivek Sharma
Makarand Tapaswi
M. Sarfraz
Rainer Stiefelhagen
SSL
CVBM
11
49
0
03 Mar 2019
Audio-Visual Scene-Aware Dialog
Huda AlAmri
Vincent Cartillier
Abhishek Das
Jue Wang
A. Cherian
...
Tim K. Marks
Chiori Hori
Peter Anderson
Stefan Lee
Devi Parikh
VGen
23
189
0
25 Jan 2019
Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey
W. Zhang
Quan Z. Sheng
A. Alhazmi
Chenliang Li
AAML
24
57
0
21 Jan 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
51
322
0
20 Jan 2019
From FiLM to Video: Multi-turn Question Answering with Multi-modal Context
T. Nguyen
Shikhar Sharma
Hannes Schulz
Layla El Asri
15
33
0
17 Dec 2018
Traversing the Continuous Spectrum of Image Retrieval with Deep Dynamic Models
Ziad Al-Halah
Andreas M. Lehrmann
Leonid Sigal
16
0
0
01 Dec 2018
From Recognition to Cognition: Visual Commonsense Reasoning
Rowan Zellers
Yonatan Bisk
Ali Farhadi
Yejin Choi
LRM
BDL
OCL
ReLM
27
865
0
27 Nov 2018
Improving Machine Reading Comprehension with General Reading Strategies
Kai Sun
Dian Yu
Dong Yu
Claire Cardie
AI4CE
21
116
0
31 Oct 2018
A Knowledge-Grounded Multimodal Search-Based Conversational Agent
Shubham Agarwal
Ondrej Dusek
Ioannis Konstas
Verena Rieser
26
22
0
20 Oct 2018
TVQA: Localized, Compositional Video Question Answering
Jie Lei
Licheng Yu
Mohit Bansal
Tamara L. Berg
34
616
0
05 Sep 2018
Comparing Attention-based Convolutional and Recurrent Neural Networks: Success and Limitations in Machine Reading Comprehension
Matthias Blohm
Glorianna Jagfeld
Ekta Sood
Xiang Yu
Ngoc Thang Vu
19
54
0
27 Aug 2018
ODSQA: Open-domain Spoken Question Answering Dataset
Chia-Hsuan Lee
Shang-Ming Wang
Huan-Cheng Chang
Hung-yi Lee
RALM
22
52
0
07 Aug 2018
End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features
Chiori Hori
Huda AlAmri
Jue Wang
G. Wichern
Takaaki Hori
...
Raphael Gontijo-Lopes
Abhishek Das
Irfan Essa
Dhruv Batra
Devi Parikh
VGen
18
125
0
21 Jun 2018
From Trailers to Storylines: An Efficient Way to Learn from Movies
Qingqiu Huang
Yuanjun Xiong
Yu Xiong
Yuqi Zhang
Dahua Lin
28
26
0
14 Jun 2018
Scaling Egocentric Vision: The EPIC-KITCHENS Dataset
Dima Damen
Hazel Doughty
G. Farinella
Sanja Fidler
Antonino Furnari
...
Davide Moltisanti
Jonathan Munro
Toby Perrett
Will Price
Michael Wray
EgoV
25
996
0
08 Apr 2018
Motion-Appearance Co-Memory Networks for Video Question Answering
J. Gao
Runzhou Ge
Kan Chen
Ram Nevatia
38
240
0
29 Mar 2018
Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment
Li Ding
Chenliang Xu
22
180
0
28 Mar 2018
Audio-Visual Event Localization in Unconstrained Videos
Yapeng Tian
Jing Shi
Bochen Li
Zhiyao Duan
Chenliang Xu
31
425
0
23 Mar 2018
MovieGraphs: Towards Understanding Human-Centric Situations from Videos
Paul Vicol
Makarand Tapaswi
Lluis Castrejon
Sanja Fidler
31
136
0
19 Dec 2017