Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1611.04021
Cited By
Leveraging Video Descriptions to Learn Video Question Answering
12 November 2016
Kuo-Hao Zeng
Tseng-Hung Chen
Ching-Yao Chuang
Yuan-Hong Liao
Juan Carlos Niebles
Min Sun
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Leveraging Video Descriptions to Learn Video Question Answering"
20 / 20 papers shown
Title
Natural Language Generation from Visual Sequences: Challenges and Future Directions
Aditya K Surikuchi
Raquel Fernández
Sandro Pezzelle
EGVM
191
0
0
18 Feb 2025
Too Many Frames, Not All Useful: Efficient Strategies for Long-Form Video QA
Jongwoo Park
Kanchana Ranasinghe
Kumara Kahatapitiya
Wonjeong Ryoo
Donghyun Kim
Michael S. Ryoo
62
20
0
13 Jun 2024
Connecting Vision and Language with Video Localized Narratives
P. Voigtlaender
Soravit Changpinyo
Jordi Pont-Tuset
Radu Soricut
V. Ferrari
VGen
41
21
0
22 Feb 2023
M^4I: Multi-modal Models Membership Inference
Pingyi Hu
Zihan Wang
Ruoxi Sun
Hu Wang
Minhui Xue
37
26
0
15 Sep 2022
WildQA: In-the-Wild Video Question Answering
Santiago Castro
Naihao Deng
Pingxuan Huang
Mihai Burzo
Rada Mihalcea
70
7
0
14 Sep 2022
Equivariant and Invariant Grounding for Video Question Answering
Yicong Li
Xiang Wang
Junbin Xiao
Tat-Seng Chua
18
25
0
26 Jul 2022
Learning to Retrieve Videos by Asking Questions
Avinash Madasu
Junier Oliva
Gedas Bertasius
VGen
30
16
0
11 May 2022
Learning to Answer Visual Questions from Web Videos
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
34
33
0
10 May 2022
Video Question Answering: Datasets, Algorithms and Challenges
Yaoyao Zhong
Junbin Xiao
Wei Ji
Yicong Li
Wei Deng
Tat-Seng Chua
18
85
0
02 Mar 2022
NEWSKVQA: Knowledge-Aware News Video Question Answering
Pranay Gupta
Manish Gupta
22
7
0
08 Feb 2022
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering
Junbin Xiao
Angela Yao
Zhiyuan Liu
Yicong Li
Wei Ji
Tat-Seng Chua
30
111
0
12 Dec 2021
Question Answering Survey: Directions, Challenges, Datasets, Evaluation Matrices
Hariom A. Pandya
Brijesh S. Bhatt
38
27
0
07 Dec 2021
Pano-AVQA: Grounded Audio-Visual Question Answering on 360
∘
^\circ
∘
Videos
Heeseung Yun
Youngjae Yu
Wonsuk Yang
Kangil Lee
Gunhee Kim
25
78
0
11 Oct 2021
On Semantic Similarity in Video Retrieval
Michael Wray
Hazel Doughty
Dima Damen
21
66
0
18 Mar 2021
Future semantic segmentation of time-lapsed videos with large temporal displacement
Talha Ahmad Siddiqui
Samarth Bharadwaj
16
0
0
27 Dec 2018
Knowledge Graph Embedding with Entity Neighbors and Deep Memory Network
Kai Wang
Yu Liu
Xiujuan Xu
Dan Lin
GNN
13
18
0
11 Aug 2018
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
13
2,856
0
26 May 2017
Agent-Centric Risk Assessment: Accident Anticipation and Risky Region Localization
Kuo-Hao Zeng
Shih-Han Chou
Fu-Hsiang Chan
Juan Carlos Niebles
Min Sun
24
58
0
18 May 2017
The Forgettable-Watcher Model for Video Question Answering
Hongyang Xue
Zhou Zhao
Deng Cai
19
9
0
03 May 2017
Towards Building Large Scale Multimodal Domain-Aware Conversation Systems
Amrita Saha
Mitesh Khapra
Karthik Sankaranarayanan
21
8
0
01 Apr 2017
1