ResearchTrend.AI
iPerceive: Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering

16 November 2020
Aman Chadha
Gurneet Arora
Navpreet Kaloty

Papers citing "iPerceive: Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering"

21 / 21 papers shown
FocusedAD: Character-centric Movie Audio Description
Xiaojun Ye
C. Wang
Yiren Song
Sheng Zhou
Liangcheng Li
Jiajun Bu
VGen
53
0
0
16 Apr 2025
StoryNavi: On-Demand Narrative-Driven Reconstruction of Video Play With Generative AI
Alston Lantian Xu
Tianwei Ma
Tianmeng Liu
Can Liu
Alvaro Cassinelli
VGen
34
0
0
04 Oct 2024
AutoAD III: The Prequel -- Back to the Pixels
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
DiffM
36
20
0
22 Apr 2024
AutoAD II: The Sequel -- Who, When, and What in Movie Audio Description
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
DiffM
19
36
0
10 Oct 2023
Towards Debiasing Frame Length Bias in Text-Video Retrieval via Causal Intervention
Burak Satar
Huaiyu Zhu
Hanwang Zhang
Joo-Hwee Lim
CML
30
0
0
17 Sep 2023
Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer
Guangyi Chen
Xiao Liu
Guangrun Wang
Kun Zhang
Philip H.S. Torr
Xiaoping Zhang
Yansong Tang
19
18
0
16 Aug 2023
A Review of Deep Learning for Video Captioning
Moloud Abdar
Meenakshi Kollati
Swaraja Kuraparthi
Farhad Pourpanah
Daniel J. McDuff
...
Shuicheng Yan
Abduallah A. Mohamed
Abbas Khosravi
Erik Cambria
Fatih Porikli
3DV
27
20
0
22 Apr 2023
AutoAD: Movie Description in Context
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
16
34
0
29 Mar 2023
Implicit and Explicit Commonsense for Multi-sentence Video Captioning
Shih-Han Chou
James J. Little
Leonid Sigal
21
2
0
14 Mar 2023
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
Antoine Yang
Arsha Nagrani
Paul Hongsuck Seo
Antoine Miech
Jordi Pont-Tuset
Ivan Laptev
Josef Sivic
Cordelia Schmid
AI4TS
VLM
23
220
0
27 Feb 2023
Video Question Answering with Iterative Video-Text Co-Tokenization
A. Piergiovanni
K. Morton
Weicheng Kuo
Michael S. Ryoo
A. Angelova
16
17
0
01 Aug 2022
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
34
226
0
16 Jun 2022
Learning to Answer Visual Questions from Web Videos
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
28
33
0
10 May 2022
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant
B. Wong
Joya Chen
You Wu
Stan Weixian Lei
Dongxing Mao
Difei Gao
Mike Zheng Shou
EgoV
27
27
0
08 Mar 2022
Video Question Answering: Datasets, Algorithms and Challenges
Yaoyao Zhong
Junbin Xiao
Wei Ji
Yicong Li
Wei Deng
Tat-Seng Chua
16
84
0
02 Mar 2022
Bridging Video-text Retrieval with Multiple Choice Questions
Yuying Ge
Yixiao Ge
Xihui Liu
Dian Li
Ying Shan
Xiaohu Qie
Ping Luo
BDL
16
108
0
13 Jan 2022
Dense Video Captioning Using Unsupervised Semantic Information
Valter Estevam
Rayson Laroca
Hélio Pedrini
David Menotti
6
9
0
15 Dec 2021
Transferring Domain-Agnostic Knowledge in Video Question Answering
Tianran Wu
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
Haruo Takemura
10
8
0
26 Oct 2021
iReason: Multimodal Commonsense Reasoning using Videos and Natural Language with Interpretability
Andrew Wang
Aman Chadha
CML
11
5
0
25 Jun 2021
On the hidden treasure of dialog in video question answering
Deniz Engin
François Schnitzler
Ngoc Q. K. Duong
Yannis Avrithis
13
10
0
26 Mar 2021
Open-Ended Multi-Modal Relational Reasoning for Video Question Answering
Haozheng Luo
Ruiyang Qin
Chenwei Xu
Guo Ye
Zening Luo
43
4
0
01 Dec 2020