2210.14512
End-to-End Multimodal Representation Learning for Video Dialog
26 October 2022
Huda AlAmri
Anthony Bilic
Michael Hu
Apoorva Beedu
Irfan Essa
Papers citing
"End-to-End Multimodal Representation Learning for Video Dialog"
4 / 4 papers shown
Title
HierSum: A Global and Local Attention Mechanism for Video Summarization
Apoorva Beedu
Irfan Essa
41
0
0
25 Apr 2025
Mamba Fusion: Learning Actions Through Questioning
Zhikang Dong
Apoorva Beedu
Jason Sheinkopf
Irfan Essa
Mamba
57
2
0
17 Sep 2024
Bridge to Answer: Structure-aware Graph Interaction Network for Video Question Answering
Jungin Park
Jiyoung Lee
K. Sohn
123
99
0
29 Apr 2021
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,724
0
26 Sep 2016