2202.09979
Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations
21 February 2022
Yoshihiro Yamazaki
Shota Orihashi
Ryo Masumura
Mihiro Uchida
Akihiko Takashima
Papers citing "Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations"
5 / 5 papers shown
Title
M2K-VDG: Model-Adaptive Multimodal Knowledge Anchor Enhanced Video-grounded Dialogue Generation
Hongcheng Liu
Pingjie Wang
Yu Wang
Yanfeng Wang
39
1
0
19 Feb 2024
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
280
1,981
0
09 Feb 2021
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues
Hung Le
Doyen Sahoo
Nancy F. Chen
S. Hoi
38
30
0
20 Oct 2020
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
294
10,216
0
16 Nov 2016
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,743
0
26 Sep 2016