Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2003.13942
Cited By
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation
31 March 2020
Boxiao Pan
Haoye Cai
De-An Huang
Kuan-Hui Lee
Adrien Gaidon
Ehsan Adeli
Juan Carlos Niebles
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Spatio-Temporal Graph for Video Captioning with Knowledge Distillation"
24 / 24 papers shown
Title
TimeRefine: Temporal Grounding with Time Refining Video LLM
Xizi Wang
Feng Cheng
Ziyang Wang
Huiyu Wang
Md. Mohaiminul Islam
Lorenzo Torresani
Mohit Bansal
Gedas Bertasius
David J. Crandall
109
1
0
12 Dec 2024
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative
Asmar Nadeem
Faegheh Sardari
R. Dawes
Syed Sameed Husain
Adrian Hilton
Armin Mustafa
47
4
0
10 Jun 2024
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World
Yifei Huang
Guo Chen
Jilan Xu
Mingfang Zhang
Lijin Yang
...
Hongjie Zhang
Lu Dong
Yali Wang
Limin Wang
Yu Qiao
EgoV
54
35
0
24 Mar 2024
A Modular Approach for Multimodal Summarization of TV Shows
Louis Mahon
Mirella Lapata
21
9
0
06 Mar 2024
SnapCap: Efficient Snapshot Compressive Video Captioning
Jianqiao Sun
Yudi Su
Hao Zhang
Ziheng Cheng
Zequn Zeng
Zhengjue Wang
Bo Chen
Xin Yuan
22
1
0
10 Jan 2024
VideoAdviser: Video Knowledge Distillation for Multimodal Transfer Learning
Yanan Wang
Donghuo Zeng
Shinya Wada
Satoshi Kurihara
27
6
0
27 Sep 2023
Graph-based Knowledge Distillation: A survey and experimental evaluation
Jing Liu
Tongya Zheng
Guanzheng Zhang
Qinfen Hao
19
8
0
27 Feb 2023
STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training
Weihong Zhong
Mao Zheng
Duyu Tang
Xuan Luo
Heng Gong
Xiaocheng Feng
Bing Qin
22
8
0
20 Feb 2023
ADAPT: Action-aware Driving Caption Transformer
Bu Jin
Xinyi Liu
Yupeng Zheng
Pengfei Li
Hao Zhao
Tong Zhang
Yuhang Zheng
Guyue Zhou
Jingjing Liu
15
69
0
01 Feb 2023
Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
Peng Jin
Jinfa Huang
Fenglin Liu
Xian Wu
Shen Ge
Guoli Song
David A. Clifton
Jing Chen
VLM
16
63
0
21 Nov 2022
Visual Commonsense-aware Representation Network for Video Captioning
Pengpeng Zeng
Haonan Zhang
Lianli Gao
Xiangpeng Li
Jin Qian
Hengtao Shen
16
16
0
17 Nov 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
41
518
0
13 Jun 2022
Knowledge Distillation via the Target-aware Transformer
Sihao Lin
Hongwei Xie
Bing Wang
Kaicheng Yu
Xiaojun Chang
Xiaodan Liang
G. Wang
ViT
9
104
0
22 May 2022
GL-RG: Global-Local Representation Granularity for Video Captioning
Liqi Yan
Qifan Wang
Yiming Cui
Fuli Feng
Xiaojun Quan
X. Zhang
Dongfang Liu
19
59
0
22 May 2022
Support-set based Multi-modal Representation Enhancement for Video Captioning
Xiaoya Chen
Jingkuan Song
Pengpeng Zeng
Lianli Gao
Hengtao Shen
22
4
0
19 May 2022
Cross-modal Contrastive Distillation for Instructional Activity Anticipation
Zhengyuan Yang
Jingen Liu
Jing-ling Huang
Xiaodong He
Tao Mei
Chenliang Xu
Jiebo Luo
14
6
0
18 Jan 2022
Visual-aware Attention Dual-stream Decoder for Video Captioning
Zhixin Sun
X. Zhong
Shuqin Chen
Lin Li
Luo Zhong
16
3
0
16 Oct 2021
Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal Attention
Katsuyuki Nakamura
Hiroki Ohashi
Mitsuhiro Okada
EgoV
31
12
0
07 Sep 2021
Hierarchical Object-oriented Spatio-Temporal Reasoning for Video Question Answering
Long Hoang Dang
T. Le
Vuong Le
T. Tran
25
59
0
25 Jun 2021
DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval
Giorgos Kordopatis-Zilos
Christos Tzelepis
Symeon Papadopoulos
I. Kompatsiaris
Ioannis Patras
8
33
0
24 Jun 2021
Distilling EEG Representations via Capsules for Affective Computing
Guangyi Zhang
Ali Etemad
11
16
0
30 Apr 2021
Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation
Zhiwei Deng
Karthik Narasimhan
Olga Russakovsky
11
85
0
11 Jul 2020
Knowledge Distillation: A Survey
Jianping Gou
B. Yu
Stephen J. Maybank
Dacheng Tao
VLM
17
2,822
0
09 Jun 2020
Controllable Video Captioning with POS Sequence Guidance Based on Gated Fusion Network
Bairui Wang
Lin Ma
Wei Zhang
Wenhao Jiang
Jingwen Wang
Wei Liu
66
162
0
27 Aug 2019
1