arXiv: 1803.11438
Reconstruction Network for Video Captioning
30 March 2018
Bairui Wang
Lin Ma
Wei Zhang
W. Liu
Papers citing "Reconstruction Network for Video Captioning"
50 / 135 papers shown
VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation
Kai Zheng
Xiaotong Chen
Odest Chadwicke Jenkins
X. Wang
LM&Ro
CoGe
14
54
0
17 Jun 2022
Egocentric Video-Language Pretraining
Kevin Qinghong Lin
Alex Jinpeng Wang
Mattia Soldan
Michael Wray
Rui Yan
...
Hongfa Wang
Dima Damen
Bernard Ghanem
Wei Liu
Mike Zheng Shou
VLM
EgoV
34
188
0
03 Jun 2022
GL-RG: Global-Local Representation Granularity for Video Captioning
Liqi Yan
Qifan Wang
Yiming Cui
Fuli Feng
Xiaojun Quan
X. Zhang
Dongfang Liu
23
59
0
22 May 2022
Video Captioning: a comparative review of where we are and which could be the route
Daniela Moctezuma
Tania A. Ramirez-delreal
Guillermo Ruiz
Othón González-Chávez
19
11
0
12 Apr 2022
Global2Local: A Joint-Hierarchical Attention for Video Captioning
Chengpeng Dai
Fuhai Chen
Xiaoshuai Sun
Rongrong Ji
QiXiang Ye
Yongjian Wu
17
1
0
13 Mar 2022
A Review on Methods and Applications in Multimodal Deep Learning
Summaira Jabeen
Xi Li
Muhammad Shoib Amin
Abdul Jabbar
VLM
HAI
24
88
0
18 Feb 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
212
0
18 Feb 2022
An Integrated Approach for Video Captioning and Applications
Soheyla Amirian
T. Taha
Khaled Rasheed
H. Arabnia
26
1
0
23 Jan 2022
Variational Stacked Local Attention Networks for Diverse Video Captioning
Tonmoay Deb
Akib Sadmanee
Kishor Kumar
Ahsan Ali
M. Ashraful
Mahbubur Rahman
4
8
0
04 Jan 2022
Synchronized Audio-Visual Frames with Fractional Positional Encoding for Transformers in Video-to-Text Translation
Philipp Harzig
Moritz Einfalt
Rainer Lienhart
ViT
31
2
0
28 Dec 2021
Bridging the Gap: Using Deep Acoustic Representations to Learn Grounded Language from Percepts and Raw Speech
Gaoussou Youssouf Kebe
Luke E. Richards
Edward Raff
Francis Ferraro
Cynthia Matuszek
SSL
14
5
0
27 Dec 2021
Controllable Video Captioning with an Exemplar Sentence
Yitian Yuan
Lin Ma
Jingwen Wang
Wenwu Zhu
16
20
0
02 Dec 2021
Syntax Customized Video Captioning by Imitating Exemplar Sentences
Yitian Yuan
Lin Ma
Wenwu Zhu
20
6
0
02 Dec 2021
Hierarchical Modular Network for Video Captioning
Hanhua Ye
Guorong Li
Yuankai Qi
Shuhui Wang
Qingming Huang
Ming-Hsuan Yang
11
67
0
24 Nov 2021
Co-segmentation Inspired Attention Module for Video-based Computer Vision Tasks
Arulkumar Subramaniam
Jayesh Vaidya
Muhammed Ameen
Athira M. Nambiar
Anurag Mittal
19
7
0
14 Nov 2021
Visual-aware Attention Dual-stream Decoder for Video Captioning
Zhixin Sun
X. Zhong
Shuqin Chen
Lin Li
Luo Zhong
19
3
0
16 Oct 2021
CLIP4Caption: CLIP for Video Caption
Mingkang Tang
Zhanyu Wang
Zhenhua Liu
Fengyun Rao
Dian Li
Xiu Li
CLIP
VLM
27
149
0
13 Oct 2021
End-to-End Dense Video Captioning with Parallel Decoding
Teng Wang
Ruimao Zhang
Zhichao Lu
Feng Zheng
Ran Cheng
Ping Luo
3DV
38
179
0
17 Aug 2021
Cross-Modal Graph with Meta Concepts for Video Captioning
Hao Wang
Guosheng Lin
S. Hoi
C. Miao
20
6
0
14 Aug 2021
Discriminative Latent Semantic Graph for Video Captioning
Yang Bai
Junyan Wang
Yang Long
Bingzhang Hu
Yang Song
M. Pagnucco
Yu Guan
43
31
0
08 Aug 2021
O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Bang-ju Yang
Shen Ge
Yuexian Zou
Xu Sun
19
32
0
05 Aug 2021
Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization
Rui Qian
Yuxi Li
Huabin Liu
John See
Shuangrui Ding
Xian Liu
Dian Li
Weiyao Lin
30
42
0
04 Aug 2021
Boosting Video Captioning with Dynamic Loss Network
Nasib Ullah
Partha Pratim Mohanta
20
1
0
25 Jul 2021
Saying the Unseen: Video Descriptions via Dialog Agents
Ye Zhu
Yu Wu
Yi Yang
Yan Yan
22
6
0
26 Jun 2021
Towards Diverse Paragraph Captioning for Untrimmed Videos
Yuqing Song
Shizhe Chen
Qin Jin
8
37
0
30 May 2021
Recent Advances and Trends in Multimodal Deep Learning: A Review
Jabeen Summaira
Xi Li
Amin Muhammad Shoib
Songyuan Li
Abdul Jabbar
HAI
10
55
0
24 May 2021
A Survey on Natural Language Video Localization
Xinfang Liu
Xiushan Nie
Zhifang Tan
Jie Guo
Yilong Yin
23
7
0
01 Apr 2021
A Comprehensive Review of the Video-to-Text Problem
Jesus Perez-Martin
B. Bustos
S. Guimarães
I. Sipiran
Jorge A. Pérez
Grethel Coello Said
13
17
0
27 Mar 2021
The Role of the Input in Natural Language Video Description
S. Cascianelli
G. Costante
Alessandro Devo
Thomas Alessandro Ciarfuglia
P. Valigi
M. L. Fravolini
13
5
0
09 Feb 2021
Exploration of Visual Features and their weighted-additive fusion for Video Captioning
V. PraveenS.
Akhilesh Bharadwaj
Harsh Raj
Janhavi Dadhania
Ganesh Samarth C.A
Nikhil Pareek
S. M. I. S. R. Mahadeva Prasanna
13
1
0
14 Jan 2021
Video Captioning in Compressed Video
Mingjian Zhu
Chenrui Duan
Changbin (Brad) Yu
14
4
0
02 Jan 2021
Guidance Module Network for Video Captioning
Xiao Zhang
Chunsheng Liu
F. Chang
11
4
0
20 Dec 2020
A Comprehensive Review on Recent Methods and Challenges of Video Description
Ashutosh Kumar Singh
Thoudam Doren Singh
Sivaji Bandyopadhyay
3DV
VLM
9
5
0
30 Nov 2020
Data-efficient Alignment of Multimodal Sequences by Aligning Gradient Updates and Internal Feature Distributions
Jianan Wang
Boyang Albert Li
Xiangyu Fan
Jing-Hua Lin
Yanwei Fu
23
2
0
15 Nov 2020
Parsimonious Quantile Regression of Financial Asset Tail Dynamics via Sequential Learning
Xing Yan
Weizhong Zhang
Lin Ma
W. Liu
Qi Wu
AI4TS
4
23
0
16 Oct 2020
Video captioning with stacked attention and semantic hard pull
Md. Mushfiqur Rahman
Thasinul Abedin
Khondokar S. S. Prottoy
Ayana Moshruba
Fazlul Hasan Siddiqui
17
2
0
15 Sep 2020
Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics
Jiangliu Wang
Jianbo Jiao
Linchao Bao
Shengfeng He
Wei Liu
Yunhui Liu
SSL
AI4TS
13
55
0
31 Aug 2020
Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents
Ye Zhu
Yu Wu
Yi Yang
Yan Yan
11
13
0
18 Aug 2020
Poet: Product-oriented Video Captioner for E-commerce
Shengyu Zhang
Ziqi Tan
Jin Yu
Zhou Zhao
Kun Kuang
Jie Liu
Jingren Zhou
Hongxia Yang
Fei Wu
14
34
0
16 Aug 2020
Self-supervised Video Representation Learning by Pace Prediction
Jiangliu Wang
Jianbo Jiao
Yunhui Liu
SSL
AI4TS
8
233
0
13 Aug 2020
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
Shaoxiang Chen
Wenhao Jiang
Wei Liu
Yu-Gang Jiang
23
101
0
28 Jul 2020
Fully Convolutional Networks for Continuous Sign Language Recognition
Ka Leong Cheng
Zhaoyang Yang
Qifeng Chen
Yu-Wing Tai
SLR
29
143
0
24 Jul 2020
Knowledge Graph Extraction from Videos
Louis Mahon
Eleonora Giunchiglia
Bowen Li
Thomas Lukasiewicz
9
19
0
20 Jul 2020
Learning to Discretely Compose Reasoning Module Networks for Video Captioning
Ganchao Tan
Daqing Liu
Meng Wang
Zhengjun Zha
LRM
21
73
0
17 Jul 2020
Comprehensive Information Integration Modeling Framework for Video Titling
Shengyu Zhang
Ziqi Tan
Jin Yu
Zhou Zhao
Kun Kuang
Tan Jiang
Jingren Zhou
Hongxia Yang
Fei Wu
21
40
0
24 Jun 2020
Action Recognition with Deep Multiple Aggregation Networks
A. Mazari
H. Sahbi
26
0
0
08 Jun 2020
Deep hierarchical pooling design for cross-granularity action recognition
A. Mazari
H. Sahbi
11
0
0
08 Jun 2020
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation
Boxiao Pan
Haoye Cai
De-An Huang
Kuan-Hui Lee
Adrien Gaidon
Ehsan Adeli
Juan Carlos Niebles
25
235
0
31 Mar 2020
Accurate Temporal Action Proposal Generation with Relation-Aware Pyramid Network
Jialin Gao
Zhixiang Shi
Jiani Li
Guanshuo Wang
Yufeng Yuan
Shiming Ge
Xiaoping Zhou
8
73
0
09 Mar 2020
OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement
Fangyi Zhu
Jenq-Neng Hwang
Zhanyu Ma
Guang Chen
Jun Guo
14
1
0
08 Mar 2020