
From Deterministic to Generative: Multi-Modal Stochastic RNNs for Video Captioning

IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2017

8 August 2017

Jingkuan Song

Yuyu Guo

Lianli Gao

Xuelong Li

Alan Hanjalic

Heng Tao Shen


Papers citing "From Deterministic to Generative: Multi-Modal Stochastic RNNs for Video Captioning"


A Statistical Framework for Model Selection in LSTM Networks

Fahad Mostafa

116

07 Jun 2025

EVC-MF: End-to-end Video Captioning Network with Multi-scale Features

288

22 Oct 2024

How to Understand Named Entities: Using Common Sense for News Captioning

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), 2024

234

11 Mar 2024

Video ReCap: Recursive Captioning of Hour-Long Videos

Gedas Bertasius

793

20 Feb 2024

SEM-POS: Grammatically and Semantically Correct Video Captioning

259

26 Mar 2023

Visual Commonsense-aware Representation Network for Video Captioning

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022

Pengpeng Zeng

Haonan Zhang

Lianli Gao

Xiangpeng Li

Jin Qian

Hengtao Shen

204

17 Nov 2022

Hybrid Reinforced Medical Report Generation with M-Linear Attention and Repetition Penalty

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022

252

14 Oct 2022

Structured Two-stream Attention Network for Video Question Answering

AAAI Conference on Artificial Intelligence (AAAI), 2019

Lianli Gao

Jingkuan Song

Tao Mei

311

02 Jun 2022

Video Captioning: a comparative review of where we are and which could be the route

Computer Vision and Image Understanding (CVIU), 2022

Daniela Moctezuma

Tania A. Ramirez-delreal

Guillermo Ruiz

Othón González-Chávez

302

12 Apr 2022

NeuroView-RNN: It's About Time

Conference on Fairness, Accountability and Transparency (FAccT), 2022

Sina Alemohammad

Richard G. Baraniuk

290

23 Feb 2022

One-shot Scene Graph Generation

ACM Multimedia (ACM MM), 2020

Yuyu Guo

Jingkuan Song

Lianli Gao

Heng Tao Shen

249

22 Feb 2022

Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions

Machine Intelligence Research (MIR), 2021

422

30 Aug 2021

A Comprehensive Review of the Video-to-Text Problem

Artificial Intelligence Review (AIR), 2021

314

27 Mar 2021

The Role of the Input in Natural Language Video Description

IEEE Transactions on Multimedia (TMM), 2020

S. Cascianelli

G. Costante

Alessandro Devo

Thomas Alessandro Ciarfuglia

P. Valigi

M. L. Fravolini

220

09 Feb 2021

Guidance Module Network for Video Captioning

Cybersecurity and Cyberforensics Conference (CC), 2020

Xiao Zhang

Chunsheng Liu

F. Chang

122

20 Dec 2020

Universal Weighting Metric Learning for Cross-Modal Matching

Jiwei Wei

Xing Xu

Yang Yang

Yanli Ji

Zheng Wang

Heng Tao Shen

193

100

07 Oct 2020

Unsupervised Online Anomaly Detection On Irregularly Sampled Or Missing Valued Time-Series Data Using LSTM Networks

Oguzhan Karaahmetoglu

Fatih Ilhan

Ismail Balaban

Suleyman S. Kozat

AI4TS

204

25 May 2020

Towards Embodied Scene Description

Sinan Tan

178

30 Apr 2020

Learning Selective Sensor Fusion for States Estimation

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2019

263

30 Dec 2019

Characterizing the impact of using features extracted from pre-trained models on the quality of video captioning sequence-to-sequence models

International Conferences on Pattern Recognition and Artificial Intelligence (ICCPRAI), 2019

Menatallh Hammad

May Hammad

Mohamed Elshenawy

127

22 Nov 2019

Video Captioning with Text-based Dynamic Attention and Step-by-Step Learning

Pattern Recognition Letters (PR), 2019

Huanhou Xiao

Jinglun Shi

172

05 Nov 2019

Diverse Video Captioning Through Latent Variable Expansion

Pattern Recognition Letters (PR), 2019

Huanhou Xiao

Jinglun Shi

DiffM

415

26 Oct 2019

Multimodal Unified Attention Networks for Vision-and-Language Interactions

300

12 Aug 2019

Adaptive Exploration for Unsupervised Person Re-Identification

Hehe Fan

267

134

09 Jul 2019

Object-aware Aggregation with Bidirectional Temporal Graph for Video Captioning

Computer Vision and Pattern Recognition (CVPR), 2019

Junchao Zhang

Yuxin Peng

277

190

11 Jun 2019

Hierarchical LSTMs with Adaptive Attention for Visual Captioning

Jingkuan Song

Xiangpeng Li

Lianli Gao

Heng Tao Shen

200

234

26 Dec 2018

The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers

Lei Wang

261

143

22 Aug 2018

Video Captioning with Boundary-aware Hierarchical Language Decoding and Joint Video Prediction

Xiangxi Shi

Jianfei Cai

Jiuxiang Gu

Shafiq Joty

190

08 Jul 2018

COCO-CN for Cross-Lingual Image Tagging, Captioning and Retrieval

462

183

22 May 2018

Less Is More: Picking Informative Frames for Video Captioning

265

207

05 Mar 2018

Self-Supervised Video Hashing with Hierarchical Binary Auto-encoder

Jingkuan Song

Lianli Gao

218

266

07 Feb 2018