1412.4729
Cited By
v1
v2
v3 (latest)
Translating Videos to Natural Language Using Deep Recurrent Neural Networks
North American Chapter of the Association for Computational Linguistics (NAACL), 2015
15 December 2014
Subhashini Venugopalan
Huijuan Xu
Jeff Donahue
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
Papers citing
"Translating Videos to Natural Language Using Deep Recurrent Neural Networks"
50 / 334 papers shown
Title
Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence
Kun Ouyang
Yuanxin Liu
Linli Yao
Yishuo Cai
Hao Zhou
Jie Zhou
Fandong Meng
Xu Sun
OffRL
LRM
ReLM
239
1
0
23 Oct 2025
TC-MGC: Text-Conditioned Multi-Grained Contrastive Learning for Text-Video Retrieval
Information Fusion (Inf. Fusion), 2025
Xiaolun Jing
Genke Yang
Jian Chu
202
2
0
07 Apr 2025
Capturing Rich Behavior Representations: A Dynamic Action Semantic-Aware Graph Transformer for Video Captioning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Caihua Liu
Xu Li
Wenjing Xue
Wei Tang
Xia Feng
159
0
0
20 Feb 2025
Multi-Modal interpretable automatic video captioning
Antoine Hanna-Asaad
Decky Aspandi
Titus Zaharia
199
1
0
11 Nov 2024
A Survey on Integrated Sensing, Communication, and Computation
IEEE Communications Surveys and Tutorials (COMST), 2024
Dingzhu Wen
Yong Zhou
Xiaoyang Li
Yuanming Shi
Kaibin Huang
Khaled B. Letaief
188
101
0
15 Aug 2024
VCEval: Rethinking What is a Good Educational Video and How to Automatically Evaluate It
Xiaoxuan Zhu
Zhouhong Gu
Sihang Jiang
Zhixu Li
Hongwei Feng
Yanghua Xiao
196
0
0
15 Jun 2024
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
...
Ping Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang
295
60
0
26 May 2024
An Empirical Study of Excitation and Aggregation Design Adaptions in CLIP4Clip for Video-Text Retrieval
Xiaolun Jing
Genke Yang
Jian Chu
CLIP
179
2
0
25 May 2024
MICap: A Unified Model for Identity-aware Movie Descriptions
Computer Vision and Pattern Recognition (CVPR), 2024
Haran Raajesh
Naveen Reddy Desanur
Zeeshan Khan
Makarand Tapaswi
208
7
0
19 May 2024
Sentiment-oriented Transformer-based Variational Autoencoder Network for Live Video Commenting
Fengyi Fu
Shancheng Fang
Weidong Chen
Zhendong Mao
ViT
VGen
131
5
0
19 Apr 2024
Enhancing Traffic Safety with Parallel Dense Video Captioning for End-to-End Event Analysis
Maged Shoman
Dongdong Wang
Armstrong Aboah
Mohamed Abdel-Aty
146
18
0
12 Apr 2024
Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval
Minkuk Kim
Hyeon Bae Kim
Jinyoung Moon
Jinwoo Choi
Seong Tae Kim
119
38
0
11 Apr 2024
VideoDistill: Language-aware Vision Distillation for Video Question Answering
Bo Zou
Chao Yang
Yu Qiao
Chengbin Quan
Youjian Zhao
VGen
179
3
0
01 Apr 2024
Cross-Modal Reasoning with Event Correlation for Video Question Answering
Chengxiang Yin
Zhengping Che
Kun Wu
Zhiyuan Xu
Qinru Qiu
Jian Tang
166
0
0
20 Dec 2023
Attention Based Encoder Decoder Model for Video Captioning in Nepali (2023)
Kabita Parajuli
S. R. Joshi
211
0
0
12 Dec 2023
Multi Sentence Description of Complex Manipulation Action Videos
Machine Vision and Applications (MVA), 2023
Fatemeh Ziaeetabar
Reza Safabakhsh
S. Momtazi
M. Tamosiunaite
Florentin Wörgötter
172
7
0
13 Nov 2023
CLearViD: Curriculum Learning for Video Description
Cheng-Yu Chuang
Pooyan Fazli
135
1
0
08 Nov 2023
Balance Act: Mitigating Hubness in Cross-Modal Retrieval with Query and Gallery Banks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yimu Wang
Xiangru Jian
Bo Xue
166
21
0
17 Oct 2023
CLEVRER-Humans: Describing Physical and Causal Events the Human Way
Neural Information Processing Systems (NeurIPS), 2023
Jiayuan Mao
Xuelin Yang
Xikun Zhang
Noah D. Goodman
Jiajun Wu
NAI
249
22
0
05 Oct 2023
A Survey on Image-text Multimodal Models
Ruifeng Guo
Jingxuan Wei
Linzhuang Sun
Khai-Nguyen Nguyen
Guiyong Chang
Dawei Liu
Sibo Zhang
Zhengbing Yao
Mingjun Xu
Liping Bu
VLM
252
20
0
23 Sep 2023
Collaborative Three-Stream Transformers for Video Captioning
Computer Vision and Image Understanding (CVIU), 2023
Hao Wang
Libo Zhang
Hengrui Fan
Tiejian Luo
127
8
0
18 Sep 2023
Video Captioning with Aggregated Features Based on Dual Graphs and Gated Fusion
Yutao Jin
Yinan Han
Jing Wang
121
1
0
13 Aug 2023
VideoOFA: Two-Stage Pre-Training for Video-to-Text Generation
Xilun Chen
L. Yu
Wenhan Xiong
Barlas Oğuz
Yashar Mehdad
Anuj Kumar
VGen
127
4
0
04 May 2023
Visual Transformation Telling
Wanqing Cui
Mustafa Nasir-Moin
Yanyan Lan
Viola J. Chen
Jiafeng Guo
Xueqi Cheng
LRM
210
4
0
03 May 2023
A Review of Deep Learning for Video Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Moloud Abdar
Meenakshi Kollati
Swaraja Kuraparthi
Farhad Pourpanah
Daniel J. McDuff
...
Shuicheng Yan
Abduallah A. Mohamed
Abbas Khosravi
Xiaoshi Zhong
Fatih Porikli
3DV
189
35
0
22 Apr 2023
Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation
IEEE International Conference on Computer Vision (ICCV), 2023
Yaowei Li
Bang-ju Yang
Xuxin Cheng
Zhihong Zhu
Hongxiang Li
Yuexian Zou
308
41
0
28 Mar 2023
Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Peng Jin
Jinfa Huang
Pengfei Xiong
Shangxuan Tian
Chang-rui Liu
Xiang Ji
Li-ming Yuan
Jie Chen
198
76
0
25 Mar 2023
ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Bang-ju Yang
Fenglin Liu
Yuexian Zou
Xian Wu
Yaowei Wang
David Clifton
153
12
0
11 Mar 2023
Collaboration with Conversational AI Assistants for UX Evaluation: Questions and How to Ask them (Voice vs. Text)
International Conference on Human Factors in Computing Systems (CHI), 2023
Emily Kuang
Ehsan Jahangirzadeh Soure
Mingming Fan
J. Zhao
Kristen Shinohara
ELM
127
34
0
07 Mar 2023
TAP: The Attention Patch for Cross-Modal Knowledge Transfer from Unlabeled Modality
Yinsong Wang
Shahin Shahrampour
159
0
0
04 Feb 2023
ADAPT: Action-aware Driving Caption Transformer
IEEE International Conference on Robotics and Automation (ICRA), 2023
Bu Jin
Xinyi Liu
Yupeng Zheng
Pengfei Li
Hao Zhao
Tong Zhang
Yuhang Zheng
Guyue Zhou
Jingjing Liu
323
91
0
01 Feb 2023
Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yizhen Chen
Jie Wang
Lijian Lin
Chen Ma
Jin Ma
Ying Shan
VLM
205
33
0
30 Jan 2023
METEOR Guided Divergence for Video Captioning
IEEE International Joint Conference on Neural Network (IJCNN), 2022
D. Rothenpieler
Shahin Amiriparian
128
3
0
20 Dec 2022
MAViC: Multimodal Active Learning for Video Captioning
Gyanendra Das
Xavier Thomas
Anant Raj
Vikram Gupta
123
3
0
11 Dec 2022
Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning
AAAI Conference on Artificial Intelligence (AAAI), 2022
Zhuo Zhou
Zipeng Li
Shuqin Chen
Kui Jiang
Chen Chen
Mang Ye
DiffM
VGen
188
57
0
28 Nov 2022
Aligning Source Visual and Target Language Domains for Unpaired Video Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Fenglin Liu
Xian Wu
Chenyu You
Shen Ge
Yuexian Zou
Xu Sun
181
26
0
22 Nov 2022
Visual Commonsense-aware Representation Network for Video Captioning
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Pengpeng Zeng
Haonan Zhang
Lianli Gao
Xiangpeng Li
Jin Qian
Hengtao Shen
133
20
0
17 Nov 2022
Contrastive Video-Language Learning with Fine-grained Frame Sampling
Zixu Wang
Yujie Zhong
Yishu Miao
Lin Ma
Lucia Specia
183
15
0
10 Oct 2022
Thinking Hallucination for Video Captioning
Asian Conference on Computer Vision (ACCV), 2022
Nasib Ullah
Partha Pratim Mohanta
VLM
143
9
0
28 Sep 2022
Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval
Neural Information Processing Systems (NeurIPS), 2022
Che-Hsien Lin
Ancong Wu
Junwei Liang
Jun Zhang
Wenhang Ge
Wei Zheng
Chunhua Shen
201
37
0
27 Sep 2022
LGDN: Language-Guided Denoising Network for Video-Language Modeling
Neural Information Processing Systems (NeurIPS), 2022
Haoyu Lu
Mingyu Ding
Nanyi Fei
Yuqi Huo
Zhiwu Lu
VLM
223
18
0
23 Sep 2022
Large-Scale Traffic Congestion Prediction based on Multimodal Fusion and Representation Mapping
International Conference on Data Science and Advanced Analytics (DSAA), 2022
Bo Zhou
Jiahui Liu
Songyi Cui
Yaping Zhao
165
5
0
23 Aug 2022
Diverse Video Captioning by Adaptive Spatio-temporal Attention
German Conference on Pattern Recognition (GCPR), 2022
Zohreh Ghaderi
Leonard Salewski
Hendrik P. A. Lensch
104
11
0
19 Aug 2022
Sports Video Analysis on Large-Scale Data
European Conference on Computer Vision (ECCV), 2022
Dekun Wu
Henghui Zhao
Xingce Bao
Richard P. Wildes
106
22
0
09 Aug 2022
LocVTP: Video-Text Pre-training for Temporal Localization
European Conference on Computer Vision (ECCV), 2022
Meng Cao
Tianyu Yang
Junwu Weng
Can Zhang
Jue Wang
Yuexian Zou
169
69
0
21 Jul 2022
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval
ACM Multimedia (ACM MM), 2022
Yiwei Ma
Guohai Xu
Xiaoshuai Sun
Ming Yan
Ji Zhang
Rongrong Ji
CLIP
VLM
231
385
0
15 Jul 2022
LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling
Computer Vision and Pattern Recognition (CVPR), 2022
Linjie Li
Zhe Gan
Kevin Qinghong Lin
Chung-Ching Lin
Zicheng Liu
Ce Liu
Lijuan Wang
MLLM
VLM
159
91
0
14 Jun 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
514
690
0
27 May 2022
Visual Abductive Reasoning
Computer Vision and Pattern Recognition (CVPR), 2022
Chen Liang
Wenguan Wang
Tianfei Zhou
Yi Yang
LRM
149
46
0
26 Mar 2022
Revitalize Region Feature for Democratizing Video-Language Pre-training of Retrieval
Guanyu Cai
Yixiao Ge
Binjie Zhang
Alex Jinpeng Wang
Rui Yan
...
Ying Shan
Lianghua He
Xiaohu Qie
Jianping Wu
Mike Zheng Shou
VLM
134
6
0
15 Mar 2022