Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1506.01698
Cited By
The Long-Short Story of Movie Description
4 June 2015
Anna Rohrbach
Marcus Rohrbach
Bernt Schiele
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"The Long-Short Story of Movie Description"
17 / 17 papers shown
Title
TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval
Leqi Shen
Tianxiang Hao
Tao He
Sicheng Zhao
Pengzhang Liu
Yongjun Bao
Guiguang Ding
Guiguang Ding
132
7
0
02 Sep 2024
Text-Video Retrieval with Global-Local Semantic Consistent Learning
Haonan Zhang
Pengpeng Zeng
Lianli Gao
Jingkuan Song
Yihang Duan
Xinyu Lyu
Hengtao Shen
VLM
CLIP
37
2
0
21 May 2024
STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training
Weihong Zhong
Mao Zheng
Duyu Tang
Xuan Luo
Heng Gong
Xiaocheng Feng
Bing Qin
27
8
0
20 Feb 2023
Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval
Yizhen Chen
Jie Wang
Lijian Lin
Zhongang Qi
Jin Ma
Ying Shan
VLM
21
18
0
30 Jan 2023
VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval
Siteng Huang
Biao Gong
Yulin Pan
Jianwen Jiang
Yiliang Lv
Yuyuan Li
Donglin Wang
VLM
VPVLM
22
41
0
23 Nov 2022
M2HF: Multi-level Multi-modal Hybrid Fusion for Text-Video Retrieval
Shuo Liu
Weize Quan
Mingyuan Zhou
Sihong Chen
Jian Kang
Zhenlan Zhao
Chen Chen
Dong-Ming Yan
20
0
0
16 Aug 2022
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval
Yiwei Ma
Guohai Xu
Xiaoshuai Sun
Ming Yan
Ji Zhang
Rongrong Ji
CLIP
VLM
17
268
0
15 Jul 2022
Optimizing Latency for Online Video CaptioningUsing Audio-Visual Transformers
Chiori Hori
Takaaki Hori
Jonathan Le Roux
17
4
0
04 Aug 2021
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP
Han Fang
Pengfei Xiong
Luhui Xu
Yu Chen
CLIP
VLM
13
291
0
21 Jun 2021
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
15
2,856
0
26 May 2017
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
48
1,214
0
02 May 2017
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
6
229
0
10 Oct 2016
Title Generation for User Generated Videos
Kuo-Hao Zeng
Tseng-Hung Chen
Juan Carlos Niebles
Min Sun
27
69
0
25 Aug 2016
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
32
353
0
12 May 2016
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data
Lisa Anne Hendricks
Subhashini Venugopalan
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
Trevor Darrell
CoGe
16
284
0
17 Nov 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
38
534
0
07 May 2015
Improving neural networks by preventing co-adaptation of feature detectors
Geoffrey E. Hinton
Nitish Srivastava
A. Krizhevsky
Ilya Sutskever
Ruslan Salakhutdinov
VLM
266
7,634
0
03 Jul 2012
1