Word2VisualVec: Image and Video to Sentence Matching by Visual Feature Prediction

23 April 2016

Papers citing "Word2VisualVec: Image and Video to Sentence Matching by Visual Feature Prediction"

Title
A Hierarchical Multi-Modal Encoder for Moment Localization in Video Corpus; Bowen Zhang, Hexiang Hu, Joonseok Lee, Mingde Zhao, Sheide Chammas, Vihan Jain, Eugene Ie, Fei Sha; 25 30 0; 18 Nov 2020
ParNet: Position-aware Aggregated Relation Network for Image-Text matching; Yaxian Xia, Lun Huang, Wenmin Wang, Xiao-Yong Wei, Jie Chen; 17 1 0; 17 Jun 2019
Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain; Johanes Effendi, Andros Tjandra, S. Sakti, Satoshi Nakamura; 14 3 0; 03 Jun 2019
Image2song: Song Retrieval via Bridging Image Content and Lyric Words; Xuelong Li, Di Hu, Xiaoqiang Lu; 11 10 0; 19 Aug 2017
A Survey of Multi-View Representation Learning; Yingming Li, Ming Yang, Zhongfei Zhang; AI4TS 3DV; 22 509 0; 03 Oct 2016
Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions; F. Carrara, Andrea Esuli, T. Fagni, Fabrizio Falchi, Alejandro Moreo; DiffM; 16 31 0; 23 Jun 2016
Improving Image Captioning by Concept-based Sentence Reranking; Xirong Li, Qin Jin; 12 5 0; 03 May 2016