Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models

10 November 2014

Papers citing "Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models"


Duoduo CLIP: Efficient 3D Understanding with Multi-View Images Han-Hung Lee Yiming Zhang Angel X. Chang 3DPC 96 4 0 17 Jun 2024
Learning from Mistakes: Iterative Prompt Relabeling for Text-to-Image Diffusion Model Training Xinyan Chen Jiaxin Ge Tianjun Zhang Jiaming Liu Shanghang Zhang VLM EGVM 81 0 0 23 Dec 2023
Multimodal Deep Learning Cem Akkus Jiquan Ngiam Vladana Djakovic Steffen Jauch-Walser A. Khosla ... Jann Goschenhofer Honglak Lee A. Ng Daniel Schalk Matthias Aßenmacher 64 3,161 0 12 Jan 2023
The VISIONE Video Search System: Exploiting Off-the-Shelf Text Search Engines for Large-Scale Video Retrieval Giuseppe Amato Paolo Bolettieri F. Carrara Franca Debole Fabrizio Falchi Claudio Gennaro Lucia Vadicamo Claudio Vairo 35 17 0 06 Aug 2020
A Survey of Multi-View Representation Learning Yingming Li Ming Yang Zhongfei Zhang AI4TS 3DV 124 511 0 03 Oct 2016
Learning Language-Visual Embedding for Movie Understanding with Natural-Language Atousa Torabi Niket Tandon Leonid Sigal 45 97 0 26 Sep 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge Oriol Vinyals Alexander Toshev Samy Bengio D. Erhan 67 853 0 21 Sep 2016
Explain Images with Multimodal Recurrent Neural Networks Junhua Mao Wei Xu Yi Yang Jiang Wang Alan Yuille VLM GAN 58 383 0 04 Oct 2014
Sequence to Sequence Learning with Neural Networks Ilya Sutskever Oriol Vinyals Quoc V. Le AIMat 287 20,491 0 10 Sep 2014
Recurrent Neural Network Regularization Wojciech Zaremba Ilya Sutskever Oriol Vinyals ODL 104 2,768 0 08 Sep 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition Karen Simonyan Andrew Zisserman FAtt MDE 954 99,991 0 04 Sep 2014
Neural Machine Translation by Jointly Learning to Align and Translate Dzmitry Bahdanau Kyunghyun Cho Yoshua Bengio AIMat 390 27,205 0 01 Sep 2014
Deep Fragment Embeddings for Bidirectional Image Sentence Mapping A. Karpathy Armand Joulin Li Fei-Fei VLM 64 935 0 22 Jun 2014
A Multiplicative Model for Learning Distributed Text-Based Attribute Representations Ryan Kiros R. Zemel Ruslan Salakhutdinov 48 64 0 10 Jun 2014
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation Kyunghyun Cho B. V. Merrienboer Çağlar Gülçehre Dzmitry Bahdanau Fethi Bougares Holger Schwenk Yoshua Bengio AIMat 647 23,235 0 03 Jun 2014
Microsoft COCO: Common Objects in Context Tsung-Yi Lin Michael Maire Serge J. Belongie Lubomir Bourdev Ross B. Girshick James Hays Pietro Perona Deva Ramanan C. L. Zitnick Piotr Dollár ObjD 272 43,290 0 01 May 2014
A Deep Architecture for Semantic Parsing Edward Grefenstette Phil Blunsom Nando de Freitas Karl Moritz Hermann SSeg 35 56 0 29 Apr 2014
Multilingual Models for Compositional Distributed Semantics Karl Moritz Hermann Phil Blunsom 67 316 0 17 Apr 2014
Multilingual Distributed Representations without Word Alignment Karl Moritz Hermann Phil Blunsom 78 156 0 20 Dec 2013
Rich feature hierarchies for accurate object detection and semantic segmentation Ross B. Girshick Jeff Donahue Trevor Darrell Jitendra Malik ObjD 224 26,122 0 11 Nov 2013
Generating Sequences With Recurrent Neural Networks Alex Graves GAN 111 4,025 0 04 Aug 2013
Efficient Estimation of Word Representations in Vector Space Tomas Mikolov Kai Chen G. Corrado J. Dean 3DV 552 31,406 0 16 Jan 2013