Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models

19 May 2015
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Anjali Narayan-Chen
Svetlana Lazebnik

Papers citing "Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models"

25 / 1,325 papers shown
Utilizing Large Scale Vision and Text Datasets for Image Segmentation from Referring Expressions
Ronghang Hu
Marcus Rohrbach
Subhashini Venugopalan
Trevor Darrell
VLM
147
18
0
30 Aug 2016
Solving Visual Madlibs with Multiple Cues
Tatiana Tommasi
Arun Mallya
Bryan A. Plummer
Svetlana Lazebnik
Alexander C. Berg
Tamara L. Berg
213
18
0
11 Aug 2016
Modeling Context Between Objects for Referring Expression Understanding
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
305
231
0
01 Aug 2016
Top-down Neural Attention by Excitation Backprop
Jianming Zhang
Zhe Lin
Jonathan Brandt
Xiaohui Shen
Stan Sclaroff
343
994
0
01 Aug 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
448
2,175
0
29 Jul 2016
CUNI System for WMT16 Automatic Post-Editing and Multimodal Translation Tasks
Conference on Machine Translation (WMT), 2016
Jindrich Libovický
Jindřich Helcl
Marek Tlustý
Pavel Pecina
Ondrej Bojar
151
68
0
23 Jun 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
622
1,545
0
06 Jun 2016
Attention Correctness in Neural Image Captioning
AAAI Conference on Artificial Intelligence (AAAI), 2016
Chenxi Liu
Junhua Mao
Fei Sha
Alan Yuille
3DV
228
225
0
31 May 2016
Stereotyping and Bias in the Flickr30K Dataset
Emiel van Miltenburg
163
95
0
19 May 2016
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering
Mateusz Malinowski
Marcus Rohrbach
Mario Fritz
247
104
0
09 May 2016
Attributes as Semantic Units between Natural Language and Visual Recognition
Marcus Rohrbach
VLM
126
4
0
12 Apr 2016
Automatic Annotation of Structured Facts in Images
Mohamed Elhoseiny
Scott D. Cohen
W. Chang
Brian L. Price
Ahmed Elgammal
174
9
0
02 Apr 2016
Segmentation from Natural Language Expressions
Ronghang Hu
Marcus Rohrbach
Trevor Darrell
VLM
EgoV
267
509
0
20 Mar 2016
RNN Fisher Vectors for Action Recognition and Image Annotation
Guy Lev
Gil Sadeh
Benjamin Klein
Lior Wolf
148
169
0
12 Dec 2015
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
379
1,218
0
24 Nov 2015
Order-Embeddings of Images and Language
Ivan Vendrov
Ryan Kiros
Sanja Fidler
R. Urtasun
414
576
0
19 Nov 2015
Learning Deep Structure-Preserving Image-Text Embeddings
Liwei Wang
Yin Li
Svetlana Lazebnik
483
822
0
19 Nov 2015
Sherlock: Scalable Fact Learning in Images
Mohamed Elhoseiny
Scott D. Cohen
W. Chang
Brian L. Price
Ahmed Elgammal
211
26
0
16 Nov 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
338
570
0
13 Nov 2015
Grounding of Textual Phrases in Images by Reconstruction
Anna Rohrbach
Marcus Rohrbach
Ronghang Hu
Trevor Darrell
Bernt Schiele
387
511
0
12 Nov 2015
Visual7W: Grounded Question Answering in Images
Yuke Zhu
Oliver Groth
Michael S. Bernstein
Li Fei-Fei
531
965
0
11 Nov 2015
Neural Module Networks
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Dan Klein
CoGe
668
1,139
0
09 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
725
1,562
0
07 Nov 2015
Multimodal Convolutional Neural Networks for Matching Image and Sentence
Lin Ma
Zhengdong Lu
Lifeng Shang
Hang Li
339
348
0
23 Apr 2015
Show and Tell: A Neural Image Caption Generator
Computer Vision and Pattern Recognition (CVPR), 2014
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
719
6,395
0
17 Nov 2014