Generation and Comprehension of Unambiguous Object Descriptions

7 November 2015
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
    ObjD

Papers citing "Generation and Comprehension of Unambiguous Object Descriptions"

50 / 917 papers shown
Scene Graph Parsing as Dependency Parsing
Yu-Siang Wang
Chenxi Liu
Fangyin Wei
Alan Yuille
GNN, 3DV
99
59
0
25 Mar 2018
Video Object Segmentation with Language Referring Expressions
Anna Khoreva
Anna Rohrbach
Bernt Schiele
VOS
220
238
0
21 Mar 2018
Actor and Action Video Segmentation from a Sentence
Kirill Gavrilyuk
Amir Ghodrati
Zhenyang Li
Cees G. M. Snoek
VLM
163
184
0
20 Mar 2018
Object Captioning and Retrieval with Natural Language
A. Nguyen
Thanh-Toan Do
Ian Reid
D. Caldwell
Nikos G. Tsagarakis
3DV
95
20
0
16 Mar 2018
Discriminability objective for training descriptive captions
Ruotian Luo
Brian L. Price
Scott D. Cohen
Gregory Shakhnarovich
269
208
0
12 Mar 2018
Answerer in Questioner's Mind: Information Theoretic Approach to Goal-Oriented Visual Dialog
Sang-Woo Lee
Y. Heo
Byoung-Tak Zhang
187
32
0
12 Feb 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
421
905
0
24 Jan 2018
Object Referring in Videos with Language and Human Gaze
A. Vasudevan
Dengxin Dai
Luc Van Gool
VOS
185
82
0
04 Jan 2018
Learning Interpretable Spatial Operations in a Rich 3D Blocks World
Yonatan Bisk
Kevin J. Shih
Yejin Choi
D. Marcu
112
63
0
10 Dec 2017
Grounding Referring Expressions in Images by Variational Context
Hanwang Zhang
Yulei Niu
Shih-Fu Chang
BDL, ObjD
192
236
0
05 Dec 2017
Discriminative Learning of Open-Vocabulary Object Retrieval and Localization by Negative Phrase Augmentation
Ryota Hinami
Shin'ichi Satoh
ObjD
103
23
0
27 Nov 2017
Self-view Grounding Given a Narrated 360° Video
Shih-Han Chou
Yi-Chun Chen
Kuo-Hao Zeng
Hou-Ning Hu
Jianlong Fu
Min Sun
68
4
0
23 Nov 2017
Conditional Image-Text Embedding Networks
Bryan A. Plummer
Paige Kordas
M. Kiapour
Shuai Zheng
Robinson Piramuthu
Svetlana Lazebnik
288
124
0
22 Nov 2017
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
Peter Anderson
Qi Wu
Damien Teney
Jake Bruce
Mark Johnson
Niko Sünderhauf
Ian Reid
Stephen Gould
Anton Van Den Hengel
LM&Ro
543
1,531
0
20 Nov 2017
Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries
Bohan Zhuang
Qi Wu
Chunhua Shen
Ian Reid
Anton Van Den Hengel
ObjD
151
142
0
17 Nov 2017
Unified Pragmatic Models for Generating and Following Instructions
Daniel Fried
Jacob Andreas
Dan Klein
LRM
227
125
0
14 Nov 2017
Object Referring in Visual Scene with Spoken Language
A. Vasudevan
Dengxin Dai
Luc Van Gool
173
19
0
10 Nov 2017
Semantic Image Retrieval via Active Grounding of Visual Situations
Max H. Quinn
E. Conser
Jordan M. Witte
Melanie Mitchell
116
10
0
31 Oct 2017
Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions
Jun Hatori
Yuta Kikuchi
Sosuke Kobayashi
K. Takahashi
Yuta Tsuboi
Y. Unno
W. Ko
Jethro Tan
174
168
0
17 Oct 2017
Contrastive Learning for Image Captioning
Bo Dai
Dahua Lin
SSL, VLM
146
202
0
06 Oct 2017
Learning Functional Causal Models with Generative Neural Networks
Hugo Jair Escalante
Sergio Escalera
Xavier Baro
Isabelle M Guyon
Umut Güçlü
Marcel van Gerven
CML, BDL
358
110
0
15 Sep 2017
Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning
Yang Xian
Yingli Tian
VLM
187
23
0
15 Sep 2017
Reasoning about Fine-grained Attribute Phrases using Reference Games
Jong-Chyi Su
Chenyun Wu
Huaizu Jiang
Subhransu Maji
177
16
0
29 Aug 2017
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
IEEE International Conference on Computer Vision (ICCV), 2017
Chuang Gan
Yandong Li
Haoxiang Li
Chen Sun
Boqing Gong
231
135
0
15 Aug 2017
Learning to Disambiguate by Asking Discriminative Questions
IEEE International Conference on Computer Vision (ICCV), 2017
Yining Li
Chen Huang
Xiaoou Tang
Chen Change Loy
149
22
0
09 Aug 2017
Localizing Moments in Video with Natural Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
381
1,096
0
04 Aug 2017
Discover and Learn New Objects from Documentaries
Kai-xiang Chen
Hang Song
Chen Change Loy
Dahua Lin
ObjD
156
20
0
30 Jul 2017
Weakly-supervised learning of visual relations
Julia Peyre
Ivan Laptev
Cordelia Schmid
Josef Sivic
160
197
0
29 Jul 2017
cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey
Hirokatsu Kataoka
Soma Shirakabe
Yun He
S. Ueta
Teppei Suzuki
...
Ryousuke Takasawa
Masataka Fuchida
Yudai Miyashita
Kazushige Okayasu
Yuta Matsuzaki
209
1
0
20 Jul 2017
Grounding Spatio-Semantic Referring Expressions for Human-Robot Interaction
Mohit Shridhar
David Hsu
ObjD
149
21
0
18 Jul 2017
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
464
3,530
0
26 May 2017
TALL: Temporal Activity Localization via Language Query
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia
431
986
0
05 May 2017
Spatio-temporal Person Retrieval via Natural Language Queries
Masataka Yamaguchi
Kuniaki Saito
Yoshitaka Ushiku
Tatsuya Harada
217
62
0
26 Apr 2017
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Xicheng Zhang
Luyao Yuan
Yijie Guo
Zhiyuan He
I-An Huang
Honglak Lee
ObjD
144
59
0
12 Apr 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li Li
134
333
0
12 Apr 2017
Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering
V. Kazemi
Ali Elqursh
OOD
148
193
0
11 Apr 2017
Generating Descriptions with Grounded and Co-Referenced People
Anna Rohrbach
Marcus Rohrbach
Siyu Tang
Seong Joon Oh
Bernt Schiele
552
72
0
05 Apr 2017
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks
Tanmay Gupta
Kevin J. Shih
Saurabh Singh
Derek Hoiem
245
26
0
02 Apr 2017
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt
E. Krahmer
LM&MA, ELM
381
870
0
29 Mar 2017
Recurrent Multimodal Interaction for Referring Image Segmentation
Chenxi Liu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Alan Yuille
EgoV
205
268
0
23 Mar 2017
An End-to-End Approach to Natural Language Object Retrieval via Context-Aware Deep Reinforcement Learning
Fan Wu
Zhongwen Xu
Yi Yang
ObjD
112
11
0
22 Mar 2017
Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos
De-An Huang
Joseph J. Lim
Li Fei-Fei
Juan Carlos Niebles
170
55
0
07 Mar 2017
Comprehension-guided referring expressions
Computer Vision and Pattern Recognition (CVPR), 2017
Ruotian Luo
Gregory Shakhnarovich
ObjD
171
180
0
12 Jan 2017
Context-aware Captions from Context-agnostic Supervision
Computer Vision and Pattern Recognition (CVPR), 2017
Ramakrishna Vedantam
Samy Bengio
Kevin Patrick Murphy
Devi Parikh
Gal Chechik
221
153
0
11 Jan 2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions
Computer Vision and Pattern Recognition (CVPR), 2016
Licheng Yu
Hao Tan
Joey Tianyi Zhou
Tamara L. Berg
ObjD
184
288
0
30 Dec 2016
Top-down Visual Saliency Guided by Captions
Computer Vision and Pattern Recognition (CVPR), 2016
Vasili Ramanishka
Abir Das
Jianming Zhang
Kate Saenko
163
147
0
21 Dec 2016
ImageNet pre-trained models with batch normalization
Marcel Simon
E. Rodner
Joachim Denzler
VLM, SSeg
187
170
0
05 Dec 2016
Modeling Relationships in Referential Expressions with Compositional Modular Networks
Ronghang Hu
Marcus Rohrbach
Jacob Andreas
Trevor Darrell
Kate Saenko
181
420
0
30 Nov 2016
GuessWhat?! Visual object discovery through multi-modal dialogue
H. D. Vries
Florian Strub
A. Chandar
Olivier Pietquin
Hugo Larochelle
Aaron Courville
VLM
171
436
0
23 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li Li
VLM
210
177
0
21 Nov 2016