1511.02283
Cited By
v1
v2
v3 (latest)
Generation and Comprehension of Unambiguous Object Descriptions
7 November 2015
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
Github (164★)
Papers citing
"Generation and Comprehension of Unambiguous Object Descriptions"
50 / 917 papers shown
Title
Towards Abstract Relational Learning in Human Robot Interaction
Mohamadreza Faridghasemnia
Daniele Nardi
A. Saffiotti
96
2
0
20 Nov 2020
Where Are You? Localization from Embodied Dialog
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Meera Hahn
Jacob Krantz
Dhruv Batra
Devi Parikh
James M. Rehg
Stefan Lee
Peter Anderson
LM&Ro
178
33
0
16 Nov 2020
ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments
Findings of EMNLP, 2020
Hyounghun Kim
Abhaysinh Zala
Graham Burri
Hao Tan
Joey Tianyi Zhou
LM&Ro
177
17
0
15 Nov 2020
Lessons from Computational Modelling of Reference Production in Mandarin and English
International Conference on Natural Language Generation (INLG), 2020
Guanyi Chen
Kees van Deemter
147
5
0
14 Nov 2020
Human-centric Spatio-Temporal Video Grounding With Visual Transformers
Zongheng Tang
Yue Liao
Si Liu
Guanbin Li
Xiaojie Jin
Hongxu Jiang
Qian Yu
Dong Xu
176
125
0
10 Nov 2020
Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts
Ece Takmaz
Mario Giulianelli
Sandro Pezzelle
Arabella J. Sinclair
Raquel Fernández
124
30
0
09 Nov 2020
Utilizing Every Image Object for Semi-supervised Phrase Grounding
Haidong Zhu
Arka Sadhu
Zhao-Heng Zheng
Ram Nevatia
ObjD
140
8
0
05 Nov 2020
Actor and Action Modular Network for Text-based Video Segmentation
IEEE Transactions on Image Processing (TIP), 2020
Jianhua Yang
Yan Huang
K. Niu
Linjiang Huang
Zhanyu Ma
Liang Wang
192
13
0
02 Nov 2020
Explaining Deep Neural Networks
Oana-Maria Camburu
XAI
FAtt
296
30
0
04 Oct 2020
Linguistic Structure Guided Context Modeling for Referring Image Segmentation
European Conference on Computer Vision (ECCV), 2020
Tianrui Hui
Si Liu
Shaofei Huang
Guanbin Li
Sansi Yu
Faxi Zhang
Jizhong Han
321
176
0
01 Oct 2020
Referring Image Segmentation via Cross-Modal Progressive Comprehension
Computer Vision and Pattern Recognition (CVPR), 2020
Shaofei Huang
Tianrui Hui
Si Liu
Guanbin Li
Yunchao Wei
Jizhong Han
Luoqi Liu
Yue Liu
EgoV
227
208
0
01 Oct 2020
RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation
Míriam Bellver
Carles Ventura
Carina Silberer
Ioannis V. Kazakos
Jordi Torres
Xavier Giró-i-Nieto
VOS
161
37
0
01 Oct 2020
Commands 4 Autonomous Vehicles (C4AV) Workshop Summary
Thierry Deruyttere
Simon Vandenhende
Dusan Grujicic
Yu Liu
Luc Van Gool
Matthew Blaschko
Tinne Tuytelaars
Marie-Francine Moens
199
6
0
18 Sep 2020
Ground-truth or DAER: Selective Re-query of Secondary Information
IEEE International Conference on Computer Vision (ICCV), 2020
Stephan J. Lemmer
Jason J. Corso
187
4
0
16 Sep 2020
Towards Unique and Informative Captioning of Images
European Conference on Computer Vision (ECCV), 2020
Zeyu Wang
Berthy Feng
Karthik Narasimhan
Olga Russakovsky
141
38
0
08 Sep 2020
Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2020
Long Chen
Wenbo Ma
Jun Xiao
Hanwang Zhang
Shih-Fu Chang
ObjD
233
111
0
03 Sep 2020
Generating Adjacency Matrix for Video Relocalization
Yuanen Zhou
Mingfei Wang
Ruolin Wang
Shuwei Huo
118
0
0
19 Aug 2020
Fine-grained Iterative Attention Network for Temporal Language Localization in Videos
Xiaoye Qu
Peng Tang
Zhikang Zhou
Yu Cheng
Jianfeng Dong
Pan Zhou
231
101
0
06 Aug 2020
Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization
Daizong Liu
Xiaoye Qu
Xiao-Yang Liu
Jianfeng Dong
Pan Zhou
Zichuan Xu
213
144
0
04 Aug 2020
PhraseCut: Language-based Image Segmentation in the Wild
Chenyun Wu
Zhe Lin
Scott D. Cohen
Trung Bui
Subhransu Maji
VLM
198
133
0
03 Aug 2020
Describing Textures using Natural Language
Chenyun Wu
Mikayla Timm
Subhransu Maji
3DV
140
14
0
03 Aug 2020
Learning to Read and Follow Music in Complete Score Sheet Images
Florian Henkel
Rainer Kelz
Gerhard Widmer
99
12
0
21 Jul 2020
Fine-Grained Image Captioning with Global-Local Discriminative Objective
Jie Wu
Tianshui Chen
Hefeng Wu
Zhi Yang
Guangchun Luo
Liang Lin
158
72
0
21 Jul 2020
Graph Neural Network for Video Relocalization
Yuanen Zhou
Mingfei Wang
Ruolin Wang
Shuwei Huo
141
0
0
20 Jul 2020
Referring Expression Comprehension: A Survey of Methods and Datasets
IEEE transactions on multimedia (TMM), 2020
Yanyuan Qiao
Chaorui Deng
Qi Wu
ObjD
282
116
0
19 Jul 2020
Visual Relation Grounding in Videos
European Conference on Computer Vision (ECCV), 2020
Junbin Xiao
Xindi Shang
Xun Yang
Sheng Tang
Tat-Seng Chua
226
45
0
17 Jul 2020
Program Synthesis with Pragmatic Communication
Neural Information Processing Systems (NeurIPS), 2020
Yewen Pu
Kevin Ellis
Marta Kryven
J. Tenenbaum
Armando Solar-Lezama
214
24
0
09 Jul 2020
Explanation-based Weakly-supervised Learning of Visual Relations with Graph Networks
Federico Baldassarre
Kevin Smith
Josephine Sullivan
Hossein Azizpour
215
25
0
16 Jun 2020
MAGNet: Multi-Region Attention-Assisted Grounding of Natural Language Queries at Phrase Level
Amar Shrestha
Krittaphat Pugdeethosapol
Haowen Fang
Qinru Qiu
ObjD
123
2
0
06 Jun 2020
Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge
ACM Multimedia (ACM MM), 2020
Peng Wang
Dongyang Liu
Hui Li
Qi Wu
ObjD
196
22
0
02 Jun 2020
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
Jize Cao
Zhe Gan
Yu Cheng
Licheng Yu
Yen-Chun Chen
Jingjing Liu
VLM
248
137
0
15 May 2020
What is Learned in Visually Grounded Neural Syntax Acquisition
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Noriyuki Kojima
Hadar Averbuch-Elor
Alexander M. Rush
Yoav Artzi
186
22
0
04 May 2020
Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Arjun Reddy Akula
Spandana Gella
Yaser Al-Onaizan
Song-Chun Zhu
Siva Reddy
ObjD
143
55
0
04 May 2020
Pragmatic Issue-Sensitive Image Captioning
Findings of EMNLP, 2020
Allen Nie
Reuben Cohn-Gordon
Christopher Potts
143
24
0
29 Apr 2020
Deep Multimodal Neural Architecture Search
ACM Multimedia (ACM MM), 2020
Zhou Yu
Yuhao Cui
Jun-chen Yu
Meng Wang
Dacheng Tao
Qi Tian
153
107
0
25 Apr 2020
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
J. S. Park
Chandra Bhagavatula
Roozbeh Mottaghi
Ali Farhadi
Yejin Choi
ReLM
LRM
151
6
0
22 Apr 2020
Graph-Structured Referring Expression Reasoning in The Wild
Computer Vision and Pattern Recognition (CVPR), 2020
Sibei Yang
Guanbin Li
Yizhou Yu
NAI
157
109
0
19 Apr 2020
Relation Transformer Network
Rajat Koner
Poulami Sinhamahapatra
Volker Tresp
ViT
287
35
0
13 Apr 2020
Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness
Hyunwoo J. Kim
Byeongchang Kim
Gunhee Kim
176
0
0
13 Apr 2020
Context-Aware Group Captioning via Self-Attention and Contrastive Features
Computer Vision and Pattern Recognition (CVPR), 2020
Zhuowan Li
Quan Hung Tran
Long Mai
Zhe Lin
Alan Yuille
VLM
159
50
0
07 Apr 2020
Evaluating Multimodal Representations on Visual Semantic Textual Similarity
European Conference on Artificial Intelligence (ECAI), 2020
Oier López de Lacalle
Ander Salaberria
Aitor Soroa Etxabe
Gorka Azkune
Eneko Agirre
143
3
0
04 Apr 2020
More Grounded Image Captioning by Distilling Image-Text Matching Model
Computer Vision and Pattern Recognition (CVPR), 2020
Yuanen Zhou
Meng Wang
Daqing Liu
Zhenzhen Hu
Hanwang Zhang
213
138
0
01 Apr 2020
Modulating Bottom-Up and Top-Down Visual Processing via Language-Conditional Filters
İlker Kesen
Ozan Arkan Can
Erkut Erdem
Aykut Erdem
Deniz Yuret
VLM
145
2
0
28 Mar 2020
Grounded Situation Recognition
European Conference on Computer Vision (ECCV), 2020
Sarah M Pratt
Mark Yatskar
Luca Weihs
Ali Farhadi
Aniruddha Kembhavi
156
132
0
26 Mar 2020
Video Object Grounding using Semantic Roles in Language Description
Computer Vision and Pattern Recognition (CVPR), 2020
Arka Sadhu
Kan Chen
Ram Nevatia
177
50
0
24 Mar 2020
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Computer Vision and Pattern Recognition (CVPR), 2020
Gen Luo
Weihao Ye
Xiaoshuai Sun
Liujuan Cao
Chenglin Wu
Cheng Deng
Rongrong Ji
ObjD
406
344
0
19 Mar 2020
Giving Commands to a Self-driving Car: A Multimodal Reasoner for Visual Grounding
Thierry Deruyttere
Guillem Collell
Marie-Francine Moens
LRM
214
8
0
19 Mar 2020
MUTATT: Visual-Textual Mutual Guidance for Referring Expression Comprehension
IEEE International Conference on Multimedia and Expo (ICME), 2020
Shuai Wang
Fan Lyu
Wei Feng
Song Wang
ObjD
123
5
0
18 Mar 2020
Deconfounded Image Captioning: A Causal Retrospect
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
Xu Yang
Hanwang Zhang
Jianfei Cai
CML
159
145
0
09 Mar 2020
Captioning Images with Novel Objects via Online Vocabulary Expansion
Mikihiro Tanaka
Tatsuya Harada
3DV
186
2
0
06 Mar 2020