1511.02283
Cited By
v1
v2
v3 (latest)
Generation and Comprehension of Unambiguous Object Descriptions
7 November 2015
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
Github (164★)
Papers citing
"Generation and Comprehension of Unambiguous Object Descriptions"
50 / 917 papers shown
Title
Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension
Computer Vision and Pattern Recognition (CVPR), 2020
Zhenfang Chen
Peng Wang
Lin Ma
Kwan-Yee K. Wong
Qi Wu
ObjD
213
77
0
01 Mar 2020
Guessing State Tracking for Visual Dialogue
European Conference on Computer Vision (ECCV), 2020
Wei Pang
Xiaojie Wang
OOD
306
10
0
24 Feb 2020
Dual Convolutional LSTM Network for Referring Image Segmentation
IEEE Transactions on Multimedia (TMM), 2020
Linwei Ye
Zhi Liu
Yang Wang
168
53
0
30 Jan 2020
Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences
Computer Vision and Pattern Recognition (CVPR), 2020
Zhu Zhang
Zhou Zhao
Yang Zhao
Qi Wang
Huasheng Liu
Lianli Gao
228
144
0
19 Jan 2020
Adversarially Guided Self-Play for Adopting Social Conventions
Mycal Tucker
Yilun Zhou
J. Shah
145
16
0
16 Jan 2020
ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
European Conference on Computer Vision (ECCV), 2019
Dave Zhenyu Chen
Angel X. Chang
Matthias Nießner
3DPC
345
495
0
18 Dec 2019
A Real-time Global Inference Network for One-stage Referring Expression Comprehension
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2019
Weihao Ye
Rongrong Ji
Gen Luo
Xiaoshuai Sun
Jinsong Su
Xinghao Ding
Chia-Wen Lin
Q. Tian
ObjD
158
77
0
07 Dec 2019
Connecting Vision and Language with Localized Narratives
European Conference on Computer Vision (ECCV), 2019
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
414
285
0
06 Dec 2019
12-in-1: Multi-Task Vision and Language Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2019
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM
ObjD
283
499
0
05 Dec 2019
Tell-the-difference: Fine-grained Visual Descriptor via a Discriminating Referee
Shuangjie Xu
Feng Xu
Yu Cheng
Pan Zhou
71
2
0
14 Oct 2019
Referring Expression Object Segmentation with Caption-Aware Consistency
British Machine Vision Conference (BMVC), 2019
Yi-Wen Chen
Yi-Hsuan Tsai
Tiantian Wang
Yen-Yu Lin
Ming-Hsuan Yang
EgoV
148
95
0
10 Oct 2019
Talk2Car: Taking Control of Your Self-Driving Car
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Thierry Deruyttere
Simon Vandenhende
Dusan Grujicic
Luc Van Gool
Marie-Francine Moens
LM&Ro
165
163
0
24 Sep 2019
Dynamic Graph Attention for Referring Expression Comprehension
IEEE International Conference on Computer Vision (ICCV), 2019
Sibei Yang
Guanbin Li
Yizhou Yu
OCL
166
245
0
18 Sep 2019
Communication-based Evaluation for Natural Language Generation
Benjamin Newman
Reuben Cohn-Gordon
Christopher Potts
133
7
0
16 Sep 2019
Scene Graph Parsing by Attention Graph
Martin Andrews
Yew Ken Chia
Sam Witteveen
GNN
88
12
0
13 Sep 2019
Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction
AAAI Conference on Artificial Intelligence (AAAI), 2019
Jingwen Wang
Lin Ma
Wenhao Jiang
209
200
0
11 Sep 2019
Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding
ACM Multimedia (ACM MM), 2019
Xuejing Liu
Liang Li
Shuhui Wang
Zhengjun Zha
Li Su
Qingming Huang
160
66
0
05 Sep 2019
What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Xintong Yu
Hongming Zhang
Yangqiu Song
Yan Song
Changshui Zhang
90
33
0
01 Sep 2019
Aesthetic Image Captioning From Weakly-Labelled Photographs
Koustav Ghosal
A. Rana
A. Smolic
174
28
0
29 Aug 2019
Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding
IEEE International Conference on Computer Vision (ICCV), 2019
Xuejing Liu
Liang Li
Shuhui Wang
Zhengjun Zha
Dechao Meng
Qingming Huang
ObjD
138
90
0
28 Aug 2019
Phrase Localization Without Paired Training Examples
IEEE International Conference on Computer Vision (ICCV), 2019
Josiah Wang
Lucia Specia
114
49
0
20 Aug 2019
Zero-Shot Grounding of Objects from Natural Language Queries
IEEE International Conference on Computer Vision (ICCV), 2019
Arka Sadhu
Kan Chen
Ram Nevatia
ObjD
207
171
0
20 Aug 2019
A Fast and Accurate One-Stage Approach to Visual Grounding
IEEE International Conference on Computer Vision (ICCV), 2019
Zhengyuan Yang
Boqing Gong
Liwei Wang
Wenbing Huang
Dong Yu
Jiebo Luo
ObjD
192
423
0
18 Aug 2019
Multimodal Unified Attention Networks for Vision-and-Language Interactions
Zhou Yu
Yuhao Cui
Jun Yu
Dacheng Tao
Q. Tian
229
44
0
12 Aug 2019
Transferable Representation Learning in Vision-and-Language Navigation
IEEE International Conference on Computer Vision (ICCV), 2019
Haoshuo Huang
Vihan Jain
Harsh Mehta
Alexander Ku
Gabriel Ilharco
Jason Baldridge
Eugene Ie
LM&Ro
181
92
0
09 Aug 2019
Searching for Ambiguous Objects in Videos using Relational Referring Expressions
British Machine Vision Conference (BMVC), 2019
Hazan Anayurt
Sezai Artun Ozyegin
Ulfet Cetin
Utku Aktaş
Sinan Kalkan
250
9
0
03 Aug 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Journal of Artificial Intelligence Research (JAIR), 2019
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
376
141
0
22 Jul 2019
MeetUp! A Corpus of Joint Activity Dialogues in a Visual Environment
N. Ilinykh
Sina Zarrieß
David Schlangen
105
44
0
11 Jul 2019
Aesthetic Attributes Assessment of Images
ACM Multimedia (ACM MM), 2019
Xin Jin
Le Wu
Geng Zhao
Xiaodong Li
Xiaokun Zhang
Shiming Ge
Dongqing Zou
Bin Zhou
Xinghui Zhou
114
48
0
11 Jul 2019
Variational Context: Exploiting Visual and Textual Context for Grounding Referring Expressions
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Yulei Niu
Hanwang Zhang
Zhiwu Lu
Shih-Fu Chang
ObjD
BDL
164
31
0
08 Jul 2019
Video Question Generation via Cross-Modal Self-Attention Networks Learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Yu-Siang Wang
Hung-Ting Su
Chen-Hsi Chang
Zhe-Yu Liu
Winston H. Hsu
151
12
0
05 Jul 2019
Expressing Visual Relationships via Language
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Hao Tan
Franck Dernoncourt
Zhe Lin
Trung Bui
Joey Tianyi Zhou
208
75
0
18 Jun 2019
Unsupervised Video Interpolation Using Cycle Consistency
IEEE International Conference on Computer Vision (ICCV), 2019
F. Reda
Deqing Sun
Aysegül Dündar
Mohammad Shoeybi
Guilin Liu
Kevin J. Shih
Andrew Tao
Jan Kautz
Bryan Catanzaro
237
92
0
13 Jun 2019
Know What You Don't Know: Modeling a Pragmatic Speaker that Refers to Objects of Unknown Categories
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Sina Zarrieß
David Schlangen
96
18
0
13 Jun 2019
Relationship-Embedded Representation Learning for Grounding Referring Expressions
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Sibei Yang
Guanbin Li
Yizhou Yu
ObjD
190
67
0
11 Jun 2019
Joint Visual Grounding with Language Scene Graphs
Daqing Liu
Hanwang Zhang
Zhengjun Zha
Meng Wang
Qianru Sun
162
6
0
09 Jun 2019
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2019
Zhu Zhang
Zhijie Lin
Zhou Zhao
Zhenxin Xiao
187
226
0
06 Jun 2019
Learning to Compose and Reason with Language Tree Structures for Visual Grounding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Richang Hong
Daqing Liu
Xiaoyu Mo
Xiangnan He
Hanwang Zhang
ReLM
LRM
191
194
0
05 Jun 2019
Natural Vocabulary Emerges from Free-Form Annotations
Jordi Pont-Tuset
Michael Gygli
V. Ferrari
VLM
176
3
0
04 Jun 2019
The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
J. Haber
Tim Baumgärtner
Ece Takmaz
Lieke Gelderloos
Elia Bruni
Raquel Fernández
152
84
0
04 Jun 2019
Language-Conditioned Graph Networks for Relational Reasoning
IEEE International Conference on Computer Vision (ICCV), 2019
Ronghang Hu
Anna Rohrbach
Trevor Darrell
Kate Saenko
178
182
0
10 May 2019
ShapeGlot: Learning Language for Shape Differentiation
IEEE International Conference on Computer Vision (ICCV), 2019
Panos Achlioptas
Judy Fan
Robert D. Hawkins
Noah D. Goodman
Leonidas Guibas
222
86
0
08 May 2019
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
Yuankai Qi
Qi Wu
Peter Anderson
Xinze Wang
Wenjie Wang
Chunhua Shen
Anton Van Den Hengel
LM&Ro
298
416
0
23 Apr 2019
Tripping through time: Efficient Localization of Activities in Videos
Meera Hahn
Asim Kadav
James M. Rehg
H. Graf
559
90
0
22 Apr 2019
Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents
Jack Hessel
Lillian Lee
David M. Mimno
153
31
0
16 Apr 2019
Natural Language Semantics With Pictures: Some Language & Vision Datasets and Potential Uses for Computational Semantics
David Schlangen
110
6
0
15 Apr 2019
Learning to Generate Unambiguous Spatial Referring Expressions for Real-World Environments
Fethiye Irmak Dogan
Sinan Kalkan
Iolanda Leite
201
19
0
15 Apr 2019
Cross-Modal Self-Attention Network for Referring Image Segmentation
Linwei Ye
Mrigank Rochan
Zhi Liu
Yang Wang
EgoV
198
535
0
09 Apr 2019
Revisiting EmbodiedQA: A Simple Baseline and Beyond
Yuehua Wu
Lu Jiang
Yi Yang
LM&Ro
171
33
0
08 Apr 2019
Referring to Objects in Videos using Spatio-Temporal Identifying Descriptions
Peratham Wiriyathammabhum
Abhinav Shrivastava
Vlad I. Morariu
L. Davis
100
5
0
08 Apr 2019