1511.02283
Cited By
v1
v2
v3 (latest)
Generation and Comprehension of Unambiguous Object Descriptions
7 November 2015
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
Github (164★)
Papers citing
"Generation and Comprehension of Unambiguous Object Descriptions"
50 / 917 papers shown
Title
Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering
A. Piergiovanni
Wei Li
Weicheng Kuo
M. Saffar
Fred Bertsch
A. Angelova
235
18
0
02 May 2022
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning
Computer Vision and Pattern Recognition (CVPR), 2022
Li Yang
Yan Xu
Chunfen Yuan
Wei Liu
Bing Li
Weiming Hu
ObjD
253
152
0
30 Apr 2022
GRIT: General Robust Image Task Benchmark
Tanmay Gupta
Ryan Marten
Aniruddha Kembhavi
Derek Hoiem
VLM
OOD
ObjD
137
35
0
28 Apr 2022
Instance-Specific Feature Propagation for Referring Segmentation
IEEE Transactions on Multimedia (IEEE TMM), 2022
Chang Liu
Xudong Jiang
Henghui Ding
ISeg
146
66
0
26 Apr 2022
The 6th AI City Challenge
M. Naphade
Shuo Wang
D. Anastasiu
Zheng Tang
Ming-Ching Chang
...
Stan Sclaroff
Pranamesh Chakraborty
Alice Li
Shangru Li
Rama Chellappa
286
74
0
21 Apr 2022
Self-paced Multi-grained Cross-modal Interaction Modeling for Referring Expression Comprehension
IEEE Transactions on Image Processing (IEEE TIP), 2022
Peihan Miao
Wei Su
Gaoang Wang
Xuewei Li
Xi Li
ObjD
262
13
0
21 Apr 2022
Making the Most of Text Semantics to Improve Biomedical Vision–Language Processing
European Conference on Computer Vision (ECCV), 2022
Benedikt Boecking
Naoto Usuyama
Shruthi Bannur
Daniel Coelho De Castro
Anton Schwaighofer
...
Tristan Naumann
A. Nori
Javier Alvarez-Valle
Hoifung Poon
Ozan Oktay
446
341
0
21 Apr 2022
A Survivor in the Era of Large-Scale Pretraining: An Empirical Study of One-Stage Referring Expression Comprehension
IEEE Transactions on Multimedia (IEEE TMM), 2022
Gen Luo
Weihao Ye
Jiamu Sun
Xiaoshuai Sun
Rongrong Ji
ObjD
195
13
0
17 Apr 2022
It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection
Computer Vision and Pattern Recognition (CVPR), 2022
Youssef Mohamed
Faizan Farooq Khan
Kilichbek Haydarov
Mohamed Elhoseiny
115
43
0
15 Apr 2022
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Sanjay Subramanian
William Merrill
Trevor Darrell
Matt Gardner
Sameer Singh
Anna Rohrbach
ObjD
231
151
0
12 Apr 2022
X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks
European Conference on Computer Vision (ECCV), 2022
Zhaowei Cai
Gukyeong Kwon
Avinash Ravichandran
Erhan Bas
Zhuowen Tu
Rahul Bhotika
Stefano Soatto
ObjD
MLLM
VLM
140
51
0
12 Apr 2022
FindIt: Generalized Localization with Natural Language Queries
European Conference on Computer Vision (ECCV), 2022
Weicheng Kuo
Fred Bertsch
Wei Li
A. Piergiovanni
M. Saffar
A. Angelova
ObjD
190
18
0
31 Mar 2022
ReSTR: Convolution-free Referring Image Segmentation Using Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
N. Kim
Dongwon Kim
Cuiling Lan
Wenjun Zeng
Suha Kwak
291
176
0
31 Mar 2022
Image Retrieval from Contextual Descriptions
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Benno Krojer
Vaibhav Adlakha
Vibhav Vineet
Yash Goyal
Edoardo Ponti
Siva Reddy
212
37
0
29 Mar 2022
Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding
Computer Vision and Pattern Recognition (CVPR), 2022
Jiabo Ye
Junfeng Tian
Ming Yan
Xiaoshan Yang
Xuwu Wang
Ji Zhang
Liang He
Xin Lin
ObjD
199
91
0
29 Mar 2022
Text2Pos: Text-to-Point-Cloud Cross-Modal Localization
Computer Vision and Pattern Recognition (CVPR), 2022
Manuel Kolmet
Qunjie Zhou
Aljosa Osep
Laura Leal-Taixe
229
39
0
28 Mar 2022
Emergence of hierarchical reference systems in multi-agent communication
International Conference on Computational Linguistics (COLING), 2022
Xenia Ohmer
M. Duda
Elia Bruni
258
10
0
24 Mar 2022
Local-Global Context Aware Transformer for Language-Guided Video Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Chen Liang
Wenguan Wang
Tianfei Zhou
Jiaxu Miao
Yawei Luo
Yi Yang
VOS
284
97
0
18 Mar 2022
Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
Computer Vision and Pattern Recognition (CVPR), 2022
Haojun Jiang
Yuanze Lin
Dongchen Han
Shiji Song
Gao Huang
ObjD
264
64
0
16 Mar 2022
Non-neural Models Matter: A Re-evaluation of Neural Referring Expression Generation Systems
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
F. Same
Guanyi Chen
Kees van Deemter
136
12
0
15 Mar 2022
Grounding Commands for Autonomous Vehicles via Layer Fusion with Region-specific Dynamic Layer Attention
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
Hou Pong Chan
M. Guo
Chengguang Xu
195
7
0
14 Mar 2022
Differentiated Relevances Embedding for Group-based Referring Expression Comprehension
Fuhai Chen
Xuri Ge
Xiaoshuai Sun
Yue Gao
Jianzhuang Liu
Feiyue Huang
Rongrong Ji
163
0
0
12 Mar 2022
Suspected Object Matters: Rethinking Model's Prediction for One-stage Visual Grounding
ACM Multimedia (ACM MM), 2022
Yang Jiao
Zequn Jie
Yue Yu
Lin Ma
Yu-Gang Jiang
OOD
167
9
0
10 Mar 2022
Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding
IEEE Transactions on Multimedia (IEEE TMM), 2022
Daizong Liu
Xiang Fang
Wei Hu
Pan Zhou
186
45
0
06 Mar 2022
Vision-Language Intelligence: Tasks, Representation Learning, and Large Models
Feng Li
Hao Zhang
Yi-Fan Zhang
Shixuan Liu
Jian Guo
L. Ni
Pengchuan Zhang
Lei Zhang
AI4TS
VLM
189
40
0
03 Mar 2022
Phrase-Based Affordance Detection via Cyclic Bilateral Interaction
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2022
Liangsheng Lu
Wei Zhai
Hongcheng Luo
Yu Kang
Yang Cao
229
26
0
24 Feb 2022
CAISE: Conversational Agent for Image Search and Editing
AAAI Conference on Artificial Intelligence (AAAI), 2022
Hyounghun Kim
Doo Soon Kim
Seunghyun Yoon
Franck Dernoncourt
Trung Bui
Joey Tianyi Zhou
204
6
0
24 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
International Conference on Machine Learning (ICML), 2022
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
478
992
0
07 Feb 2022
Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching
Neurocomputing, 2022
Hengcan Shi
Munawar Hayat
Jianfei Cai
ObjD
173
12
0
18 Jan 2022
Language as Queries for Referring Video Object Segmentation
Computer Vision and Pattern Recognition (CVPR), 2022
Jiannan Wu
Yi Jiang
Pei Sun
Zehuan Yuan
Ping Luo
456
214
0
03 Jan 2022
Deconfounded Visual Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2021
Jianqiang Huang
Yu Qin
Jiaxin Qi
Qianru Sun
Hanwang Zhang
CML
ObjD
171
37
0
31 Dec 2021
Image Segmentation Using Text and Image Prompts
Computer Vision and Pattern Recognition (CVPR), 2021
Timo Lüddecke
Alexander S. Ecker
CLIP
VLM
661
631
0
18 Dec 2021
Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds
Ayush Jain
N. Gkanatsios
Ishita Mediratta
Katerina Fragkiadaki
ObjD
417
145
0
16 Dec 2021
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering
Junbin Xiao
Angela Yao
Zhiyuan Liu
Yicong Li
Wei Ji
Tat-Seng Chua
304
135
0
12 Dec 2021
From Coarse to Fine-grained Concept based Discrimination for Phrase Detection
Maan Qraitem
Bryan A. Plummer
ObjD
179
0
0
06 Dec 2021
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation
Computer Vision and Pattern Recognition (CVPR), 2021
Zhao Yang
Yuan Liu
Yansong Tang
Kai-xiang Chen
Hengshuang Zhao
Juil Sock
729
416
0
04 Dec 2021
D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
Dave Zhenyu Chen
Qirui Wu
Matthias Nießner
Angel X. Chang
168
49
0
02 Dec 2021
End-to-End Referring Video Object Segmentation with Multimodal Transformers
Adam Botach
Evgenii Zheltonozhskii
Chaim Baskin
VOS
276
198
0
29 Nov 2021
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Faisal Ahmed
Zicheng Liu
Yumao Lu
Lijuan Wang
313
131
0
23 Nov 2021
MaIL: A Unified Mask-Image-Language Trimodal Network for Referring Image Segmentation
Zizhang Li
Mengmeng Wang
Jianbiao Mei
Yong Liu
185
19
0
21 Nov 2021
Evaluating and Improving Interactions with Hazy Oracles
Stephan J. Lemmer
Jason J. Corso
152
2
0
19 Oct 2021
Towards Language-guided Visual Recognition via Dynamic Convolutions
Gen Luo
Weihao Ye
Xiaoshuai Sun
Yongjian Wu
Yue Gao
Rongrong Ji
ObjD
212
25
0
17 Oct 2021
Two-stage Visual Cues Enhancement Network for Referring Image Segmentation
ACM Multimedia (ACM MM), 2021
Yang Jiao
Zequn Jie
Weixin Luo
Yue Yu
Yu-Gang Jiang
Xiaolin K. Wei
Lin Ma
335
29
0
09 Oct 2021
When in Doubt: Improving Classification Performance with Alternating Normalization
Menglin Jia
A. Reiter
Ser-Nam Lim
Yoav Artzi
Claire Cardie
137
13
0
28 Sep 2021
CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
Yuan Yao
Ao Zhang
Zhengyan Zhang
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
MLLM
VPVLM
VLM
540
243
0
24 Sep 2021
ReaSCAN: Compositional Reasoning in Language Grounding
Zhengxuan Wu
Elisa Kreiss
Desmond C. Ong
Christopher Potts
CoGe
LRM
173
24
0
18 Sep 2021
Reference-Centric Models for Grounded Collaborative Dialogue
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Daniel Fried
Justin T. Chiu
Dan Klein
167
22
0
10 Sep 2021
Panoptic Narrative Grounding
IEEE International Conference on Computer Vision (ICCV), 2021
Cristina González
Nicolás Ayobi
Isabela Hernández
José Hernández
Jordi Pont-Tuset
Pablo Arbeláez
232
27
0
10 Sep 2021
YouRefIt: Embodied Reference Understanding with Language and Gesture
IEEE International Conference on Computer Vision (ICCV), 2021
Yixin Chen
Qing Li
Deqian Kong
Yik Lun Kei
Song-Chun Zhu
Tao Gao
Yixin Zhu
Siyuan Huang
LM&Ro
203
48
0
08 Sep 2021
Who's Waldo? Linking People Across Text and Images
Claire Yuqing Cui
Apoorv Khandelwal
Yoav Artzi
Noah Snavely
Hadar Averbuch-Elor
185
21
0
16 Aug 2021