1611.08481
Cited By
GuessWhat?! Visual object discovery through multi-modal dialogue
23 November 2016
Harm de Vries
Florian Strub
Sarath Chandar
Olivier Pietquin
Hugo Larochelle
Aaron Courville
VLM
Papers citing
"GuessWhat?! Visual object discovery through multi-modal dialogue"
50 / 232 papers shown
Title
Referring to Any Person
Qing Jiang
Lin Wu
Zhaoyang Zeng
Tianhe Ren
Yuda Xiong
Yihao Chen
Qin Liu
Lei Zhang
148
0
0
11 Mar 2025
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
55
3
0
31 Dec 2024
Multi-Modal Dialogue State Tracking for Playing GuessWhich Game
Wei Pang
Ruixue Duan
Jinfu Yang
Ning Li
33
0
0
15 Aug 2024
Enhancing Visual Dialog State Tracking through Iterative Object-Entity Alignment in Multi-Round Conversations
Wei Pang
Ruixue Duan
Jinfu Yang
Ning Li
19
0
0
13 Aug 2024
ActionVOS: Actions as Prompts for Video Object Segmentation
Liangyang Ouyang
Ruicong Liu
Yifei Huang
Ryosuke Furuta
Yoichi Sato
VOS
33
2
0
10 Jul 2024
Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models
Jierun Chen
Fangyun Wei
Jinjing Zhao
Sizhe Song
Bohuai Wu
Zhuoxuan Peng
S.-H. Gary Chan
Hongyang R. Zhang
33
8
0
24 Jun 2024
ChatShop: Interactive Information Seeking with Language Agents
Sanxing Chen
Sam Wiseman
Bhuwan Dhingra
KELM
26
7
0
15 Apr 2024
How Far Are We from Intelligent Visual Deductive Reasoning?
Yizhe Zhang
Richard He Bai
Ruixiang Zhang
Jiatao Gu
Shuangfei Zhai
J. Susskind
Navdeep Jaitly
ReLM
LRM
44
13
0
07 Mar 2024
CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments
Savitha Sam Abraham
Marjan Alirezaie
Luc de Raedt
22
1
0
05 Mar 2024
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation
Yichi Zhang
Ziqiao Ma
Xiaofeng Gao
Suhaila Shakiah
Qiaozi Gao
Joyce Chai
MLLM
VLM
42
39
0
26 Feb 2024
SInViG: A Self-Evolving Interactive Visual Agent for Human-Robot Interaction
Jie Xu
Hanbo Zhang
Xinghang Li
Huaping Liu
Xuguang Lan
Tao Kong
LM&Ro
32
3
0
19 Feb 2024
Improving Agent Interactions in Virtual Environments with Language Models
Jack Zhang
LLMAG
24
0
0
08 Feb 2024
Towards Unified Interactive Visual Grounding in The Wild
Jie Xu
Hanbo Zhang
Qingyi Si
Yifeng Li
Xuguang Lan
Tao Kong
LM&Ro
30
5
0
30 Jan 2024
Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding
Chancharik Mitra
Abrar Anwar
Rodolfo Corona
Dan Klein
Trevor Darrell
Jesse Thomason
19
1
0
12 Nov 2023
Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter
Georgios Tziafas
Yucheng Xu
Arushi Goel
M. Kasaei
Zhibin Li
H. Kasaei
32
23
0
09 Nov 2023
Context Does Matter: End-to-end Panoptic Narrative Grounding with Deformable Attention Refined Matching Network
Yiming Lin
Xiao-Bo Jin
Qiufeng Wang
Kaizhu Huang
24
3
0
25 Oct 2023
InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions
Hanbo Zhang
Jie Xu
Yuchen Mo
Tao Kong
17
1
0
18 Oct 2023
Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games
Yizhe Zhang
Jiarui Lu
Navdeep Jaitly
LRM
ELM
16
9
0
02 Oct 2023
Resolving References in Visually-Grounded Dialogue via Text Generation
Bram Willemsen
Livia Qian
Gabriel Skantze
17
3
0
23 Sep 2023
Pointing out Human Answer Mistakes in a Goal-Oriented Visual Dialogue
Ryosuke Oshima
Seitaro Shinagawa
Hideki Tsunashima
Qi Feng
Shigeo Morishima
27
3
0
19 Sep 2023
PROGrasp: Pragmatic Human-Robot Communication for Object Grasping
Gi-Cheon Kang
Junghyun Kim
Jaein Kim
Byoung-Tak Zhang
19
4
0
14 Sep 2023
Collecting Visually-Grounded Dialogue with A Game Of Sorts
Bram Willemsen
Dmytro Kalpakchi
Gabriel Skantze
11
2
0
10 Sep 2023
Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations
Kilichbek Haydarov
Xiaoqian Shen
Avinash Madasu
Mahmoud Salem
Jia Li
Gamaleldin F. Elsayed
Mohamed Elhoseiny
31
4
0
30 Aug 2023
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan Z. Sheng
Julian McAuley
Lina Yao
SyDa
46
10
0
28 Aug 2023
VL-Grasp: a 6-Dof Interactive Grasp Policy for Language-Oriented Objects in Cluttered Indoor Scenes
Yuhao Lu
Yixuan Fan
Beixing Deng
F. Liu
Yali Li
Shengjin Wang
33
28
0
01 Aug 2023
'What are you referring to?' Evaluating the Ability of Multi-Modal Dialogue Models to Process Clarificational Exchanges
Javier Chiyah-Garcia
Alessandro Suglia
Arash Eshghi
Helen F. Hastie
24
6
0
28 Jul 2023
Learning to Generate Equitable Text in Dialogue from Biased Training Data
Anthony Sicilia
Malihe Alikhani
40
15
0
10 Jul 2023
Solving Dialogue Grounding Embodied Task in a Simulated Environment using Further Masked Language Modeling
Weijie Zhang
27
0
0
21 Jun 2023
Listener Model for the PhotoBook Referential Game with CLIPScores as Implicit Reference Chain
Shih-Lun Wu
Yi-Hui Chou
Liang Li
13
0
0
16 Jun 2023
Dealing with Semantic Underspecification in Multimodal NLP
Sandro Pezzelle
14
9
0
08 Jun 2023
VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions
Yuxuan Wang
Zilong Zheng
Xueliang Zhao
Jinpeng Li
Yueqian Wang
Dongyan Zhao
VGen
24
9
0
30 May 2023
A Unified Framework for Slot based Response Generation in a Multimodal Dialogue System
Mauajama Firdaus
Avinash Madasu
Asif Ekbal
38
7
0
27 May 2023
ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue
Haoqin Tu
Yitong Li
Fei Mi
Zhongliang Yang
35
4
0
23 May 2023
WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language
Zhe-nan Lin
Xidong Peng
Peishan Cong
Ge Zheng
Yujin Sun
Yuenan Hou
Xinge Zhu
Sibei Yang
Yuexin Ma
VGen
82
4
0
12 Apr 2023
ScanERU: Interactive 3D Visual Grounding based on Embodied Reference Understanding
Ziyang Lu
Yunqiang Pei
Guoqing Wang
Yang Yang
Zheng Wang
Heng Tao Shen
46
6
0
23 Mar 2023
ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions
Deyao Zhu
Jun Chen
Kilichbek Haydarov
Xiaoqian Shen
Wenxuan Zhang
Mohamed Elhoseiny
MLLM
27
96
0
12 Mar 2023
TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World
Hongpeng Lin
Ludan Ruan
Wenke Xia
Peiyu Liu
Jing Wen
...
Di Hu
Ruihua Song
Wayne Xin Zhao
Qin Jin
Zhiwu Lu
VGen
27
9
0
14 Jan 2023
SPRING: Situated Conversation Agent Pretrained with Multimodal Questions from Incremental Layout Graph
Yuxing Long
Binyuan Hui
Fulong Ye
Yanyang Li
Zhuoxin Han
Caixia Yuan
Yongbin Li
Xiaojie Wang
LLMAG
25
7
0
05 Jan 2023
VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges
R. Zakari
Jim Wilson Owusu
Hailin Wang
Ke Qin
Zaharaddeen Karami Lawal
Yue-hong Dong
LRM
31
16
0
26 Dec 2022
A survey on knowledge-enhanced multimodal learning
Maria Lymperaiou
Giorgos Stamou
35
13
0
19 Nov 2022
Navigating Connected Memories with a Task-oriented Dialog System
Seungwhan Moon
Satwik Kottur
A. Geramifard
Babak Damavandi
35
2
0
15 Nov 2022
Pragmatics in Language Grounding: Phenomena, Tasks, and Modeling Approaches
Daniel Fried
Nicholas Tomlin
Jennifer Hu
Roma Patel
Aida Nematzadeh
19
6
0
15 Nov 2022
Towards Unifying Reference Expression Generation and Comprehension
Duo Zheng
Tao Kong
Ya Jing
Jiaan Wang
Xiaojie Wang
ObjD
27
6
0
24 Oct 2022
Are Current Decoding Strategies Capable of Facing the Challenges of Visual Dialogue?
Amit Kumar Chaudhary
Alex J. Lucassen
Ioanna Tsani
A. Testoni
6
1
0
24 Oct 2022
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data
Yangfan Zhan
Zhitong Xiong
Yuan Yuan
66
106
0
23 Oct 2022
LEATHER: A Framework for Learning to Generate Human-like Text in Dialogue
Anthony Sicilia
Malihe Alikhani
39
4
0
14 Oct 2022
Understanding Embodied Reference with Touch-Line Transformer
Y. Li
Xiaoxue Chen
Hao Zhao
Jiangtao Gong
Guyue Zhou
Federico Rossano
Yixin Zhu
158
15
0
11 Oct 2022
Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu
Yuehua Wu
N. Sebe
Yan Yan
33
16
0
05 Oct 2022
Learning More May Not Be Better: Knowledge Transferability in Vision and Language Tasks
Tianwei Chen
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
Hajime Nagahara
VLM
27
0
0
23 Aug 2022
Modeling Non-Cooperative Dialogue: Theoretical and Empirical Insights
Anthony Sicilia
Tristan D. Maidment
Pat Healy
Malihe Alikhani
12
3
0
15 Jul 2022