GuessWhat?! Visual object discovery through multi-modal dialogue

23 November 2016
Harm de Vries
Florian Strub
A. Chandar
Olivier Pietquin
Hugo Larochelle
Aaron Courville
    VLM

Papers citing "GuessWhat?! Visual object discovery through multi-modal dialogue"

50 / 237 papers shown
Title
An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
Alessandro Suglia
Yonatan Bisk
Ioannis Konstas
Antonio Vergari
E. Bastianelli
Andrea Vanzo
Oliver Lemon
111
8
0
31 Jan 2021
Knowledge Grounded Conversational Symptom Detection with Graph Memory Networks
Clinical Natural Language Processing Workshop (ClinicalNLP), 2020
Hongyin Luo
Shang-Wen Li
James R. Glass
96
10
0
24 Jan 2021
DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Hung Le
Chinnadhurai Sankar
Seungwhan Moon
Ahmad Beirami
A. Geramifard
Satwik Kottur
VGen
231
23
0
01 Jan 2021
OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Visual Contexts
Yuxian Meng
Shuhe Wang
Qinghong Han
Xiaofei Sun
Leilei Gan
Rui Yan
Jiwei Li
353
31
0
30 Dec 2020
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs
Transactions of the Association for Computational Linguistics (TACL), 2020
Emanuele Bugliarello
Robert Bamler
Naoaki Okazaki
Desmond Elliott
212
125
0
30 Nov 2020
Where Are You? Localization from Embodied Dialog
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Meera Hahn
Jacob Krantz
Dhruv Batra
Devi Parikh
James M. Rehg
Stefan Lee
Peter Anderson
LM&Ro
182
33
0
16 Nov 2020
Deep Multimodal Fusion by Channel Exchanging
Yikai Wang
Wenbing Huang
Gang Hua
Qifeng Bai
Yu Rong
Junzhou Huang
262
276
0
10 Nov 2020
Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts
Ece Takmaz
Mario Giulianelli
Sandro Pezzelle
Arabella J. Sinclair
Raquel Fernández
124
30
0
09 Nov 2020
Imagining Grounded Conceptual Representations from Perceptual Information in Situated Guessing Games
Alessandro Suglia
Antonio Vergari
Ioannis Konstas
Yonatan Bisk
E. Bastianelli
Andrea Vanzo
Oliver Lemon
OCL
149
11
0
05 Nov 2020
Reading Between the Lines: Exploring Infilling in Visual Narratives
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Khyathi Chandu
Ruo-Ping Dong
A. Black
114
4
0
26 Oct 2020
Does my multimodal model learn cross-modal interactions? It's harder to tell than you might think!
Jack Hessel
Lillian Lee
204
86
0
13 Oct 2020
A Linguistic Analysis of Visually Grounded Dialogues Based on Spatial Expressions
Takuma Udagawa
T. Yamazaki
Akiko Aizawa
168
12
0
07 Oct 2020
Supervised Seeded Iterated Learning for Interactive Language Learning
Yuchen Lu
Soumye Singhal
Florian Strub
Olivier Pietquin
Aaron Courville
101
9
0
06 Oct 2020
Answer-Driven Visual State Estimator for Goal-Oriented Visual Dialogue
ACM Multimedia (ACM MM), 2020
Zipeng Xu
Fangxiang Feng
Xiaojie Wang
Yushu Yang
Huixing Jiang
Zhongyuan Ouyang
141
7
0
01 Oct 2020
Multi-Task Learning with Deep Neural Networks: A Survey
M. Crawshaw
CVBM
422
711
0
10 Sep 2020
Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents
Ye Zhu
Yu Wu
Yi Yang
Yan Yan
213
13
0
18 Aug 2020
Towards Ecologically Valid Research on Language User Interfaces
Harm de Vries
Dzmitry Bahdanau
Christopher D. Manning
446
58
0
28 Jul 2020
Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data
Neural Information Processing Systems (NeurIPS), 2020
Michael Cogswell
Jiasen Lu
Rishabh Jain
Stefan Lee
Devi Parikh
Dhruv Batra
VLM
EgoV
134
15
0
24 Jul 2020
Referring Expression Comprehension: A Survey of Methods and Datasets
IEEE Transactions on Multimedia (TMM), 2020
Yanyuan Qiao
Chaorui Deng
Qi Wu
ObjD
286
116
0
19 Jul 2020
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers
Shijie Geng
Shiyang Feng
Moitreya Chatterjee
Chiori Hori
Jonathan Le Roux
Zelong Li
Jiaming Song
A. Cherian
206
11
0
08 Jul 2020
Dialog as a Vehicle for Lifelong Learning
Aishwarya Padmakumar
Raymond J. Mooney
102
2
0
26 Jun 2020
Dialog Policy Learning for Joint Clarification and Active Learning Queries
Aishwarya Padmakumar
Raymond J. Mooney
234
11
0
09 Jun 2020
SIDU: Similarity Difference and Uniqueness Method for Explainable AI
Satya M. Muddamsetty
M. N. Jahromi
T. Moeslund
67
15
0
04 Jun 2020
CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Alessandro Suglia
Ioannis Konstas
Andrea Vanzo
E. Bastianelli
Desmond Elliott
Stella Frank
Oliver Lemon
131
17
0
03 Jun 2020
Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge
ACM Multimedia (ACM MM), 2020
Peng Wang
Dongyang Liu
Hui Li
Qi Wu
ObjD
200
22
0
02 Jun 2020
Situated and Interactive Multimodal Conversations
International Conference on Computational Linguistics (COLING), 2020
Seungwhan Moon
Satwik Kottur
Paul A. Crook
Ankita De
Shivani Poddar
...
Daniel Difranco
Ahmad Beirami
Eunjoon Cho
R. Subba
A. Geramifard
197
74
0
02 Jun 2020
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
Douwe Kiela
Hamed Firooz
Aravind Mohan
Vedanuj Goswami
Amanpreet Singh
Pratik Ringshia
Davide Testuggine
270
748
0
10 May 2020
History for Visual Dialog: Do we really need it?
Shubham Agarwal
Trung Bui
Joon-Young Lee
Ioannis Konstas
Verena Rieser
VLM
113
73
0
08 May 2020
RMM: A Recursive Mental Model for Dialog Navigation
Findings (Findings), 2020
Homero Roman Roman
Yonatan Bisk
Jesse Thomason
Asli Celikyilmaz
Jianfeng Gao
LM&Ro
LLMAG
211
49
0
02 May 2020
A Revised Generative Evaluation of Visual Dialogue
Daniela Massiceti
Viveka Kulharia
P. Dokania
N. Siddharth
Juil Sock
142
0
0
20 Apr 2020
Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Gi-Cheon Kang
Junseok Park
Hwaran Lee
Byoung-Tak Zhang
Jin-Hwa Kim
VLM
177
10
0
14 Apr 2020
Iterative Context-Aware Graph Inference for Visual Dialog
Computer Vision and Pattern Recognition (CVPR), 2020
Dan Guo
Haibo Wang
Hanwang Zhang
Zhengjun Zha
Meng Wang
175
51
0
05 Apr 2020
AriEL: volume coding for sentence generation
Luca Herranz-Celotti
Simon Brodeur
Jean Rouat
130
0
0
30 Mar 2020
Grounded Situation Recognition
European Conference on Computer Vision (ECCV), 2020
Sarah M Pratt
Mark Yatskar
Luca Weihs
Ali Farhadi
Aniruddha Kembhavi
156
132
0
26 Mar 2020
Visual Grounding in Video for Unsupervised Word Translation
Computer Vision and Pattern Recognition (CVPR), 2020
Gunnar Sigurdsson
Jean-Baptiste Alayrac
Aida Nematzadeh
Lucas Smaira
Mateusz Malinowski
João Carreira
Phil Blunsom
Andrew Zisserman
VGen
230
51
0
11 Mar 2020
Guessing State Tracking for Visual Dialogue
European Conference on Computer Vision (ECCV), 2020
Wei Pang
Xiaojie Wang
OOD
306
10
0
24 Feb 2020
Deep Exemplar Networks for VQA and VQG
Badri N. Patro
Vinay P. Namboodiri
110
4
0
19 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
European Conference on Computer Vision (ECCV), 2019
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
296
119
0
05 Dec 2019
12-in-1: Multi-Task Vision and Language Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2019
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM
ObjD
283
499
0
05 Dec 2019
Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
251
7
0
26 Nov 2019
Two Causal Principles for Improving Visual Dialog
Computer Vision and Pattern Recognition (CVPR), 2019
Jiaxin Qi
Yulei Niu
Jianqiang Huang
Hanwang Zhang
CML
514
158
0
24 Nov 2019
An Annotated Corpus of Reference Resolution for Interpreting Common Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2019
Takuma Udagawa
Akiko Aizawa
116
10
0
18 Nov 2019
Visual Dialogue State Tracking for Question Generation
AAAI Conference on Artificial Intelligence (AAAI), 2019
Wei Pang
Xiaojie Wang
146
34
0
12 Nov 2019
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2019
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI
AI4TS
279
396
0
10 Nov 2019
Interactive Classification by Asking Informative Questions
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
L. Yu
Howard Chen
Sida Wang
Tao Lei
Yoav Artzi
174
28
0
09 Nov 2019
Language coverage and generalization in RNN-based continuous sentence embeddings for interacting agents
Luca Herranz-Celotti
Simon Brodeur
Jean Rouat
103
0
0
05 Nov 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
Alex Schwing
LRM
ReLM
288
10
0
31 Oct 2019
Dynamic Attention Networks for Task Oriented Grounding
S. Dasgupta
Badri N. Patro
Vinay P. Namboodiri
150
1
0
14 Oct 2019
Improving Generative Visual Dialog by Answering Diverse Questions
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Vishvak Murahari
Prithvijit Chattopadhyay
Dhruv Batra
Devi Parikh
Abhishek Das
143
38
0
23 Sep 2019
Probabilistic framework for solving Visual Dialog
Pattern Recognition (Pattern Recognit.), 2019
Badri N. Patro
Anupriy
Vinay P. Namboodiri
BDL
277
13
0
11 Sep 2019