Visual Dialog

26 November 2016
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra

Papers citing "Visual Dialog"

50 / 597 papers shown
Beyond VQA: Generating Multi-word Answer and Rationale to Visual Questions
Radhika Dua
Sai Srinivas Kancheti
V. Balasubramanian
LRM
266
27
0
24 Oct 2020
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
Neural Information Processing Systems (NeurIPS), 2020
Itai Gat
Idan Schwartz
Alex Schwing
Tamir Hazan
261
100
0
21 Oct 2020
Multimodal Research in Vision and Language: A Review of Current and Emerging Trends
Shagun Uppal
Sarthak Bhagat
Devamanyu Hazarika
Navonil Majumdar
Soujanya Poria
Roger Zimmermann
Amir Zadeh
277
6
0
19 Oct 2020
Answer-checking in Context: A Multi-modal FullyAttention Network for Visual Question Answering
International Conference on Pattern Recognition (ICPR), 2020
Hantao Huang
Tao Han
Wei Han
D. Yap
Cheng-Ming Chiang
136
4
0
17 Oct 2020
A Linguistic Analysis of Visually Grounded Dialogues Based on Spatial Expressions
Takuma Udagawa
T. Yamazaki
Akiko Aizawa
224
12
0
07 Oct 2020
Multi-Modal Open-Domain Dialogue
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Kurt Shuster
Eric Michael Smith
Da Ju
Jason Weston
AI4CE
287
48
0
02 Oct 2020
Likelihood Landscapes: A Unifying Principle Behind Many Adversarial Defenses
Fu-Huei Lin
Rohit Mittapalli
Prithvijit Chattopadhyay
Daniel Bolya
Judy Hoffman
AAML
156
2
0
25 Aug 2020
Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents
Ye Zhu
Yu Wu
Yi Yang
Yan Yan
264
13
0
18 Aug 2020
KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue
ACM Multimedia (ACM MM), 2020
X. Jiang
Siyi Du
Zengchang Qin
Yajing Sun
Jiahao Yu
269
39
0
11 Aug 2020
SeqDialN: Sequential Visual Dialog Networks in Joint Visual-Linguistic Representation Space
Workshop on Document-grounded Dialogue and Conversational Question Answering (DialDoc), 2020
Liu Yang
VLM
179
5
0
02 Aug 2020
Towards Ecologically Valid Research on Language User Interfaces
H. D. Vries
Dzmitry Bahdanau
Christopher D. Manning
468
59
0
28 Jul 2020
Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data
Neural Information Processing Systems (NeurIPS), 2020
Michael Cogswell
Jiasen Lu
Rishabh Jain
Stefan Lee
Devi Parikh
Dhruv Batra
VLM, EgoV
141
15
0
24 Jul 2020
Active Visual Information Gathering for Vision-Language Navigation
European Conference on Computer Vision (ECCV), 2020
Hanqing Wang
Wenguan Wang
Tianmin Shu
Wei Liang
Jianbing Shen
278
82
0
15 Jul 2020
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers
Shijie Geng
Shiyang Feng
Moitreya Chatterjee
Chiori Hori
Jonathan Le Roux
Zelong Li
Jiaming Song
A. Cherian
235
11
0
08 Jul 2020
DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue
X. Jiang
Jiahao Yu
Yajing Sun
Zengchang Qin
Zihao Zhu
Yue Hu
Qi Wu
MLLM
261
19
0
07 Jul 2020
Comprehensive Information Integration Modeling Framework for Video Titling
Knowledge Discovery and Data Mining (KDD), 2020
Shengyu Zhang
Ziqi Tan
Jin Yu
Zhou Zhao
Kun Kuang
Tan Jiang
Jingren Zhou
Hongxia Yang
Leilei Gan
171
41
0
24 Jun 2020
Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions
Stephen Roller
Y-Lan Boureau
Jason Weston
Antoine Bordes
Emily Dinan
...
Kurt Shuster
Eric Michael Smith
Arthur Szlam
Jack Urbanek
Mary Williamson
LLMAG, AI4CE
231
60
0
22 Jun 2020
ORD: Object Relationship Discovery for Visual Dialogue Generation
Ziwei Wang
Zi Huang
Yadan Luo
Huimin Lu
186
4
0
15 Jun 2020
Report from the NSF Future Directions Workshop, Toward User-Oriented Agents: Research Directions and Challenges
M. Eskénazi
Tiancheng Zhao
LLMAG, AI4TS, AI4CE
228
9
0
10 Jun 2020
Counterfactual VQA: A Cause-Effect Look at Language Bias
Yulei Niu
Kaihua Tang
Hanwang Zhang
Zhiwu Lu
Xiansheng Hua
Ji-Rong Wen
CML
537
478
0
08 Jun 2020
Situated and Interactive Multimodal Conversations
International Conference on Computational Linguistics (COLING), 2020
Seungwhan Moon
Satwik Kottur
Paul A. Crook
Ankita De
Shivani Poddar
...
Daniel Difranco
Ahmad Beirami
Eunjoon Cho
R. Subba
A. Geramifard
224
74
0
02 Jun 2020
Probing Emergent Semantics in Predictive Agents via Question Answering
International Conference on Machine Learning (ICML), 2020
Abhishek Das
Federico Carnevale
Hamza Merzic
Laura Rimell
R. Schneider
...
Alden Hung
Arun Ahuja
S. Clark
Greg Wayne
Felix Hill
226
18
0
01 Jun 2020
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
Douwe Kiela
Hamed Firooz
Aravind Mohan
Vedanuj Goswami
Amanpreet Singh
Pratik Ringshia
Davide Testuggine
333
763
0
10 May 2020
History for Visual Dialog: Do we really need it?
Shubham Agarwal
Trung Bui
Joon-Young Lee
Ioannis Konstas
Verena Rieser
VLM
133
74
0
08 May 2020
RMM: A Recursive Mental Model for Dialog Navigation
Findings (Findings), 2020
Homero Roman Roman
Yonatan Bisk
Jesse Thomason
Asli Celikyilmaz
Jianfeng Gao
LM&Ro, LLMAG
219
50
0
02 May 2020
VD-BERT: A Unified Vision and Dialog Transformer with BERT
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Yue Wang
Shafiq Joty
Michael R. Lyu
Irwin King
Caiming Xiong
Guosheng Lin
378
107
0
28 Apr 2020
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
J. S. Park
Chandra Bhagavatula
Roozbeh Mottaghi
Ali Farhadi
Yejin Choi
ReLM, LRM
216
6
0
22 Apr 2020
A Revised Generative Evaluation of Visual Dialogue
Daniela Massiceti
Viveka Kulharia
P. Dokania
N. Siddharth
Juil Sock
166
0
0
20 Apr 2020
Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision
European Conference on Computer Vision (ECCV), 2020
Damien Teney
Ehsan Abbasnejad
Anton Van Den Hengel
OOD, SSL, CML
223
125
0
20 Apr 2020
Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Gi-Cheon Kang
Junseok Park
Hwaran Lee
Byoung-Tak Zhang
Jin-Hwa Kim
VLM
203
10
0
14 Apr 2020
An Entropy Clustering Approach for Assessing Visual Question Difficulty
IEEE Access (IEEE Access), 2020
K. Terao
Toru Tamaki
B. Raytchev
K. Kaneda
Shuníchi Satoh
OOD, AAML
304
1
0
12 Apr 2020
Rephrasing visual questions by specifying the entropy of the answer distribution
K. Terao
Toru Tamaki
B. Raytchev
K. Kaneda
S. Satoh
OOD
156
2
0
10 Apr 2020
Iterative Context-Aware Graph Inference for Visual Dialog
Computer Vision and Pattern Recognition (CVPR), 2020
Dan Guo
Haibo Wang
Hanwang Zhang
Zhengjun Zha
Meng Wang
219
52
0
05 Apr 2020
Open Domain Dialogue Generation with Latent Images
AAAI Conference on Artificial Intelligence (AAAI), 2020
Ze Yang
Wei Wu
Huang Hu
Can Xu
Wei Wang
Zhoujun Li
190
30
0
04 Apr 2020
DSTC8-AVSD: Multimodal Semantic Transformer Network with Retrieval Style Word Generator
Hwanhee Lee
Seunghyun Yoon
Franck Dernoncourt
Doo Soon Kim
Trung Bui
Kyomin Jung
174
15
0
01 Apr 2020
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
Computer Vision and Pattern Recognition (CVPR), 2020
J. Liu
Wenhu Chen
Yu Cheng
Zhe Gan
Licheng Yu
Yiming Yang
Jingjing Liu
MLLM, VGen
276
75
0
25 Mar 2020
Vision-Dialog Navigation by Exploring Cross-modal Memory
Computer Vision and Pattern Recognition (CVPR), 2020
Yi Zhu
Fengda Zhu
Zhaohuan Zhan
Bingqian Lin
Jianbin Jiao
Xiaojun Chang
Xiaodan Liang
VLM
179
52
0
15 Mar 2020
CRWIZ: A Framework for Crowdsourcing Real-Time Wizard-of-Oz Dialogues
International Conference on Language Resources and Evaluation (LREC), 2020
Javier Chiyah-Garcia
José Lopes
Xingkun Liu
H. Hastie
108
7
0
12 Mar 2020
Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog
The Web Conference (WWW), 2020
Shen Gao
Preslav Nakov
Chang Liu
Li Liu
Dongyan Zhao
Rui Yan
219
41
0
10 Mar 2020
MQA: Answering the Question via Robotic Manipulation
Yuhong Deng
Di Guo
F. Sun
Naifu Zhang
Huaping Liu
Chen Pang
254
23
0
10 Mar 2020
Deconfounded Image Captioning: A Causal Retrospect
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
Xu Yang
Hanwang Zhang
Jianfei Cai
CML
186
149
0
09 Mar 2020
Captioning Images with Novel Objects via Online Vocabulary Expansion
Mikihiro Tanaka
Tatsuya Harada
3DV
211
2
0
06 Mar 2020
Environment-agnostic Multitask Learning for Natural Language Grounded Navigation
European Conference on Computer Vision (ECCV), 2020
Xinze Wang
Vihan Jain
Eugene Ie
William Yang Wang
Zornitsa Kozareva
Sujith Ravi
LM&Ro
303
70
0
01 Mar 2020
Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension
Computer Vision and Pattern Recognition (CVPR), 2020
Zhenfang Chen
Peng Wang
Lin Ma
Kwan-Yee K. Wong
Qi Wu
ObjD
250
78
0
01 Mar 2020
Unshuffling Data for Improved Generalization
IEEE International Conference on Computer Vision (ICCV), 2020
Damien Teney
Ehsan Abbasnejad
Anton Van Den Hengel
OOD
248
82
0
27 Feb 2020
What BERT Sees: Cross-Modal Transfer for Visual Question Generation
Thomas Scialom
Patrick Bordes
Paul-Alexis Dray
Jacopo Staiano
Patrick Gallinari
252
7
0
25 Feb 2020
Guessing State Tracking for Visual Dialogue
European Conference on Computer Vision (ECCV), 2020
Wei Pang
Xiaojie Wang
OOD
379
10
0
24 Feb 2020
A Multimodal Dialogue System for Conversational Image Editing
Tzu-Hsiang Lin
Trung Bui
Doo Soon Kim
Jean Oh
128
9
0
16 Feb 2020
Looking Enhances Listening: Recovering Missing Speech Using Images
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Tejas Srinivasan
Ramon Sanabria
Florian Metze
129
15
0
13 Feb 2020
Multimodal Matching Transformer for Live Commenting
European Conference on Artificial Intelligence (ECAI), 2020
Chaoqun Duan
Lei Cui
Shuming Ma
Furu Wei
Conghui Zhu
Tiejun Zhao
114
13
0
07 Feb 2020