Visual Dialog

26 November 2016
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra

Papers citing "Visual Dialog"

50 / 597 papers shown
Title
Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog
Zekang Li
Zongjia Li
Jinchao Zhang
Yang Feng
Cheng Niu
Jie Zhou
218
38
0
01 Feb 2020
Deep Bayesian Network for Visual Question Generation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
Badri N. Patro
V. Kurmi
Sandeep Kumar
Vinay P. Namboodiri
BDL
142
18
0
23 Jan 2020
ManyModalQA: Modality Disambiguation and QA over Diverse Inputs
AAAI Conference on Artificial Intelligence (AAAI), 2020
Darryl Hannan
Akshay Jain
Mohit Bansal
AAML
152
63
0
22 Jan 2020
Modality-Balanced Models for Visual Dialogue
AAAI Conference on Artificial Intelligence (AAAI), 2020
Hyounghun Kim
Hao Tan
Mohit Bansal
93
29
0
17 Jan 2020
Multi-step Joint-Modality Attention Network for Scene-Aware Dialogue System
Yun-Wei Chu
Kuan-Yen Lin
Chao-Chun Hsu
Lun-Wei Ku
213
22
0
17 Jan 2020
All-in-One Image-Grounded Conversational Agents
Da Ju
Kurt Shuster
Y-Lan Boureau
Jason Weston
LLMAG
137
9
0
28 Dec 2019
Leveraging Topics and Audio Features with Multimodal Attention for Audio Visual Scene-Aware Dialog
Shachi H. Kumar
Eda Okur
Saurav Sahay
Jonathan Huang
L. Nachman
114
7
0
20 Dec 2019
Deep Exemplar Networks for VQA and VQG
Badri N. Patro
Vinay P. Namboodiri
110
4
0
19 Dec 2019
DMRM: A Dual-channel Multi-hop Reasoning Model for Visual Dialog
AAAI Conference on Artificial Intelligence (AAAI), 2019
Feilong Chen
Fandong Meng
Jiaming Xu
Peng Li
Bo Xu
Jie Zhou
165
34
0
18 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
European Conference on Computer Vision (ECCV), 2019
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
300
120
0
05 Dec 2019
12-in-1: Multi-Task Vision and Language Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2019
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM, ObjD
291
499
0
05 Dec 2019
A Free Lunch in Generating Datasets: Building a VQG and VQA System with Attention and Humans in the Loop
Jihyeon Janel Lee
S. Arora
166
1
0
30 Nov 2019
Multimodal Attention Networks for Low-Level Vision-and-Language Navigation
Computer Vision and Image Understanding (CVIU), 2019
Federico Landi
Lorenzo Baraldi
Marcella Cornia
M. Corsini
Rita Cucchiara
LM&Ro
236
30
0
27 Nov 2019
Transfer Learning in Visual and Relational Reasoning
T. S. Jayram
Vincent Marois
Tomasz Kornuta
V. Albouy
Emre Sevgen
A. Ozcan
NAI, OOD, LRM
186
3
0
27 Nov 2019
Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
251
7
0
26 Nov 2019
Two Causal Principles for Improving Visual Dialog
Computer Vision and Pattern Recognition (CVPR), 2019
Jiaxin Qi
Yulei Niu
Jianqiang Huang
Hanwang Zhang
CML
534
159
0
24 Nov 2019
Unsupervised Keyword Extraction for Full-sentence VQA
Kohei Uehara
Tatsuya Harada
180
1
0
23 Nov 2019
Learning Cross-modal Context Graph for Visual Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2019
Yongfei Liu
Bo Wan
Xiao-Dan Zhu
Xuming He
227
98
0
20 Nov 2019
An Annotated Corpus of Reference Resolution for Interpreting Common Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2019
Takuma Udagawa
Akiko Aizawa
116
10
0
18 Nov 2019
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
AAAI Conference on Artificial Intelligence (AAAI), 2019
X. Jiang
Jiahao Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
175
71
0
17 Nov 2019
Visual Dialogue State Tracking for Question Generation
AAAI Conference on Artificial Intelligence (AAAI), 2019
Wei Pang
Xiaojie Wang
146
34
0
12 Nov 2019
Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries
Neural Information Processing Systems (NeurIPS), 2019
Fuwen Tan
Paola Cascante-Bonilla
Xiaoxiao Guo
Hui Wu
Song Feng
Vicente Ordonez
151
33
0
10 Nov 2019
The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Kurt Shuster
Da Ju
Stephen Roller
Emily Dinan
Y-Lan Boureau
Jason Weston
230
84
0
09 Nov 2019
SIMMC: Situated Interactive Multi-Modal Conversational Data Collection And Evaluation Platform
Paul A. Crook
Shivani Poddar
Ankita De
Semir Shafi
David Whitney
A. Geramifard
R. Subba
128
18
0
07 Nov 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
Alex Schwing
LRM, ReLM
292
10
0
31 Oct 2019
Automatic Reminiscence Therapy for Dementia
International Conference on Multimedia Retrieval (ICMR), 2019
Mariona Carós
M. Garolera
Petia Radeva
Xavier Giró-i-Nieto
155
45
0
25 Oct 2019
Heterogeneous Graph Learning for Visual Commonsense Reasoning
Neural Information Processing Systems (NeurIPS), 2019
Weijiang Yu
Jingwen Zhou
Weihao Yu
Xiaodan Liang
Nong Xiao
LRM
111
52
0
25 Oct 2019
Cross-Lingual Vision-Language Navigation
An Yan
Xinze Wang
Jiangtao Feng
Lei Li
William Yang Wang
LM&Ro
153
17
0
24 Oct 2019
PyTorchPipe: a framework for rapid prototyping of pipelines combining language and vision
Tomasz Kornuta
115
3
0
18 Oct 2019
Dynamic Attention Networks for Task Oriented Grounding
S. Dasgupta
Badri N. Patro
Vinay P. Namboodiri
150
1
0
14 Oct 2019
Granular Multimodal Attention Networks for Visual Dialog
Badri N. Patro
Shivansh Patel
Vinay P. Namboodiri
200
2
0
13 Oct 2019
Improving Generative Visual Dialog by Answering Diverse Questions
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Vishvak Murahari
Prithvijit Chattopadhyay
Dhruv Batra
Devi Parikh
Abhishek Das
143
38
0
23 Sep 2019
Probabilistic framework for solving Visual Dialog
Pattern Recognition (Pattern Recognit.), 2019
Badri N. Patro
Anupriy
Vinay P. Namboodiri
BDL
281
13
0
11 Sep 2019
Conditional Text Generation for Harmonious Human-Machine Interaction
Bin Guo
Hao Wang
Yasan Ding
Wei Wu
Shaoyang Hao
Yueqi Sun
Zhiwen Yu
159
4
0
08 Sep 2019
Building Task-Oriented Visual Dialog Systems Through Alternative Optimization Between Dialog Policy and Language Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Mingyang Zhou
Josh Arnold
Zhou Yu
OffRL
148
11
0
06 Sep 2019
What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Xintong Yu
Hongming Zhang
Yangqiu Song
Yan Song
Changshui Zhang
90
33
0
01 Sep 2019
Grounded Agreement Games: Emphasizing Conversational Grounding in Visual Dialogue Settings
David Schlangen
107
16
0
29 Aug 2019
ViCo: Word Embeddings from Visual Co-occurrences
IEEE International Conference on Computer Vision (ICCV), 2019
Tanmay Gupta
Alex Schwing
Derek Hoiem
127
25
0
22 Aug 2019
Towards Knowledge-Based Recommender Dialog System
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Qibin Chen
Junyang Lin
Yichang Zhang
Ming Ding
Yukuo Cen
Hongxia Yang
Jie Tang
160
284
0
15 Aug 2019
Reactive Multi-Stage Feature Fusion for Multimodal Dialogue Modeling
Yi-Ting Yeh
Tzu-Chuan Lin
Hsiao-Hua Cheng
Yuanyuan Deng
Shang-Yu Su
Yun-Nung Chen
162
16
0
14 Aug 2019
Transferable Representation Learning in Vision-and-Language Navigation
IEEE International Conference on Computer Vision (ICCV), 2019
Haoshuo Huang
Vihan Jain
Harsh Mehta
Alexander Ku
Gabriel Ilharco
Jason Baldridge
Eugene Ie
LM&Ro
185
92
0
09 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Neural Information Processing Systems (NeurIPS), 2019
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL, VLM
892
4,180
0
06 Aug 2019
Learning Question-Guided Video Representation for Multi-Turn Video Question Answering
Guan-Lin Chao
Abhinav Rastogi
Semih Yavuz
Dilek Z. Hakkani-Tür
Jindong Chen
Ian Lane
79
6
0
31 Jul 2019
V-PROM: A Benchmark for Visual Reasoning Using Visual Progressive Matrices
AAAI Conference on Artificial Intelligence (AAAI), 2019
Damien Teney
Peng Wang
Jiewei Cao
Lingqiao Liu
Chunhua Shen
Anton Van Den Hengel
133
36
0
29 Jul 2019
What Should I Ask? Using Conversationally Informative Rewards for Goal-Oriented Visual Dialog
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Pushkar Shukla
Carlos E. L. Elmadjian
Richika Sharan
Vivek Kulkarni
Matthew Turk
William Yang Wang
172
34
0
28 Jul 2019
Cooperative image captioning
Gilad Vered
Gal Oren
Yuval Atzmon
Gal Chechik
117
2
0
26 Jul 2019
Learning Goal-Oriented Visual Dialog Agents: Imitating and Surpassing Analytic Experts
IEEE International Conference on Multimedia and Expo (ICME), 2019
Yenchih Chang
Wen-Hsiao Peng
102
4
0
24 Jul 2019
Bilinear Graph Networks for Visual Question Answering
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2019
Dalu Guo
Chang Xu
Dacheng Tao
GNN
173
67
0
23 Jul 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Journal of Artificial Intelligence Research (JAIR), 2019
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
388
142
0
22 Jul 2019
Why Build an Assistant in Minecraft?
Arthur Szlam
Jonathan Gray
Kavya Srinet
Yacine Jernite
Armand Joulin
...
Siddharth Goyal
Demi Guo
Dan Rothermel
C. L. Zitnick
Jason Weston
LLMAG
255
31
0
22 Jul 2019