Dual Attention Networks for Visual Reference Resolution in Visual Dialog

25 February 2019
Gi-Cheon Kang
Jaeseo Lim
Byoung-Tak Zhang

Papers citing "Dual Attention Networks for Visual Reference Resolution in Visual Dialog"

36 / 36 papers shown
Take That for Me: Multimodal Exophora Resolution with Interactive Questioning for Ambiguous Out-of-View Instructions
Akira Oyama
Shoichi Hasegawa
Akira Taniguchi
Y. Hagiwara
Tadahiro Taniguchi
141
2
0
22 Aug 2025
Enhancing Visual Dialog State Tracking through Iterative Object-Entity Alignment in Multi-Round Conversations
Wei Pang
Ruixue Duan
Jinfu Yang
Ning Li
180
0
0
13 Aug 2024
ReALM: Reference Resolution As Language Modeling
Joel Ruben Antony Moniz
Soundarya Krishnan
Melis Ozyildirim
Prathamesh Saraf
Halim Cagri Ates
Yuan-kang Zhang
Hong-ye Yu
Nidhi Rajshree
313
10
0
29 Mar 2024
VD-GR: Boosting Visual Dialog with Cascaded Spatial-Temporal Multi-Modal GRaphs
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Adnen Abdessaied
Lei Shi
Andreas Bulling
3DH
243
7
0
25 Oct 2023
Thought Cloning: Learning to Think while Acting by Imitating Human Thinking
Neural Information Processing Systems (NeurIPS), 2023
Shengran Hu
Jeff Clune
LM&Ro, OffRL, LRM, AI4CE
578
40
0
01 Jun 2023
Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review
Hao Wang
Bin Guo
Y. Zeng
Yasan Ding
Chen Qiu
Ying Zhang
Li Yao
Zhiwen Yu
320
3
0
02 Jul 2022
VD-PCR: Improving Visual Dialog with Pronoun Coreference Resolution
Pattern Recognition (Pattern Recogn.), 2022
Xintong Yu
Hongming Zhang
Ruixin Hong
Yangqiu Song
Changshui Zhang
248
17
0
29 May 2022
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
Computer Vision and Pattern Recognition (CVPR), 2022
Gi-Cheon Kang
Sungdong Kim
Jin-Hwa Kim
Donghyun Kwak
Byoung-Tak Zhang
337
17
0
25 May 2022
UTC: A Unified Transformer with Inter-Task Contrastive Learning for Visual Dialog
Computer Vision and Pattern Recognition (CVPR), 2022
Cheng Chen
Yudong Zhu
Zhenshan Tan
Qingrong Cheng
Xin Jiang
Qun Liu
X. Gu
334
44
0
01 May 2022
Affective Feedback Synthesis Towards Multimodal Text and Image Data
Puneet Kumar
Gaurav Bhatt
Omkar Ingle
Daksh Goyal
Balasubramanian Raman
EGVM
301
5
0
23 Mar 2022
Modeling Coreference Relations in Visual Dialog
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Mingxiao Li
Marie-Francine Moens
175
10
0
06 Mar 2022
OpenViDial 2.0: A Larger-Scale, Open-Domain Dialogue Generation Dataset with Visual Contexts
Shuhe Wang
Yuxian Meng
Xiaoya Li
Xiaofei Sun
Rongbin Ouyang
Jiwei Li
MLLM, VLM
268
24
0
27 Sep 2021
Productivity, Portability, Performance: Data-Centric Python
Yiheng Wang
Yao Zhang
Yanzhang Wang
Yan Wan
Jiao Wang
Zhongyuan Wu
Yuhao Yang
Bowen She
468
116
0
01 Jul 2021
Attend What You Need: Motion-Appearance Synergistic Networks for Video Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Ahjeong Seo
Gi-Cheon Kang
J. Park
Byoung-Tak Zhang
250
57
0
19 Jun 2021
Modeling Text-visual Mutual Dependency for Multi-modal Dialog Generation
Shuhe Wang
Yuxian Meng
Xiaofei Sun
Leilei Gan
Rongbin Ouyang
Rui Yan
Tianwei Zhang
Jiwei Li
257
15
0
30 May 2021
Ensemble of MRR and NDCG models for Visual Dialog
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Idan Schwartz
329
10
0
15 Apr 2021
Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues
International Conference on Learning Representations (ICLR), 2021
Hung Le
Nancy F. Chen
Guosheng Lin
262
18
0
01 Mar 2021
OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Visual Contexts
Yuxian Meng
Shuhe Wang
Qinghong Han
Xiaofei Sun
Leilei Gan
Rui Yan
Jiwei Li
453
31
0
30 Dec 2020
Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings
Yue Wang
Jing Li
Michael R. Lyu
Irwin King
291
21
0
03 Nov 2020
Multimodal Research in Vision and Language: A Review of Current and Emerging Trends
Shagun Uppal
Sarthak Bhagat
Devamanyu Hazarika
Navonil Majumdar
Soujanya Poria
Roger Zimmermann
Amir Zadeh
366
6
0
19 Oct 2020
A Linguistic Analysis of Visually Grounded Dialogues Based on Spatial Expressions
Takuma Udagawa
T. Yamazaki
Akiko Aizawa
290
12
0
07 Oct 2020
KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue
ACM Multimedia (ACM MM), 2020
X. Jiang
Siyi Du
Zengchang Qin
Yajing Sun
Jiahao Yu
355
41
0
11 Aug 2020
Video Question Answering on Screencast Tutorials
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Wentian Zhao
Seokhwan Kim
N. Xu
Hailin Jin
180
11
0
02 Aug 2020
SeqDialN: Sequential Visual Dialog Networks in Joint Visual-Linguistic Representation Space
Workshop on Document-grounded Dialogue and Conversational Question Answering (DialDoc), 2020
Liu Yang
VLM
239
5
0
02 Aug 2020
History for Visual Dialog: Do we really need it?
Shubham Agarwal
Trung Bui
Joon-Young Lee
Ioannis Konstas
Verena Rieser
VLM
290
76
0
08 May 2020
VD-BERT: A Unified Vision and Dialog Transformer with BERT
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Yue Wang
Shafiq Joty
Michael R. Lyu
Irwin King
Caiming Xiong
Guosheng Lin
469
110
0
28 Apr 2020
Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Gi-Cheon Kang
Junseok Park
Hwaran Lee
Byoung-Tak Zhang
Jin-Hwa Kim
VLM
289
10
0
14 Apr 2020
Iterative Context-Aware Graph Inference for Visual Dialog
Computer Vision and Pattern Recognition (CVPR), 2020
Dan Guo
Haibo Wang
Hanwang Zhang
Zhengjun Zha
Meng Wang
336
53
0
05 Apr 2020
Vision-Dialog Navigation by Exploring Cross-modal Memory
Computer Vision and Pattern Recognition (CVPR), 2020
Yi Zhu
Fengda Zhu
Zhaohuan Zhan
Bingqian Lin
Jianbin Jiao
Xiaojun Chang
Xiaodan Liang
VLM
207
53
0
15 Mar 2020
Modality-Balanced Models for Visual Dialogue
AAAI Conference on Artificial Intelligence (AAAI), 2020
Hyounghun Kim
Hao Tan
Joey Tianyi Zhou
155
29
0
17 Jan 2020
Multi-step Joint-Modality Attention Network for Scene-Aware Dialogue System
Yun-Wei Chu
Kuan-Yen Lin
Chao-Chun Hsu
Lun-Wei Ku
291
22
0
17 Jan 2020
DMRM: A Dual-channel Multi-hop Reasoning Model for Visual Dialog
AAAI Conference on Artificial Intelligence (AAAI), 2019
Feilong Chen
Fandong Meng
Jiaming Xu
Peng Li
Bo Xu
Jie Zhou
225
35
0
18 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
European Conference on Computer Vision (ECCV), 2019
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
427
122
0
05 Dec 2019
Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
384
7
0
26 Nov 2019
Two Causal Principles for Improving Visual Dialog
Computer Vision and Pattern Recognition (CVPR), 2019
Jiaxin Qi
Yulei Niu
Jianqiang Huang
Hanwang Zhang
CML
738
162
0
24 Nov 2019
Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Zhe Gan
Yu Cheng
Ahmed El Kholy
Linjie Li
Jingjing Liu
Jianfeng Gao
476
109
0
01 Feb 2019