Visual Dialog

26 November 2016
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra

Papers citing "Visual Dialog"

50 / 597 papers shown
Spot the Difference: A Cooperative Object-Referring Game in Non-Perfectly Co-Observable Scene
Duo Zheng
Fandong Meng
Q. Si
Hairun Fan
Zipeng Xu
Jie Zhou
Fangxiang Feng
Xiaojie Wang
183
0
0
16 Mar 2022
Taking an Emotional Look at Video Paragraph Captioning
Qinyu Li
Tengpeng Li
Hanli Wang
Changan Chen
191
7
0
12 Mar 2022
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant
European Conference on Computer Vision (ECCV), 2022
B. Wong
Joya Chen
You Wu
Stan Weixian Lei
Dongxing Mao
Difei Gao
Mike Zheng Shou
EgoV
456
34
0
08 Mar 2022
Towards Building an Open-Domain Dialogue System Incorporated with Internet Memes
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Hua Lu
Zhen Guo
Chanjuan Li
Yunyi Yang
H. He
Siqi Bao
182
7
0
08 Mar 2022
Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition
IEEE Transactions on Multimedia (IEEE TMM), 2022
Peipei Zhu
Tianlin Li
Yong Luo
Zhenglong Sun
Wei-Shi Zheng
Yaowei Wang
Chen Chen
217
15
0
07 Mar 2022
Modeling Coreference Relations in Visual Dialog
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Mingxiao Li
Marie-Francine Moens
127
10
0
06 Mar 2022
Vision-Language Intelligence: Tasks, Representation Learning, and Large Models
Feng Li
Hao Zhang
Yi-Fan Zhang
Shixuan Liu
Jian Guo
L. Ni
Pengchuan Zhang
Lei Zhang
AI4TS, VLM
211
41
0
03 Mar 2022
CAISE: Conversational Agent for Image Search and Editing
AAAI Conference on Artificial Intelligence (AAAI), 2022
Hyounghun Kim
Doo Soon Kim
Seunghyun Yoon
Franck Dernoncourt
Trung Bui
Joey Tianyi Zhou
212
6
0
24 Feb 2022
VU-BERT: A Unified framework for Visual Dialog
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Tong Ye
Shijing Si
Jianzong Wang
Rui Wang
Ning Cheng
Jing Xiao
MLLM
181
5
0
22 Feb 2022
VLP: A Survey on Vision-Language Pre-training
Machine Intelligence Research (MIR), 2022
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
393
289
0
18 Feb 2022
The slurk Interaction Server Framework: Better Data for Better Dialog Models
International Conference on Language Resources and Evaluation (LREC), 2022
Jana Gotze
Maike Paetzel-Prusmann
Wencke Liermann
Tim Diekmann
David Schlangen
VLM
155
12
0
02 Feb 2022
Debiased-CAM to mitigate systematic error with faithful visual explanations of machine learning
Wencan Zhang
Mariella Dimiccoli
Brian Y. Lim
FAtt
210
1
0
30 Jan 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
International Conference on Machine Learning (ICML), 2022
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM, BDL, VLM, CLIP
1.3K
5,818
0
28 Jan 2022
Interpretable Learned Emergent Communication for Human-Agent Teams
IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2022
Seth Karten
Mycal Tucker
Huao Li
Siva Kailas
Michael Lewis
Katia Sycara
AI4CE
265
13
0
19 Jan 2022
Self-directed Machine Learning
AI Open (AO), 2022
Wenwu Zhu
Xin Eric Wang
P. Xie
155
6
0
04 Jan 2022
Ditch the Gold Standard: Re-evaluating Conversational Question Answering
Huihan Li
Tianyu Gao
Manan Goenka
Danqi Chen
201
23
0
16 Dec 2021
3D Question Answering
Shuquan Ye
Dongdong Chen
Songfang Han
Jing Liao
ViT
260
60
0
15 Dec 2021
VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena
Letitia Parcalabescu
Michele Cafagna
Lilitta Muradjan
Anette Frank
Iacer Calixto
Albert Gatt
CoGe
302
135
0
14 Dec 2021
Multimodal Interactions Using Pretrained Unimodal Models for SIMMC 2.0
Joosung Lee
Kijong Han
227
6
0
10 Dec 2021
Self-Supervised Image-to-Text and Text-to-Image Synthesis
Anindya Sundar Das
S. Saha
SSL
93
6
0
09 Dec 2021
Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text
Christopher Clark
Jordi Salvador
Dustin Schwenk
Derrick Bonafilia
Mark Yatskar
...
Aaron Sarnat
Hannaneh Hajishirzi
Aniruddha Kembhavi
Oren Etzioni
Ali Farhadi
MLLM
141
7
0
01 Dec 2021
Classification-Regression for Chart Comprehension
Matan Levy
Rami Ben-Ari
Dani Lischinski
156
17
0
29 Nov 2021
Building Goal-Oriented Dialogue Systems with Situated Visual Context
AAAI Conference on Artificial Intelligence (AAAI), 2021
Sanchit Agarwal
Jan Jezabek
Arijit Biswas
Emre Barut
Shuyang Gao
Tagyoung Chung
171
1
0
22 Nov 2021
CoLLIE: Continual Learning of Language Grounding from Language-Image Embeddings
Journal of Artificial Intelligence Research (JAIR), 2021
Gabriel Skantze
Bram Willemsen
VLM
215
14
0
15 Nov 2021
NarrationBot and InfoBot: A Hybrid System for Automated Video Description
Shasta Ihorn
Y. Siu
Aditya Bodi
Lothar D Narins
Jose M. Castanon
Yash Kant
Abhishek Das
Ilmi Yoon
Pooyan Fazli
110
6
0
07 Nov 2021
Perceptual Score: What Data Modalities Does Your Model Perceive?
Itai Gat
Idan Schwartz
Alex Schwing
207
43
0
27 Oct 2021
Simple Dialogue System with AUDITED
British Machine Vision Conference (BMVC), 2021
Eugenio Clerico
Piotr Koniusz
205
2
0
22 Oct 2021
Evaluating and Improving Interactions with Hazy Oracles
Stephan J. Lemmer
Jason J. Corso
175
2
0
19 Oct 2021
Multimodal Dialogue Response Generation
Qingfeng Sun
Yujing Wang
Can Xu
Kai Zheng
Yaming Yang
Huang Hu
Fei Xu
Jessica Zhang
Xiubo Geng
Daxin Jiang
248
52
0
16 Oct 2021
A Framework for Learning to Request Rich and Contextually Useful Information from Humans
Khanh Nguyen
Yonatan Bisk
Hal Daumé
477
21
0
14 Oct 2021
Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning
Ankit Parag Shah
Shijie Geng
Shiyang Feng
A. Cherian
Takaaki Hori
Tim K. Marks
Jonathan Le Roux
Chiori Hori
206
26
0
13 Oct 2021
Collecting and Characterizing Natural Language Utterances for Specifying Data Visualizations
Arjun Srinivasan
Nikhila Nyapathy
Bongshin Lee
Steven Drucker
J. Stasko
189
83
0
01 Oct 2021
The JDDC 2.0 Corpus: A Large-Scale Multimodal Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service
Nan Zhao
Haoran Li
Youzheng Wu
Xiaodong He
Bowen Zhou
143
9
0
27 Sep 2021
CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
Yuan Yao
Ao Zhang
Zhengyan Zhang
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
MLLM, VPVLM, VLM
580
244
0
24 Sep 2021
Learning Natural Language Generation from Scratch
Alice Martin Donati
Guillaume Quispe
Charles Ollion
Sylvain Le Corff
Florian Strub
Olivier Pietquin
LRM
149
4
0
20 Sep 2021
Multimodal Incremental Transformer with Visual Grounding for Visual Dialogue Generation
Feilong Chen
Fandong Meng
Xiuyi Chen
Peng Li
Jie Zhou
183
25
0
17 Sep 2021
GoG: Relation-aware Graph-over-Graph Network for Visual Dialog
Feilong Chen
Xiuyi Chen
Fandong Meng
Peng Li
Jie Zhou
272
37
0
17 Sep 2021
Knowledge-based Embodied Question Answering
Sinan Tan
Mengmeng Ge
Di Guo
Huaping Liu
F. Sun
266
38
0
16 Sep 2021
Learning to Ground Visual Objects for Visual Dialog
Feilong Chen
Xiuyi Chen
Can Xu
Daxin Jiang
OOD
192
18
0
13 Sep 2021
Reference-Centric Models for Grounded Collaborative Dialogue
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Daniel Fried
Justin T. Chiu
Dan Klein
179
22
0
10 Sep 2021
We went to look for meaning and all we got were these lousy representations: aspects of meaning representation for computational semantics
Simon Dobnik
R. Cooper
Adam Ek
Bill Noble
Staffan Larsson
N. Ilinykh
Vladislav Maraev
Vidya Somashekarappa
138
0
0
10 Sep 2021
Exophoric Pronoun Resolution in Dialogues with Topic Regularization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Xintong Yu
Hongming Zhang
Yangqiu Song
Changshui Zhang
Kun Xu
Dong Yu
151
5
0
10 Sep 2021
Enhancing Visual Dialog Questioner with Entity-based Strategy Learning and Augmented Guesser
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Duo Zheng
Zipeng Xu
Fandong Meng
Caixia Yuan
Jiaan Wang
Jie Zhou
140
13
0
06 Sep 2021
Towards Expressive Communication with Internet Memes: A New Multimodal Conversation Dataset and Benchmark
Zhengcong Fei
Zekang Li
Jinchao Zhang
Yang Feng
Jie Zhou
135
22
0
04 Sep 2021
MMChat: Multi-Modal Chat Dataset on Social Media
Yinhe Zheng
Guanyi Chen
Xin Liu
K. Lin
333
38
0
16 Aug 2021
Embodied BERT: A Transformer Model for Embodied, Language-guided Visual Task Completion
Alessandro Suglia
Qiaozi Gao
Jesse Thomason
Govind Thattai
Gaurav Sukhatme
LM&Ro
284
84
0
10 Aug 2021
Hybrid Reasoning Network for Video-based Commonsense Captioning
ACM Multimedia (ACM MM), 2021
Weijiang Yu
Jian Liang
Lei Ji
Lu Li
Yuejian Fang
Nong Xiao
Nan Duan
193
11
0
05 Aug 2021
O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning
Findings (Findings), 2021
Fenglin Liu
Xuancheng Ren
Xian Wu
Bang-ju Yang
Shen Ge
Yuexian Zou
Xu Sun
246
38
0
05 Aug 2021
Chest ImaGenome Dataset for Clinical Reasoning
Joy T. Wu
Nkechinyere N. Agu
Ismini Lourentzou
Arjun Sharma
J. Paguio
...
William Mitchell
Satyananda Kashyap
Andrea Giovannini
Leo Anthony Celi
Mehdi Moradi
247
91
0
31 Jul 2021
Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Bingqian Lin
Yi Zhu
Yanxin Long
Xiaodan Liang
QiXiang Ye
Liang Lin
AAML
204
20
0
23 Jul 2021