Visual Dialog

26 November 2016
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra

Papers citing "Visual Dialog"

50 / 597 papers shown
Dialogue Object Search
Monica V. Roy
Kaiyu Zheng
Jason Liu
Stefanie Tellex
LM&Ro
160
1
0
22 Jul 2021
Constructing Multi-Modal Dialogue Dataset by Replacing Text with Semantically Relevant Images
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Nyoungwoo Lee
Suwon Shin
Jaegul Choo
Ho-Jin Choi
S. Myaeng
179
31
0
19 Jul 2021
Modeling Explicit Concerning States for Reinforcement Learning in Visual Dialogue
Zipeng Xu
Fandong Meng
Caixia Yuan
Duo Zheng
Chenxu Lv
Jie Zhou
OffRL
175
6
0
12 Jul 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
2.2K
8,106
0
07 Jul 2021
Productivity, Portability, Performance: Data-Centric Python
Yiheng Wang
Yao Zhang
Yanzhang Wang
Yan Wan
Jiao Wang
Zhongyuan Wu
Yuhao Yang
Bowen She
416
112
0
01 Jul 2021
Unified Questioner Transformer for Descriptive Question Generation in Goal-Oriented Visual Dialogue
IEEE International Conference on Computer Vision (ICCV), 2021
Shoya Matsumori
Kosuke Shingyouchi
Yukikoko Abe
Yosuke Fukuchi
K. Sugiura
M. Imai
184
17
0
29 Jun 2021
Saying the Unseen: Video Descriptions via Dialog Agents
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Ye Zhu
Yu Wu
Yi Yang
Yan Yan
213
8
0
26 Jun 2021
Exploring Semantic Relationships for Unpaired Image Captioning
Fenglin Liu
Meng Gao
Tianhao Zhang
Yuexian Zou
316
7
0
20 Jun 2021
$C^3$: Compositional Counterfactual Contrastive Learning for Video-grounded Dialogues
Hung Le
Nancy F. Chen
Guosheng Lin
151
2
0
16 Jun 2021
Are VQA Systems RAD? Measuring Robustness to Augmented Data with Focused Interventions
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Daniel Rosenberg
Itai Gat
Amir Feder
Roi Reichart
AAML
276
16
0
08 Jun 2021
Maria: A Visual Experience Powered Conversational Agent
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Zujie Liang
Huang Hu
Can Xu
Chongyang Tao
Xiubo Geng
Yining Chen
Fan Liang
Daxin Jiang
204
33
0
27 May 2021
Learning Better Visual Dialog Agents with Pretrained Visual-Linguistic Representation
Computer Vision and Pattern Recognition (CVPR), 2021
Tao Tu
Q. Ping
Govind Thattai
Gokhan Tur
Premkumar Natarajan
188
18
0
24 May 2021
Conversational AI Systems for Social Good: Opportunities and Challenges
Peng Qi
Jing Huang
Youzheng Wu
Xiaodong He
Bowen Zhou
242
5
0
13 May 2021
SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Satwik Kottur
Seungwhan Moon
A. Geramifard
Babak Damavandi
243
98
0
18 Apr 2021
Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models
Tejas Srinivasan
Yonatan Bisk
VLM
309
63
0
18 Apr 2021
Ensemble of MRR and NDCG models for Visual Dialog
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Idan Schwartz
273
10
0
15 Apr 2021
BERT Embeddings Can Track Context in Conversational Search
Rafael Ferreira
David Semedo
João Magalhães
AI4TS
130
0
0
13 Apr 2021
Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Derek Chen
Howard Chen
Yi Yang
A. Lin
Zhou Yu
215
78
0
01 Apr 2021
Towards General Purpose Vision Systems
Computer Vision and Pattern Recognition (CVPR), 2021
Tanmay Gupta
Amita Kamath
Aniruddha Kembhavi
Derek Hoiem
298
56
0
01 Apr 2021
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
Computer Vision and Pattern Recognition (CVPR), 2021
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Linbo Jin
Ben Chen
Hao Zhou
Minghui Qiu
Ling Shao
VLM
350
135
0
30 Mar 2021
Structured Co-reference Graph Attention for Video-grounded Dialogue
AAAI Conference on Artificial Intelligence (AAAI), 2021
Junyeong Kim
Sunjae Yoon
Dahyun Kim
Chang D. Yoo
203
30
0
24 Mar 2021
The Interplay of Task Success and Dialogue Quality: An in-depth Evaluation in Task-Oriented Visual Dialogues
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
A. Testoni
Raffaella Bernardi
98
4
0
20 Mar 2021
Overprotective Training Environments Fall Short at Testing Time: Let Models Contribute to Their Own Training
Italian Conference on Computational Linguistics (CLiC-it), 2021
A. Testoni
Raffaella Bernardi
126
2
0
20 Mar 2021
What is Multimodality?
Letitia Parcalabescu
Nils Trost
Anette Frank
230
0
0
10 Mar 2021
MultiSubs: A Large-scale Multimodal and Multilingual Dataset
International Conference on Language Resources and Evaluation (LREC), 2021
Josiah Wang
Pranava Madhyastha
J. Figueiredo
Chiraag Lala
Lucia Specia
VGen
197
13
0
02 Mar 2021
Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query
IEEE International Conference on Computer Vision (ICCV), 2021
Guanyu Cai
Jun Zhang
Xinyang Jiang
Yifei Gong
Lianghua He
Fufu Yu
Pai Peng
Xiaowei Guo
Feiyue Huang
Xing Sun
236
17
0
02 Mar 2021
Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues
International Conference on Learning Representations (ICLR), 2021
Hung Le
Nancy F. Chen
Guosheng Lin
190
18
0
01 Mar 2021
Learning Compositional Representation for Few-shot Visual Question Answering
Dalu Guo
Dacheng Tao
OOD
CoGe
153
4
0
21 Feb 2021
I Want This Product but Different: Multimodal Retrieval with Synthetic Query Expansion
Ivona Tautkute
Tomasz Trzciński
235
5
0
17 Feb 2021
An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
Alessandro Suglia
Yonatan Bisk
Ioannis Konstas
Antonio Vergari
E. Bastianelli
Andrea Vanzo
Oliver Lemon
152
8
0
31 Jan 2021
VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
Computer Vision and Pattern Recognition (CVPR), 2021
Xudong Lin
Gedas Bertasius
Jue Wang
Shih-Fu Chang
Devi Parikh
Lorenzo Torresani
VGen
252
74
0
28 Jan 2021
DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents
AAAI Conference on Artificial Intelligence (AAAI), 2021
Tsu-Jui Fu
Wenjie Wang
Daniel J. McDuff
Yale Song
305
72
0
28 Jan 2021
Adversarial Text-to-Image Synthesis: A Review
Neural Networks (NN), 2021
Stanislav Frolov
Tobias Hinz
Federico Raue
Jörn Hees
Andreas Dengel
EGVM
322
202
0
25 Jan 2021
DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Hung Le
Chinnadhurai Sankar
Seungwhan Moon
Ahmad Beirami
A. Geramifard
Satwik Kottur
VGen
272
23
0
01 Jan 2021
Image-to-Image Retrieval by Learning Similarity between Scene Graphs
AAAI Conference on Artificial Intelligence (AAAI), 2020
Sangwoong Yoon
Woo-Young Kang
Sungwook Jeon
SeongEun Lee
C. Han
Jonghun Park
Eun-Sol Kim
3DH
220
54
0
29 Dec 2020
On Modality Bias in the TVQA Dataset
British Machine Vision Conference (BMVC), 2020
T. Winterbottom
S. Xiao
A. McLean
Noura Al Moubayed
174
44
0
18 Dec 2020
A Response Retrieval Approach for Dialogue Using a Multi-Attentive Transformer
M. A. Senese
A. Benincasa
Barbara Caputo
Giuseppe Rizzo
117
4
0
15 Dec 2020
Learning Contextual Causality from Time-consecutive Images
Hongming Zhang
Yintong Huo
Xinran Zhao
Yangqiu Song
Dan Roth
CML
147
6
0
13 Dec 2020
Look Before you Speak: Visually Contextualized Utterances
Computer Vision and Pattern Recognition (CVPR), 2020
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
313
71
0
10 Dec 2020
Debiased-CAM to mitigate image perturbations with faithful visual explanations of machine learning
International Conference on Human Factors in Computing Systems (CHI), 2020
Wencan Zhang
Mariella Dimiccoli
Brian Y. Lim
FAtt
377
20
0
10 Dec 2020
Point and Ask: Incorporating Pointing into Visual Question Answering
Arjun Mani
Nobline Yoo
William Fu-Hinthorn
Olga Russakovsky
3DPC
397
42
0
27 Nov 2020
A Recurrent Vision-and-Language BERT for Navigation
Computer Vision and Pattern Recognition (CVPR), 2020
Yicong Hong
Qi Wu
Yuankai Qi
Cristian Rodriguez-Opazo
Stephen Gould
LM&Ro
326
385
0
26 Nov 2020
Improving Calibration in Deep Metric Learning With Cross-Example Softmax
Andreas Veit
Kimberly Wilber
72
3
0
17 Nov 2020
Where Are You? Localization from Embodied Dialog
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Meera Hahn
Jacob Krantz
Dhruv Batra
Devi Parikh
James M. Rehg
Stefan Lee
Peter Anderson
LM&Ro
195
33
0
16 Nov 2020
Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts
Ece Takmaz
Mario Giulianelli
Sandro Pezzelle
Arabella J. Sinclair
Raquel Fernández
149
31
0
09 Nov 2020
DTGAN: Dual Attention Generative Adversarial Networks for Text-to-Image Generation
Zhenxing Zhang
Lambert Schomaker
GAN
266
43
0
05 Nov 2020
Learning to Respond with Your Favorite Stickers: A Framework of Unifying Multi-Modality and User Preference in Multi-Turn Dialog
Shen Gao
Preslav Nakov
Li Liu
Dongyan Zhao
Rui Yan
207
18
0
05 Nov 2020
Reasoning Over History: Context Aware Visual Dialog
Muhammad A. Shah
Shikib Mehri
Tejas Srinivasan
158
4
0
02 Nov 2020
Co-attentional Transformers for Story-Based Video Understanding
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Björn Bebensee
Byoung-Tak Zhang
144
7
0
27 Oct 2020
Reading Between the Lines: Exploring Infilling in Visual Narratives
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Khyathi Chandu
Ruo-Ping Dong
A. Black
153
4
0
26 Oct 2020