Visual Dialog

26 November 2016
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra

Papers citing "Visual Dialog"

50 / 597 papers shown
Chat-crowd: A Dialog-based Platform for Visual Layout Composition
Paola Cascante-Bonilla
Xuwang Yin
Vicente Ordonez
Song Feng
189
8
0
10 Dec 2018
Recursive Visual Attention in Visual Dialog
Yulei Niu
Hanwang Zhang
Manli Zhang
Jianhong Zhang
Zhiwu Lu
Ji-Rong Wen
202
122
0
06 Dec 2018
Multi-task Learning of Hierarchical Vision-Language Representation
Duy-Kien Nguyen
Takayuki Okatani
224
56
0
03 Dec 2018
Traversing the Continuous Spectrum of Image Retrieval with Deep Dynamic Models
Ziad Al-Halah
Andreas M. Lehrmann
Leonid Sigal
188
0
0
01 Dec 2018
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation
Computer Vision and Pattern Recognition (CVPR), 2018
Xin Eric Wang
Qiuyuan Huang
Asli Celikyilmaz
Jianfeng Gao
Dinghan Shen
Yuan-fang Wang
William Yang Wang
Lei Zhang
LM&Ro
SSL
347
591
0
25 Nov 2018
Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction
IEEE International Conference on Computer Vision (ICCV), 2018
Alaaeldin El-Nouby
Shikhar Sharma
Hannes Schulz
Devon Hjelm
Layla El Asri
Samira Ebrahimi Kahou
Yoshua Bengio
Graham W. Taylor
VLM
254
127
0
24 Nov 2018
Semantic bottleneck for computer vision tasks
Asian Conference on Computer Vision (ACCV), 2018
Apostolos Modas
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
145
17
0
06 Nov 2018
Image Chat: Engaging Grounded Conversations
Kurt Shuster
Samuel Humeau
Antoine Bordes
Jason Weston
297
122
0
02 Nov 2018
Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering
Medhini Narasimhan
Svetlana Lazebnik
Alex Schwing
NAI
GNN
ReLM
138
11
0
01 Nov 2018
A Corpus for Reasoning About Natural Language Grounded in Photographs
Alane Suhr
Stephanie Zhou
Ally Zhang
Iris Zhang
Huajun Bai
Yoav Artzi
LRM
417
668
0
01 Nov 2018
How2: A Large-scale Dataset for Multimodal Language Understanding
Ramon Sanabria
Ozan Caglayan
Shruti Palaskar
Desmond Elliott
Loïc Barrault
Lucia Specia
Florian Metze
VGen
MLLM
223
312
0
01 Nov 2018
Dial2Desc: End-to-end Dialogue Description Generation
Haojie Pan
Junpei Zhou
Zhou Zhao
Yan Liu
Deng Cai
Min Yang
VLM
98
14
0
01 Nov 2018
Fabrik: An Online Collaborative Neural Network Editor
Utsav Garg
Viraj Prabhu
Deshraj Yadav
Ram Ramrakhya
Harsh Agrawal
Dhruv Batra
GNN
106
4
0
27 Oct 2018
Engaging Image Captioning Via Personality
Kurt Shuster
Samuel Humeau
Hexiang Hu
Antoine Bordes
Jason Weston
136
160
0
25 Oct 2018
A Knowledge-Grounded Multimodal Search-Based Conversational Agent
Shubham Agarwal
Ondrej Dusek
Ioannis Konstas
Verena Rieser
141
22
0
20 Oct 2018
Learning to Globally Edit Images with Textual Description
Hai Wang
Jason D. Williams
Sin-Han Kang
DiffM
131
18
0
13 Oct 2018
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization
S. Ramakrishnan
Aishwarya Agrawal
Stefan Lee
AAML
217
259
0
08 Oct 2018
Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition
Jianwei Yang
Jiasen Lu
Stefan Lee
Dhruv Batra
Devi Parikh
181
42
0
01 Oct 2018
A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC
North American Chapter of the Association for Computational Linguistics (NAACL), 2018
Mark Yatskar
189
102
0
27 Sep 2018
Neural Approaches to Conversational AI
Jianfeng Gao
Michel Galley
Lihong Li
397
713
0
21 Sep 2018
Talking to myself: self-dialogues as data for conversational agents
Joachim Fainberg
Ben Krause
M. Dobre
Marco Damonte
Emmanuel Kahembwe
Daniel Duma
Bonnie Webber
Federico Fancellu
159
13
0
18 Sep 2018
LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts
Shuming Ma
Lei Cui
Damai Dai
Furu Wei
Xu Sun
VGen
162
64
0
13 Sep 2018
Game-Based Video-Context Dialogue
Ramakanth Pasunuru
Mohit Bansal
139
36
0
12 Sep 2018
The Wisdom of MaSSeS: Majority, Subjectivity, and Semantic Similarity in the Evaluation of VQA
Shailza Jolly
Sandro Pezzelle
T. Klein
Andreas Dengel
Moin Nabi
88
2
0
12 Sep 2018
Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat
Ravi Shekhar
Aashish Venkatesh
Tim Baumgärtner
Elia Bruni
Barbara Plank
Raffaella Bernardi
Raquel Fernández
130
51
0
10 Sep 2018
Visual Coreference Resolution in Visual Dialog using Neural Module Networks
Satwik Kottur
José M. F. Moura
Devi Parikh
Dhruv Batra
Marcus Rohrbach
186
168
0
06 Sep 2018
TVQA: Localized, Compositional Video Question Answering
Jie Lei
Licheng Yu
Mohit Bansal
Tamara L. Berg
408
714
0
05 Sep 2018
Learning a Policy for Opportunistic Active Learning
Aishwarya Padmakumar
Peter Stone
Raymond J. Mooney
173
22
0
29 Aug 2018
Interpretation of Natural Language Rules in Conversational Machine Reading
Marzieh Saeidi
Max Bartolo
Patrick Lewis
Sameer Singh
Tim Rocktäschel
Mike Sheldon
Guillaume Bouchard
Sebastian Riedel
112
164
0
28 Aug 2018
simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions
Fenglin Liu
Xuancheng Ren
Yuanxin Liu
Houfeng Wang
Xu Sun
189
69
0
27 Aug 2018
CoQA: A Conversational Question Answering Challenge
Siva Reddy
Danqi Chen
Christopher D. Manning
RALM
HAI
338
1,310
0
21 Aug 2018
QuAC : Question Answering in Context
Eunsol Choi
He He
Mohit Iyyer
Mark Yatskar
Wen-tau Yih
Yejin Choi
Percy Liang
Luke Zettlemoyer
298
878
0
21 Aug 2018
Context-Aware Visual Policy Network for Sequence-Level Image Captioning
Daqing Liu
Zhengjun Zha
Hanwang Zhang
Yongdong Zhang
Feng Wu
CLIP
246
104
0
16 Aug 2018
Live Video Comment Generation Based on Surrounding Frames and Live Comments
Damai Dai
VGen
59
0
0
13 Aug 2018
Multimodal Differential Network for Visual Question Generation
Badri N. Patro
Sandeep Kumar
V. Kurmi
Vinay P. Namboodiri
201
40
0
12 Aug 2018
Community Regularization of Visually-Grounded Dialog
Akshat Agarwal
Swaminathan Gurumurthy
Vasu Sharma
M. Lewis
Katia Sycara
133
10
0
10 Aug 2018
Visual Reasoning with Multi-hop Feature Modulation
Florian Strub
Mathieu Seurin
Ethan Perez
H. D. Vries
Jérémie Mary
Philippe Preux
Aaron Courville
Olivier Pietquin
217
28
0
03 Aug 2018
Graph R-CNN for Scene Graph Generation
Jianwei Yang
Jiasen Lu
Stefan Lee
Dhruv Batra
Devi Parikh
GNN
310
898
0
01 Aug 2018
Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features
Xu Yang
Hanwang Zhang
Jianfei Cai
205
74
0
01 Aug 2018
Pythia v0.1: the Winning Entry to the VQA Challenge 2018
Yu Jiang
Vivek Natarajan
Xinlei Chen
Marcus Rohrbach
Dhruv Batra
Devi Parikh
VLM
297
206
0
26 Jul 2018
Talk the Walk: Navigating New York City through Grounded Dialogue
H. D. Vries
Kurt Shuster
Dhruv Batra
Devi Parikh
Jason Weston
Douwe Kiela
326
128
0
09 Jul 2018
Amanuensis: The Programmer's Apprentice
Thomas Dean
Maurice Chiang
Marcus Gomez
Nate Gruver
Yousef Hindy
...
S. Sanchez
Rohun Saxena
Michael Smith
Lucy Wang
Catherine Wong
84
3
0
29 Jun 2018
End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features
Chiori Hori
Huda AlAmri
Jue Wang
Gordon Wichern
Takaaki Hori
...
Raphael Gontijo-Lopes
Abhishek Das
Irfan Essa
Dhruv Batra
Devi Parikh
VGen
199
130
0
21 Jun 2018
Grounded Textual Entailment
H. Vu
Claudio Greco
A. Erofeeva
Somayeh Jafaritazehjan
Guido M. Linders
Marc Tanti
A. Testoni
Raffaella Bernardi
Albert Gatt
179
31
0
14 Jun 2018
iParaphrasing: Extracting Visually Grounded Paraphrases via an Image
Chenhui Chu
Mayu Otani
Yuta Nakashima
116
8
0
12 Jun 2018
Audio Visual Scene-Aware Dialog (AVSD) Challenge at DSTC7
Huda AlAmri
Vincent Cartillier
Raphael Gontijo-Lopes
Abhishek Das
Jue Wang
...
Dhruv Batra
Devi Parikh
A. Cherian
Tim K. Marks
Chiori Hori
132
34
0
01 Jun 2018
Video Description: A Survey of Methods, Datasets and Evaluation Metrics
Nayyer Aafaq
Lin Wang
Wen Liu
Syed Zulqarnain Gilani
Mubarak Shah
461
100
0
01 Jun 2018
Visual Referring Expression Recognition: What Do Systems Actually Learn?
Volkan Cirik
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
122
65
0
30 May 2018
Ask No More: Deciding when to guess in referential visual dialogue
Ravi Shekhar
Tim Baumgärtner
Aashish Venkatesh
Elia Bruni
Raffaella Bernardi
Raquel Fernández
135
22
0
17 May 2018
Multimodal Hierarchical Reinforcement Learning Policy for Task-Oriented Visual Dialog
Jiaping Zhang
Tiancheng Zhao
Zhou Yu
131
41
0
08 May 2018