
Visual Dialog

26 November 2016

Devi Parikh

Papers citing "Visual Dialog"

50 / 597 papers shown

VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

150

24 Oct 2022

Towards Unifying Reference Expression Generation and Comprehension

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

177

24 Oct 2022

McQueen: a Benchmark for Multimodal Conversational Query Rewrite

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

116

23 Oct 2022

Extending Phrase Grounding with Pronouns in Visual Dialogues

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Min Zhang

193

23 Oct 2022

Learning Point-Language Hierarchical Alignment for 3D Visual Grounding

321

22 Oct 2022

Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Wenlin Yao

216

21 Oct 2022

Selective Query-guided Debiasing for Video Corpus Moment Retrieval

European Conference on Computer Vision (ECCV), 2022

417

17 Oct 2022

MAPL: Parameter-Efficient Adaptation of Unimodal Pre-Trained Models for Vision-Language Few-Shot Prompting

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

261

13 Oct 2022

Embodied Referring Expression for Manipulation Question Answering in Interactive Environment

IEEE International Conference on Robotics and Automation (ICRA), 2022

Qie Sima

Sinan Tan

Huaping Liu

LM&Ro

157

06 Oct 2022

Vision+X: A Survey on Multimodal Learning in the Light of Data

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Ye Zhu

Yuehua Wu

Andrii Zadaianchuk

Yan Yan

361

05 Oct 2022

Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning

International Journal of Computer Vision (IJCV), 2022

Jianfei Cai

273

04 Oct 2022

Towards Explainable 3D Grounded Visual Question Answering: A New Benchmark and Strong Baseline

184

24 Sep 2022

I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification

Neural Information Processing Systems (NeurIPS), 2022

Muhammad Ferjad Naeem

Yongqin Xian

Luc Van Gool

F. Tombari

VLM

198

21 Sep 2022

Selecting Stickers in Open-Domain Dialogue through Multitask Learning

Findings (Findings), 2022

Zhexin Zhang

Yeshuang Zhu

Zhengcong Fei

Jinchao Zhang

Jie Zhou

145

16 Sep 2022

LAVIS: A Library for Language-Vision Intelligence

Silvio Savarese

334

15 Sep 2022

Interactive Question Answering Systems: Literature Review

ACM Computing Surveys (ACM CSUR), 2022

Giovanni Maria Biancofiore

403

04 Sep 2022

Neuro-Symbolic Visual Dialog

International Conference on Computational Linguistics (COLING), 2022

193

22 Aug 2022

Video Question Answering with Iterative Video-Text Co-Tokenization

European Conference on Computer Vision (ECCV), 2022

236

01 Aug 2022

Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Yang Liu

Guanbin Li

LRM

572

148

26 Jul 2022

Explicit Image Caption Editing

European Conference on Computer Vision (ECCV), 2022

183

20 Jul 2022

Deep Sequence Models for Text Classification Tasks

S. S. Abdullahi

Su Yiming

Shamsuddeen Hassan Muhammad

Saminu Mohammad Aliyu

128

18 Jul 2022

Scene Graph for Embodied Exploration in Cluttered Scenario

287

16 Jul 2022

Modeling Non-Cooperative Dialogue: Theoretical and Empirical Insights

Transactions of the Association for Computational Linguistics (TACL), 2022

153

15 Jul 2022

Video Dialog as Conversation about Objects Living in Space-Time

European Conference on Computer Vision (ECCV), 2022

213

08 Jul 2022

Adversarial Robustness of Visual Dialog

Lu Yu

Verena Rieser

AAML

192

06 Jul 2022

Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review

245

02 Jul 2022

Technical Report for CVPR 2022 LOVEU AQTC Challenge

29 Jun 2022

Winning the CVPR'2022 AQTC Challenge: A Two-stage Function-centric Approach

Enhong Chen

235

20 Jun 2022

Multimodal Dialogue State Tracking

North American Chapter of the Association for Computational Linguistics (NAACL), 2022

Hung Le

Nancy F. Chen

Guosheng Lin

158

16 Jun 2022

Multimodal Learning with Transformers: A Survey

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

567

846

13 Jun 2022

VD-PCR: Improving Visual Dialog with Pronoun Coreference Resolution

Pattern Recognition (Pattern Recogn.), 2022

184

29 May 2022

Prompt-based Learning for Unpaired Image Captioning

IEEE Transactions on Multimedia (IEEE TMM), 2022

Yaowei Wang

210

26 May 2022

Multimodal Knowledge Alignment with Reinforcement Learning

...

Prithviraj Ammanabrolu

Yejin Choi

291

25 May 2022

The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training

Computer Vision and Pattern Recognition (CVPR), 2022

293

25 May 2022

Multimodal Conversational AI: A Survey of Datasets and Approaches

Anirudh S. Sundar

Larry Heck

166

13 May 2022

Learning to Retrieve Videos by Asking Questions

ACM Multimedia (ACM MM), 2022

Avinash Madasu

Junier Oliva

Gedas Bertasius

VGen

319

11 May 2022

Chart Question Answering: State of the Art and Future Directions

Enamul Hoque

P. Kavehzadeh

Ahmed Masry

154

08 May 2022

Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering

279

02 May 2022

UTC: A Unified Transformer with Inter-Task Contrastive Learning for Visual Dialog

Computer Vision and Pattern Recognition (CVPR), 2022

Xin Jiang

Qun Liu

X. Gu

267

01 May 2022

Flamingo: a Visual Language Model for Few-Shot Learning

Neural Information Processing Systems (NeurIPS), 2022

Jean-Baptiste Alayrac

...

697

4,901

29 Apr 2022

Supplementing Missing Visions via Dialog for Scene Graph Generations

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Yan Yan

200

23 Apr 2022

Improving Cross-Modal Understanding in Visual Dialog via Contrastive Learning

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Bo Xu

160

15 Apr 2022

Can Visual Dialogue Models Do Scorekeeping? Exploring How Dialogue Representations Incrementally Encode Shared Knowledge

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Brielen Madureira

David Schlangen

153

14 Apr 2022

Reasoning with Multi-Structure Commonsense Knowledge in Visual Dialog

164

10 Apr 2022

There Are a Thousand Hamlets in a Thousand People's Eyes: Enhancing Knowledge-grounded Dialogue with Personal Memory

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

179

06 Apr 2022

Co-VQA : Answering by Interactive Sub Question Sequence

Findings (Findings), 2022

165

02 Apr 2022

FindIt: Generalized Localization with Natural Language Queries

European Conference on Computer Vision (ECCV), 2022

212

31 Mar 2022

Image Retrieval from Contextual Descriptions

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Siva Reddy

256

29 Mar 2022

Fine-Grained Visual Entailment

European Conference on Computer Vision (ECCV), 2022

Christopher Thomas

Yipeng Zhang

Shih-Fu Chang

298

29 Mar 2022

How do you Converse with an Analytical Chatbot? Revisiting Gricean Maxims for Designing Analytical Conversational Behavior

International Conference on Human Factors in Computing Systems (CHI), 2022

V. Setlur

Melanie Tory

151

16 Mar 2022