Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering

2 December 2016

Devi Parikh

Papers citing "Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering"

50 / 2,273 papers shown

The meaning of "most" for visual question answering models A. Kuhnle Ann A. Copestake 130 4 0 31 Dec 2018
Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering Zhuoqian Yang Zengchang Qin Jing Yu Yue Hu GNN 127 16 0 23 Dec 2018
From FiLM to Video: Multi-turn Question Answering with Multi-modal Context T. Nguyen Shikhar Sharma Hannes Schulz Layla El Asri 122 34 0 17 Dec 2018
Visual Social Relationship Recognition Junnan Li Yongkang Wong Qi Zhao Mohan Kankanhalli 111 28 0 13 Dec 2018
Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering Shiyang Feng Zhengkai Jiang Haoxuan You Pan Lu Steven C. H. Hoi Xiaogang Wang Jiaming Song AIMat 428 393 0 13 Dec 2018
Learning Representations of Sets through Optimized Permutations Yan Zhang Jonathon S. Hare Adam Prugel-Bennett SSL 159 28 0 10 Dec 2018
Learning to Compose Dynamic Tree Structures for Visual Contexts Kaihua Tang Hanwang Zhang Baoyuan Wu Tong Lu Wen Liu 249 550 0 05 Dec 2018
Explainable and Explicit Visual Reasoning over Scene Graphs Jiaxin Shi Hanwang Zhang Juan-Zi Li OCL 418 250 0 05 Dec 2018
Learning to Explain with Complemental Examples Atsushi Kanehira Tatsuya Harada 187 43 0 04 Dec 2018
Multimodal Explanations by Predicting Counterfactuality in Videos Atsushi Kanehira Kentaro Takemoto S. Inayoshi Tatsuya Harada 123 41 0 04 Dec 2018
Multi-task Learning of Hierarchical Vision-Language Representation Duy-Kien Nguyen Takayuki Okatani 228 56 0 03 Dec 2018
Learning to Caption Images through a Lifetime by Asking Questions Tingke Shen Amlan Kar Sanja Fidler 222 31 0 01 Dec 2018
From Known to the Unknown: Transferring Knowledge to Answer Questions about Novel Visual and Semantic Concepts M. Farazi Salman H Khan Nick Barnes 132 13 0 30 Nov 2018
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments Howard Chen Alane Suhr Dipendra Kumar Misra Noah Snavely Yoav Artzi 481 435 0 29 Nov 2018
From Recognition to Cognition: Visual Commonsense Reasoning Rowan Zellers Yonatan Bisk Ali Farhadi Yejin Choi LRM BDL OCL ReLM 588 984 0 27 Nov 2018
Visual Entailment Task for Visually-Grounded Language Learning Ning Xie Farley Lai Derek Doran Asim Kadav 121 59 0 26 Nov 2018
VQA with no questions-answers training Computer Vision and Pattern Recognition (CVPR), 2018 B. Vatashsky S. Ullman 208 13 0 20 Nov 2018
Explicit Bias Discovery in Visual Question Answering Models Computer Vision and Pattern Recognition (CVPR), 2018 Varun Manjunatha Nirat Saini L. Davis CML FAtt 180 96 0 19 Nov 2018
On transfer learning using a MAC model variant Vincent Marois T. S. Jayram V. Albouy Tomasz Kornuta Younes Bouhadjar A. Ozcan DRL 194 9 0 15 Nov 2018
Holistic Multi-modal Memory Network for Movie Question Answering Anran Wang Anh Tuan Luu Chuan-Sheng Foo Erik Cambria Yi Tay V. Chandrasekhar 171 20 0 12 Nov 2018
Shifting the Baseline: Single Modality Performance on Visual Navigation & QA Jesse Thomason Daniel Gordon Yonatan Bisk 254 80 0 01 Nov 2018
A Corpus for Reasoning About Natural Language Grounded in Photographs Alane Suhr Stephanie Zhou Ally Zhang Iris Zhang Huajun Bai Yoav Artzi LRM 417 670 0 01 Nov 2018
TallyQA: Answering Complex Counting Questions Manoj Acharya Kushal Kafle Christopher Kanan 220 162 0 29 Oct 2018
Do Explanations make VQA Models more Predictable to a Human? Arjun Chandrasekaran Viraj Prabhu Deshraj Yadav Prithvijit Chattopadhyay Devi Parikh FAtt 226 102 0 29 Oct 2018
Understand, Compose and Respond - Answering Visual Questions by a Composition of Abstract Procedures B. Vatashsky S. Ullman CoGe 126 2 0 25 Oct 2018
Knowing Where to Look? Analysis on Attention of Visual Question Answering System Wei Li Zehuan Yuan Xiangzhong Fang Changhu Wang 94 8 0 09 Oct 2018
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization S. Ramakrishnan Aishwarya Agrawal Stefan Lee AAML 221 259 0 08 Oct 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding Kexin Yi Jiajun Wu Chuang Gan Antonio Torralba Pushmeet Kohli J. Tenenbaum NAI 286 654 0 04 Oct 2018
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering Hyeonwoo Noh Taehoon Kim Jonghwan Mun Bohyung Han 192 17 0 03 Oct 2018
The Wisdom of MaSSeS: Majority, Subjectivity, and Semantic Similarity in the Evaluation of VQA Shailza Jolly Sandro Pezzelle T. Klein Andreas Dengel Moin Nabi 88 2 0 12 Sep 2018
How clever is the FiLM model, and how clever can it be? A. Kuhnle Huiyuan Xie Ann A. Copestake 151 7 0 09 Sep 2018
What If We Simply Swap the Two Text Fragments? A Straightforward yet Effective Way to Test the Robustness of Methods to Confounding Signals in Nature Language Inference Tasks Haohan Wang Da-You Sun Eric Xing 210 42 0 07 Sep 2018
Visual Coreference Resolution in Visual Dialog using Neural Module Networks Satwik Kottur José M. F. Moura Devi Parikh Dhruv Batra Marcus Rohrbach 186 168 0 06 Sep 2018
Interpretable Visual Question Answering by Reasoning on Dependency Trees Qingxing Cao Bailin Li Xiaodan Liang Liang Lin 176 56 0 06 Sep 2018
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering Medhini Narasimhan Alex Schwing 175 111 0 04 Sep 2018
RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes Semih Yagcioglu Aykut Erdem Erkut Erdem Nazli Ikizler-Cinbis CoGe 157 184 0 04 Sep 2018
The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers Dongxiang Zhang Lei Wang Nuo Xu B. Dai Heng Tao Shen ReLM AIMat 168 140 0 22 Aug 2018
SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference Rowan Zellers Yonatan Bisk Roy Schwartz Yejin Choi 386 757 0 16 Aug 2018
How Much Reading Does Reading Comprehension Require? A Critical Investigation of Popular Benchmarks Divyansh Kaushik Zachary Chase Lipton ELM 240 237 0 14 Aug 2018
Community Regularization of Visually-Grounded Dialog Akshat Agarwal Swaminathan Gurumurthy Vasu Sharma M. Lewis Katia Sycara 133 10 0 10 Aug 2018
A Joint Sequence Fusion Model for Video Question Answering and Retrieval Youngjae Yu Jongseok Kim Gunhee Kim 228 381 0 07 Aug 2018
Learning Visual Question Answering by Bootstrapping Hard Attention Mateusz Malinowski Carl Doersch Adam Santoro Peter W. Battaglia OOD 262 98 0 01 Aug 2018
Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining Yundong Zhang Juan Carlos Niebles Á. Soto 181 70 0 01 Aug 2018
Pythia v0.1: the Winning Entry to the VQA Challenge 2018 Yu Jiang Vivek Natarajan Xinlei Chen Marcus Rohrbach Dhruv Batra Devi Parikh VLM 297 207 0 26 Jul 2018
Explainable Neural Computation via Stack Neural Module Networks Ronghang Hu Jacob Andreas Trevor Darrell Kate Saenko LRM OCL 315 204 0 23 Jul 2018
Question Relevance in Visual Question Answering Prakruthi Prabhakar Nitish Kulkarni Linghao Zhang 90 7 0 23 Jul 2018
Dynamic Multimodal Instance Segmentation guided by natural language queries European Conference on Computer Vision (ECCV), 2018 Edgar Margffoy-Tuay Juan C. Pérez Emilio Botero Pablo Arbelaez 256 187 0 06 Jul 2018
Collaborative Annotation of Semantic Objects in Images with Multi-granularity Supervisions ACM Multimedia (ACM MM), 2018 Lishi Zhang Chenghan Fu Jia Li 110 8 0 27 Jun 2018
End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features Chiori Hori Huda AlAmri Jue Wang Gordon Wichern Takaaki Hori ... Raphael Gontijo-Lopes Abhishek Das Irfan Essa Dhruv Batra Devi Parikh VGen 199 130 0 21 Jun 2018
Learning Conditioned Graph Structures for Interpretable Visual Question Answering Will Norcliffe-Brown Efstathios Vafeias Sarah Parisot GNN 257 250 0 19 Jun 2018