VQA: Visual Question Answering

3 May 2015

Devi Parikh

Papers citing "VQA: Visual Question Answering"

50 / 792 papers shown

Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering Aishwarya Agrawal Dhruv Batra Devi Parikh Aniruddha Kembhavi OOD 51 581 0 01 Dec 2017
HoME: a Household Multimodal Environment Simon Brodeur Ethan Perez Ankesh Anand Florian Golemo Luca Herranz-Celotti Florian Strub Jean Rouat Hugo Larochelle Aaron Courville LM&Ro 26 103 0 29 Nov 2017
Convolutional Image Captioning J. Aneja Aditya Deshpande A. Schwing VLM 23 359 0 24 Nov 2017
Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent Zhilin Yang Saizheng Zhang Jack Urbanek Will Feng Alexander H. Miller Arthur Szlam Douwe Kiela Jason Weston 23 25 0 21 Nov 2017
Adversarial Attacks Beyond the Image Space Xiaohui Zeng Chenxi Liu Yu-Siang Wang Weichao Qiu Lingxi Xie Yu-Wing Tai Chi-Keung Tang Alan Yuille AAML 25 145 0 20 Nov 2017
Crowdsourcing Question-Answer Meaning Representations Julian Michael Gabriel Stanovsky Luheng He Ido Dagan Luke Zettlemoyer 19 78 0 16 Nov 2017
Object Referring in Visual Scene with Spoken Language A. Vasudevan Dengxin Dai Luc Van Gool 29 18 0 10 Nov 2017
Active Learning for Visual Question Answering: An Empirical Study Xiaoyu Lin Devi Parikh 36 31 0 06 Nov 2017
Survey of Recent Advances in Visual Question Answering Supriya Pandhre Shagun Sodhani 8 14 0 24 Sep 2017
FiLM: Visual Reasoning with a General Conditioning Layer Ethan Perez Florian Strub H. D. Vries Vincent Dumoulin Aaron Courville FAtt AIMat OffRL AI4CE 70 2,144 0 22 Sep 2017
Visual Question Generation as Dual Task of Visual Question Answering Yikang Li Nan Duan Bolei Zhou Xiao Chu Wanli Ouyang Xiaogang Wang 29 165 0 21 Sep 2017
Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning Yang Xian Yingli Tian VLM 21 22 0 15 Sep 2017
Link the head to the "beak": Zero Shot Learning from Noisy Text Description at Part Precision Mohamed Elhoseiny Yizhe Zhu Han Zhang Ahmed Elgammal VLM 30 132 0 04 Sep 2017
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation Chuang Gan Yandong Li Haoxiang Li Chen Sun Boqing Gong 19 126 0 15 Aug 2017
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge Damien Teney Peter Anderson Xiaodong He A. Hengel 45 380 0 09 Aug 2017
Weakly Supervised Image Annotation and Segmentation with Objects and Attributes Zhiyuan Shi Yongxin Yang Timothy M. Hospedales Tao Xiang 11 46 0 08 Aug 2017
What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator? Marc Tanti Albert Gatt K. Camilleri 16 56 0 07 Aug 2017
Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering Zhou Yu Jun-chen Yu Jianping Fan Dacheng Tao 41 663 0 04 Aug 2017
Scene Graph Generation from Objects, Phrases and Region Captions Yikang Li Wanli Ouyang Bolei Zhou Kun Wang Xiaogang Wang 21 499 0 31 Jul 2017
Tensor Fusion Network for Multimodal Sentiment Analysis Amir Zadeh Minghai Chen Soujanya Poria Erik Cambria Louis-Philippe Morency 22 1,198 0 23 Jul 2017
DeepStory: Video Story QA by Deep Embedded Memory Networks Kyung-Min Kim Min-Oh Heo Seongho Choi Byoung-Tak Zhang 19 174 0 04 Jul 2017
Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model Jiasen Lu A. Kannan Jianwei Yang Devi Parikh Dhruv Batra BDL 15 136 0 05 Jun 2017
Multimodal Machine Learning: A Survey and Taxonomy T. Baltrušaitis Chaitanya Ahuja Louis-Philippe Morency 13 2,856 0 26 May 2017
Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning Q. Sun Stefan Lee Dhruv Batra BDL 25 43 0 24 May 2017
MUTAN: Multimodal Tucker Fusion for Visual Question Answering H. Ben-younes Rémi Cadène Matthieu Cord Nicolas Thome 44 578 0 18 May 2017
Combating Human Trafficking with Deep Multimodal Models Edmund Tong Amir Zadeh Cara Jones Louis-Philippe Morency 13 51 0 08 May 2017
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data Alexis Conneau Douwe Kiela Holger Schwenk Loïc Barrault Antoine Bordes AI4TS SSL 14 2,093 0 05 May 2017
Weakly-supervised Visual Grounding of Phrases with Linguistic Structures Fanyi Xiao Leonid Sigal Yong Jae Lee 19 138 0 03 May 2017
The Forgettable-Watcher Model for Video Question Answering Hongyang Xue Zhou Zhao Deng Cai 16 9 0 03 May 2017
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning Dipendra Kumar Misra John Langford Yoav Artzi 12 247 0 28 Apr 2017
An Analysis of Action Recognition Datasets for Language and Vision Tasks Spandana Gella Frank Keller ObjD 14 11 0 24 Apr 2017
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets Wei-Lun Chao Hexiang Hu Fei Sha 22 37 0 24 Apr 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering Y. Jang Yale Song Youngjae Yu Youngjin Kim Gunhee Kim 19 545 0 14 Apr 2017
Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR) Approach to Understanding Deep Neural Networks Devinder Kumar Alexander Wong Graham W. Taylor 21 59 0 13 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks Liwei Wang Yin Li Jing-ling Huang Svetlana Lazebnik VLM 27 494 0 11 Apr 2017
It Takes Two to Tango: Towards Theory of AI's Mind Arjun Chandrasekaran Deshraj Yadav Prithvijit Chattopadhyay Viraj Prabhu Devi Parikh 28 53 0 03 Apr 2017
Towards Building Large Scale Multimodal Domain-Aware Conversation Systems Amrita Saha Mitesh Khapra Karthik Sankaranarayanan 21 8 0 01 Apr 2017
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation Albert Gatt E. Krahmer LM&MA ELM 18 809 0 29 Mar 2017
An Analysis of Visual Question Answering Algorithms Kushal Kafle Christopher Kanan 19 230 0 28 Mar 2017
Recurrent Multimodal Interaction for Referring Image Segmentation Chenxi Liu Zhe-nan Lin Xiaohui Shen Jimei Yang Xin Lu Alan Yuille EgoV 36 234 0 23 Mar 2017
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning Abhishek Das Satwik Kottur J. M. F. Moura Stefan Lee Dhruv Batra OffRL 31 423 0 20 Mar 2017
Asymmetric Tri-training for Unsupervised Domain Adaptation Kuniaki Saito Yoshitaka Ushiku Tatsuya Harada 26 580 0 27 Feb 2017
Visual Translation Embedding Network for Visual Relation Detection Hanwang Zhang Zawlin Kyaw Shih-Fu Chang Tat-Seng Chua ViT 140 560 0 27 Feb 2017
Task-driven Visual Saliency and Attention-based Visual Question Answering Yuetan Lin Zhangyang Pang Donghui Wang Yueting Zhuang 27 26 0 22 Feb 2017
Person Search with Natural Language Description Shuang Li Tong Xiao Hongsheng Li Bolei Zhou Dayu Yue Xiaogang Wang 19 385 0 19 Feb 2017
Learning Visual N-Grams from Web Data Ang Li Allan Jabri Armand Joulin Laurens van der Maaten VLM 12 136 0 29 Dec 2016
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions Peng Wang Qi Wu Chunhua Shen A. Hengel OOD 18 86 0 16 Dec 2016
Attentive Explanations: Justifying Decisions and Pointing to the Evidence Dong Huk Park Lisa Anne Hendricks Zeynep Akata Bernt Schiele Trevor Darrell Marcus Rohrbach AAML 16 79 0 14 Dec 2016
ImageNet pre-trained models with batch normalization Marcel Simon E. Rodner Joachim Denzler VLM SSeg 30 165 0 05 Dec 2016
Who is Mistaken? Benjamin Eysenbach Carl Vondrick Antonio Torralba 19 15 0 04 Dec 2016