ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1505.00468
  4. Cited By
VQA: Visual Question Answering

VQA: Visual Question Answering

3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
    CoGe
ArXivPDFHTML

Papers citing "VQA: Visual Question Answering"

50 / 792 papers shown
Title
RL-CSDia: Representation Learning of Computer Science Diagrams
RL-CSDia: Representation Learning of Computer Science Diagrams
Shaowei Wang
LingLing Zhang
Xuan Luo
Yi Yang
Xin Hu
Jun Liu
3DV
13
2
0
10 Mar 2021
Selective Replay Enhances Learning in Online Continual Analogical
  Reasoning
Selective Replay Enhances Learning in Online Continual Analogical Reasoning
Tyler L. Hayes
Christopher Kanan
CLL
16
20
0
06 Mar 2021
Causal Attention for Vision-Language Tasks
Causal Attention for Vision-Language Tasks
Xu Yang
Hanwang Zhang
Guojun Qi
Jianfei Cai
CML
23
148
0
05 Mar 2021
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual
  Machine Learning
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
Krishna Srinivasan
K. Raman
Jiecao Chen
Michael Bendersky
Marc Najork
VLM
197
310
0
02 Mar 2021
MultiSubs: A Large-scale Multimodal and Multilingual Dataset
MultiSubs: A Large-scale Multimodal and Multilingual Dataset
Josiah Wang
Pranava Madhyastha
J. Figueiredo
Chiraag Lala
Lucia Specia
VGen
14
11
0
02 Mar 2021
Learning Reasoning Paths over Semantic Graphs for Video-grounded
  Dialogues
Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues
Hung Le
Nancy F. Chen
S. Hoi
31
14
0
01 Mar 2021
SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical
  Visual Question Answering
SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering
Bo Liu
Li-Ming Zhan
Li Xu
Lin Ma
Y. Yang
Xiao-Ming Wu
19
234
0
18 Feb 2021
A Metamodel and Framework for Artificial General Intelligence From
  Theory to Practice
A Metamodel and Framework for Artificial General Intelligence From Theory to Practice
Hugo Latapie
Özkan Kiliç
Gaowen Liu
Yan Yan
Ramana Rao Kompella
Pei Wang
K. Thórisson
Adam Lawrence
Yuhong Sun
Jayanth Srinivasa
AI4CE
20
9
0
11 Feb 2021
Answer Questions with Right Image Regions: A Visual Attention
  Regularization Approach
Answer Questions with Right Image Regions: A Visual Attention Regularization Approach
Y. Liu
Yangyang Guo
Jianhua Yin
Xuemeng Song
Weifeng Liu
Liqiang Nie
24
28
0
03 Feb 2021
Measuring and Improving Consistency in Pretrained Language Models
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
258
346
0
01 Feb 2021
Decoupling the Role of Data, Attention, and Losses in Multimodal
  Transformers
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
Lisa Anne Hendricks
John F. J. Mellor
R. Schneider
Jean-Baptiste Alayrac
Aida Nematzadeh
75
110
0
31 Jan 2021
Latent Variable Models for Visual Question Answering
Latent Variable Models for Visual Question Answering
Zixu Wang
Yishu Miao
Lucia Specia
25
5
0
16 Jan 2021
Explainability of deep vision-based autonomous driving systems: Review
  and challenges
Explainability of deep vision-based autonomous driving systems: Review and challenges
Éloi Zablocki
H. Ben-younes
P. Pérez
Matthieu Cord
XAI
32
169
0
13 Jan 2021
Transformers in Vision: A Survey
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
227
2,428
0
04 Jan 2021
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense
  Generation
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation
Yiran Xing
Z. Shi
Zhao Meng
Gerhard Lakemeyer
Yunpu Ma
Roger Wattenhofer
VLM
64
40
0
02 Jan 2021
OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Visual
  Contexts
OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Visual Contexts
Yuxian Meng
Shuhe Wang
Qinghong Han
Xiaofei Sun
Fei Wu
Rui Yan
Jiwei Li
16
28
0
30 Dec 2020
MELINDA: A Multimodal Dataset for Biomedical Experiment Method
  Classification
MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification
Te-Lin Wu
Shikhar Singh
S. Paul
Gully A. Burns
Nanyun Peng
22
18
0
16 Dec 2020
Look Before you Speak: Visually Contextualized Utterances
Look Before you Speak: Visually Contextualized Utterances
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
19
66
0
10 Dec 2020
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps
Qi Zhu
Chenyu Gao
Peng Wang
Qi Wu
12
54
0
09 Dec 2020
CASTing Your Model: Learning to Localize Improves Self-Supervised
  Representations
CASTing Your Model: Learning to Localize Improves Self-Supervised Representations
Ramprasaath R. Selvaraju
Karan Desai
Justin Johnson
Nikhil Naik
SSL
14
79
0
08 Dec 2020
WeaQA: Weak Supervision via Captions for Visual Question Answering
WeaQA: Weak Supervision via Captions for Visual Question Answering
Pratyay Banerjee
Tejas Gokhale
Yezhou Yang
Chitta Baral
17
34
0
04 Dec 2020
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework
  of Vision-and-Language BERTs
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs
Emanuele Bugliarello
Ryan Cotterell
Naoaki Okazaki
Desmond Elliott
24
119
0
30 Nov 2020
Multi-document Summarization via Deep Learning Techniques: A Survey
Multi-document Summarization via Deep Learning Techniques: A Survey
Congbo Ma
W. Zhang
Mingyu Guo
Hu Wang
Quan Z. Sheng
13
125
0
10 Nov 2020
An Improved Attention for Visual Question Answering
An Improved Attention for Visual Question Answering
Tanzila Rahman
Shih-Han Chou
Leonid Sigal
Giuseppe Carenini
13
42
0
04 Nov 2020
MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual
  Question Answering
MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual Question Answering
Aisha Urooj Khan
Amir Mazaheri
N. Lobo
M. Shah
24
56
0
27 Oct 2020
Deep Reinforcement Learning with Stacked Hierarchical Attention for
  Text-based Games
Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games
Yunqiu Xu
Meng Fang
Ling-Hao Chen
Yali Du
Joey Tianyi Zhou
Chengqi Zhang
OffRL
23
44
0
22 Oct 2020
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing
  Functional Entropies
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
Itai Gat
Idan Schwartz
A. Schwing
Tamir Hazan
53
89
0
21 Oct 2020
The Open Catalyst 2020 (OC20) Dataset and Community Challenges
The Open Catalyst 2020 (OC20) Dataset and Community Challenges
L. Chanussot
Abhishek Das
Siddharth Goyal
Thibaut Lavril
Muhammed Shuaibi
...
Brandon M. Wood
Junwoong Yoon
Devi Parikh
C. L. Zitnick
Zachary W. Ulissi
221
503
0
20 Oct 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei-Neng Chen
Weiping Wang
Li Liu
M. Lew
VLM
110
31
0
16 Oct 2020
Neural Databases
Neural Databases
James Thorne
Majid Yazdani
Marzieh Saeidi
Fabrizio Silvestri
Sebastian Riedel
A. Halevy
NAI
26
9
0
14 Oct 2020
Fine-Grained Grounding for Multimodal Speech Recognition
Fine-Grained Grounding for Multimodal Speech Recognition
Tejas Srinivasan
Ramon Sanabria
Florian Metze
Desmond Elliott
19
11
0
05 Oct 2020
Graph-based Heuristic Search for Module Selection Procedure in Neural
  Module Network
Graph-based Heuristic Search for Module Selection Procedure in Neural Module Network
Yuxuan Wu
Hideki Nakayama
GNN
23
3
0
30 Sep 2020
Trustworthy Convolutional Neural Networks: A Gradient Penalized-based
  Approach
Trustworthy Convolutional Neural Networks: A Gradient Penalized-based Approach
Nicholas F Halliwell
Freddy Lecue
FAtt
17
9
0
29 Sep 2020
Dataset Cartography: Mapping and Diagnosing Datasets with Training
  Dynamics
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
Swabha Swayamdipta
Roy Schwartz
Nicholas Lourie
Yizhong Wang
Hannaneh Hajishirzi
Noah A. Smith
Yejin Choi
30
429
0
22 Sep 2020
Commands 4 Autonomous Vehicles (C4AV) Workshop Summary
Commands 4 Autonomous Vehicles (C4AV) Workshop Summary
Thierry Deruyttere
Simon Vandenhende
Dusan Grujicic
Yu Liu
Luc Van Gool
Matthew Blaschko
Tinne Tuytelaars
Marie-Francine Moens
22
6
0
18 Sep 2020
Multi-Task Learning with Deep Neural Networks: A Survey
Multi-Task Learning with Deep Neural Networks: A Survey
M. Crawshaw
CVBM
13
608
0
10 Sep 2020
A Comparison of Pre-trained Vision-and-Language Models for Multimodal
  Representation Learning across Medical Images and Reports
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports
Yikuan Li
Hanyin Wang
Yuan Luo
6
63
0
03 Sep 2020
A Survey of Evaluation Metrics Used for NLG Systems
A Survey of Evaluation Metrics Used for NLG Systems
Ananya B. Sai
Akash Kumar Mohankumar
Mitesh M. Khapra
ELM
25
228
0
27 Aug 2020
Word meaning in minds and machines
Word meaning in minds and machines
Brenden Lake
G. Murphy
NAI
15
117
0
04 Aug 2020
AiR: Attention with Reasoning Capability
AiR: Attention with Reasoning Capability
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
13
36
0
28 Jul 2020
SBAT: Video Captioning with Sparse Boundary-Aware Transformer
SBAT: Video Captioning with Sparse Boundary-Aware Transformer
Tao Jin
Siyu Huang
Ming Chen
Yingming Li
Zhongfei Zhang
30
52
0
23 Jul 2020
Referring Expression Comprehension: A Survey of Methods and Datasets
Referring Expression Comprehension: A Survey of Methods and Datasets
Yanyuan Qiao
Chaorui Deng
Qi Wu
ObjD
42
93
0
19 Jul 2020
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal
  Shuffled Transformers
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers
Shijie Geng
Peng Gao
Moitreya Chatterjee
Chiori Hori
Jonathan Le Roux
Yongfeng Zhang
Hongsheng Li
A. Cherian
19
11
0
08 Jul 2020
Targeting the Benchmark: On Methodology in Current Natural Language
  Processing Research
Targeting the Benchmark: On Methodology in Current Natural Language Processing Research
David Schlangen
19
57
0
07 Jul 2020
Drug discovery with explainable artificial intelligence
Drug discovery with explainable artificial intelligence
José Jiménez-Luna
F. Grisoni
G. Schneider
25
625
0
01 Jul 2020
ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through
  Scene Graph
ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph
Fei Yu
Jiji Tang
Weichong Yin
Yu Sun
Hao Tian
Hua-Hong Wu
Haifeng Wang
11
375
0
30 Jun 2020
A causal view of compositional zero-shot recognition
A causal view of compositional zero-shot recognition
Y. Atzmon
Felix Kreuk
Uri Shalit
Gal Chechik
OCL
BDL
CML
47
117
0
25 Jun 2020
Recurrent Relational Memory Network for Unsupervised Image Captioning
Recurrent Relational Memory Network for Unsupervised Image Captioning
Dan Guo
Yang Wang
Peipei Song
Meng Wang
GAN
17
40
0
24 Jun 2020
A generalizable saliency map-based interpretation of model outcome
A generalizable saliency map-based interpretation of model outcome
Shailja Thakur
S. Fischmeister
AAML
FAtt
MILM
17
2
0
16 Jun 2020
A Study of Compositional Generalization in Neural Models
A Study of Compositional Generalization in Neural Models
Tim Klinger
D. Adjodah
Vincent Marois
Joshua Joseph
Matthew D Riemer
Alex Pentland
Murray Campbell
CoGe
NAI
10
12
0
16 Jun 2020
Previous
123...101112...141516
Next