arXiv:2205.11501
VQA-GNN: Reasoning with Multimodal Knowledge via Graph Neural Networks for Visual Question Answering
23 May 2022
Yanan Wang
Michihiro Yasunaga
Hongyu Ren
Shinya Wada
J. Leskovec
Papers citing
"VQA-GNN: Reasoning with Multimodal Knowledge via Graph Neural Networks for Visual Question Answering"
13 / 13 papers shown
Title
DeepMLF: Multimodal language model with learnable tokens for deep fusion in sentiment analysis
Efthymios Georgiou
V. Katsouros
Yannis Avrithis
Alexandros Potamianos
24
1
0
15 Apr 2025
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision
Xiaofeng Han
Shunpeng Chen
Zenghuang Fu
Zhe Feng
Lue Fan
...
Li Guo
Weiliang Meng
Xiaopeng Zhang
Rongtao Xu
Shibiao Xu
63
1
0
03 Apr 2025
Predicate Hierarchies Improve Few-Shot State Classification
Emily Jin
Joy Hsu
Jiajun Wu
OffRL
72
0
0
18 Feb 2025
PV-VTT: A Privacy-Centric Dataset for Mission-Specific Anomaly Detection and Natural Language Interpretation
Ryozo Masukawa
Sanggeon Yun
Yoshiki Yamaguchi
Mohsen Imani
23
0
0
30 Oct 2024
MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models
Yanting Wang
Hongye Fu
Wei Zou
Jinyuan Jia
AAML
23
1
0
28 Mar 2024
VCD: Knowledge Base Guided Visual Commonsense Discovery in Images
Xiangqing Shen
Yurun Song
Siwei Wu
Rui Xia
33
6
0
27 Feb 2024
Graph Neural Networks in Vision-Language Image Understanding: A Survey
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
23
13
0
07 Mar 2023
Coarse-to-Fine Reasoning for Visual Question Answering
Binh X. Nguyen
Tuong Khanh Long Do
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
A. Nguyen
NAI
62
36
0
06 Oct 2021
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Yumao Lu
Zicheng Liu
Lijuan Wang
169
402
0
10 Sep 2021
GraghVQA: Language-Guided Graph Neural Networks for Graph-based Visual Question Answering
Weixin Liang
Yanhao Jiang
Zixuan Liu
GNN
39
32
0
20 Apr 2021
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang
Xiujun Li
Xiaowei Hu
Jianwei Yang
Lei Zhang
Lijuan Wang
Yejin Choi
Jianfeng Gao
ObjD
VLM
252
157
0
02 Jan 2021
Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding
Qingxing Cao
Bailin Li
Xiaodan Liang
Keze Wang
Liang Lin
44
36
0
14 Dec 2020
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
144
1,465
0
06 Jun 2016