1606.05433
FVQA: Fact-based Visual Question Answering
17 June 2016
Peng Wang
Qi Wu
Chunhua Shen
Anton van den Hengel
A. Dick
CoGe
Papers citing
"FVQA: Fact-based Visual Question Answering"
50 / 241 papers shown
SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jielin Qiu
Andrea Madotto
Mohammad Kachuee
Paul A. Crook
Yongjun Xu
Xin Luna Dong
Christos Faloutsos
Lei Li
Babak Damavandi
Seungwhan Moon
229
16
0
07 Mar 2024
CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments
Savitha Sam Abraham
Marjan Alirezaie
Luc de Raedt
285
1
0
05 Mar 2024
Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment
Yunxin Li
Xinyu Chen
Baotian Hu
Haoyuan Shi
Min Zhang
184
7
0
21 Feb 2024
ConVQG: Contrastive Visual Question Generation with Multimodal Guidance
Li Mi
Syrielle Montariol
J. Castillo-Navarro
Xianjie Dai
Antoine Bosselut
D. Tuia
177
7
0
20 Feb 2024
Modality-Aware Integration with Large Language Models for Knowledge-based Visual Question Answering
Hao-Heng Chen
Qinggang Zhang
Huachi Zhou
Daochen Zha
Pai Zheng
Xiao Huang
231
19
0
20 Feb 2024
AI, Meet Human: Learning Paradigms for Hybrid Decision Making Systems
Clara Punzi
Roberto Pellungrini
Mattia Setzu
F. Giannotti
D. Pedreschi
389
8
0
09 Feb 2024
Knowledge Generation for Zero-shot Knowledge-based VQA
Rui Cao
Jing Jiang
129
9
0
04 Feb 2024
GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering
Ziyu Ma
Shutao Li
Bin Sun
Jianfei Cai
Zuxiang Long
Fuyan Ma
259
8
0
04 Feb 2024
Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge
Haibi Wang
Weifeng Ge
LRM
443
9
0
19 Jan 2024
BOK-VQA: Bilingual outside Knowledge-Based Visual Question Answering via Graph Representation Pretraining
AAAI Conference on Artificial Intelligence (AAAI), 2024
Minjun Kim
Seungwoo Song
Youhan Lee
Haneol Jang
Kyungtae Lim
207
9
0
12 Jan 2024
Multi-Clue Reasoning with Memory Augmentation for Knowledge-based Visual Question Answering
Chengxiang Yin
Zhengping Che
Kun Wu
Zhiyuan Xu
Jian Tang
186
1
0
20 Dec 2023
A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering
Yunxin Li
Longyue Wang
Baotian Hu
Xinyu Chen
Wanqi Zhong
Chenyang Lyu
Wei Wang
Min Zhang
ELM
208
26
0
13 Nov 2023
Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yichi Zhang
Zhuo Chen
Yin Fang
Yanxi Lu
Fangming Li
Wen Zhang
Hua-zeng Chen
347
50
0
11 Nov 2023
From Image to Language: A Critical Analysis of Visual Question Answering (VQA) Approaches, Challenges, and Opportunities
Information Fusion (Inf. Fusion), 2023
Md Farhan Ishmam
Md Sakib Hossain Shovon
M. F. Mridha
Nilanjan Dey
402
71
0
01 Nov 2023
A Simple Baseline for Knowledge-Based Visual Question Answering
Alexandros Xenos
Themos Stafylakis
Ioannis Patras
Georgios Tzimiropoulos
348
16
0
20 Oct 2023
UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Yanyang Guo
Fangkai Jiao
Zhiqi Shen
Liqiang Nie
Mohan S. Kankanhalli
MLLM
395
13
0
17 Oct 2023
Fine-grained Late-interaction Multi-modal Retrieval for Retrieval Augmented Visual Question Answering
Neural Information Processing Systems (NeurIPS), 2023
Weizhe Lin
Jinghong Chen
Jingbiao Mei
Alexandru Coca
Bill Byrne
280
72
0
29 Sep 2023
A Survey on Interpretable Cross-modal Reasoning
Dizhan Xue
Shengsheng Qian
Zuyi Zhou
Changsheng Xu
LRM
400
5
0
05 Sep 2023
CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection
IEEE International Conference on Computer Vision (ICCV), 2023
Jiajin Tang
Ge Zheng
Jingyi Yu
Sibei Yang
ObjD
221
39
0
03 Sep 2023
Diagnosing Human-object Interaction Detectors
International Journal of Computer Vision (IJCV), 2023
Fangrui Zhu
Yiming Xie
Weidi Xie
Huaizu Jiang
218
11
0
16 Aug 2023
Robust Visual Question Answering: Datasets, Methods, and Future Challenges
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Jie Ma
Pinghui Wang
Dechen Kong
Zewei Wang
Jun Liu
Hongbin Pei
Junzhou Zhao
OOD
333
45
0
21 Jul 2023
Pre-Training Multi-Modal Dense Retrievers for Outside-Knowledge Visual Question Answering
International Conference on the Theory of Information Retrieval (ICTIR), 2023
Alireza Salemi
Mahta Rafiee
Hamed Zamani
173
13
0
28 Jun 2023
Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories
IEEE International Conference on Computer Vision (ICCV), 2023
Thomas Mensink
J. Uijlings
Lluis Castrejon
A. Goel
Felipe Cadar
Howard Zhou
Fei Sha
A. Araújo
V. Ferrari
281
82
0
15 Jun 2023
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn
Difei Gao
Lei Ji
Luowei Zhou
Kevin Lin
Joya Chen
Zihan Fan
Mike Zheng Shou
MLLM
422
108
0
14 Jun 2023
End-to-end Knowledge Retrieval with Multi-modal Queries
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Man Luo
Zhiyuan Fang
Tejas Gokhale
Yezhou Yang
Chitta Baral
VLM
226
30
0
01 Jun 2023
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xingyu Fu
Shenmin Zhang
Gukyeong Kwon
Pramuditha Perera
Henghui Zhu
...
Zhiguo Wang
Vittorio Castelli
Patrick Ng
Dan Roth
Bing Xiang
198
31
0
30 May 2023
KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhiwei Jia
P. Narayana
Arjun Reddy Akula
G. Pruthi
Haoran Su
Sugato Basu
Varun Jampani
VLM
OffRL
217
7
0
28 May 2023
i-Code Studio: A Configurable and Composable Framework for Integrative AI
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yuwei Fang
Mahmoud Khademi
Chenguang Zhu
Ziyi Yang
Reid Pryzant
...
Yao Qian
Takuya Yoshioka
Lu Yuan
Michael Zeng
Xuedong Huang
185
2
0
23 May 2023
Visual Question Answering: A Survey on Techniques and Common Trends in Recent Literature
Ana Claudia Akemi Matsuki de Faria
Felype de Castro Bastos
Jose Victor Nogueira Alves da Silva
Vitor Lopes Fabris
Valeska Uchôa
Décio Gonçalves de Aguiar Neto
C. F. G. Santos
264
27
0
18 May 2023
Combo of Thinking and Observing for Outside-Knowledge VQA
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Q. Si
Yuchen Mo
Zheng Lin
Huishan Ji
Weiping Wang
181
20
0
10 May 2023
NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge
Phillip Howard
Junlin Wang
Vasudev Lal
Gadi Singer
Yejin Choi
Swabha Swayamdipta
185
11
0
08 May 2023
Visual Reasoning: from State to Transformation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Xin Hong
Yanyan Lan
Liang Pang
Jiafeng Guo
Xueqi Cheng
LRM
175
4
0
02 May 2023
A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Alireza Salemi
Juan Altmayer Pizzorno
Hamed Zamani
132
24
0
26 Apr 2023
FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering
Findings (Findings), 2023
Weizhe Lin
Zhilin Wang
Bill Byrne
AAML
191
6
0
19 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
The Visual Computer (TVC), 2023
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
323
32
0
07 Mar 2023
VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning
Computer Vision and Pattern Recognition (CVPR), 2023
Kan Chen
Xiangqian Wu
CoGe
167
19
0
05 Mar 2023
The Contribution of Knowledge in Visiolinguistic Learning: A Survey on Tasks and Challenges
Maria Lymperaiou
Giorgos Stamou
VLM
236
5
0
04 Mar 2023
Prophet: Prompting Large Language Models with Complementary Answer Heuristics for Knowledge-based Visual Question Answering
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Zhou Yu
Xuecheng Ouyang
Zhenwei Shao
Mei Wang
Jun Yu
MLLM
450
19
0
03 Mar 2023
Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yang Chen
Hexiang Hu
Yi Luan
Haitian Sun
Soravit Changpinyo
Alan Ritter
Ming-Wei Chang
614
150
0
23 Feb 2023
Oriented Object Detection in Optical Remote Sensing Images using Deep Learning: A Survey
Artificial Intelligence Review (AIR), 2023
Kunlin Wang
Zi Wang
Zhang Li
Ang Su
Xichao Teng
Minhao Liu
Qifeng Yu
ObjD
686
28
0
21 Feb 2023
Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey
Machine Intelligence Research (MIR), 2023
Tianlin Li
Guangyao Chen
Guangwu Qian
Pengcheng Gao
Xiaoyong Wei
Yaowei Wang
Yonghong Tian
Wen Gao
AI4CE
VLM
477
272
0
20 Feb 2023
Benchmarks for Automated Commonsense Reasoning: A Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
E. Davis
ELM
LRM
299
80
0
09 Feb 2023
BinaryVQA: A Versatile Test Set to Evaluate the Out-of-Distribution Generalization of VQA Models
Ali Borji
CoGe
140
2
0
28 Jan 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Songlin Yang
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
275
54
0
12 Jan 2023
VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges
R. Zakari
Jim Wilson Owusu
Hailin Wang
Ke Qin
Zaharaddeen Karami Lawal
Yue-hong Dong
LRM
183
18
0
26 Dec 2022
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory
Computer Vision and Pattern Recognition (CVPR), 2022
Ziniu Hu
Ahmet Iscen
Chen Sun
Zirui Wang
Kai-Wei Chang
Luke Huan
Cordelia Schmid
David A. Ross
Alireza Fathi
RALM
VLM
345
139
0
10 Dec 2022
Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles
Computer Vision and Pattern Recognition (CVPR), 2022
Shuquan Ye
Yujia Xie
Dongdong Chen
Yichong Xu
Lu Yuan
Chenguang Zhu
Jing Liao
VLM
136
18
0
29 Nov 2022
A survey on knowledge-enhanced multimodal learning
Artificial Intelligence Review (Artif Intell Rev), 2022
Maria Lymperaiou
Giorgos Stamou
475
23
0
19 Nov 2022
Towards Reasoning-Aware Explainable VQA
Rakesh Vaideeswaran
Feng Gao
Abhinav Mathur
Govind Thattai
LRM
202
4
0
09 Nov 2022
VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Sahithya Ravi
Aditya Chinchure
Leonid Sigal
Renjie Liao
Vered Shwartz
150
44
0
24 Oct 2022