FVQA: Fact-based Visual Question Answering

17 June 2016
Peng Wang
Qi Wu
Chunhua Shen
Anton van den Hengel
A. Dick
    CoGe

Papers citing "FVQA: Fact-based Visual Question Answering"

50 / 241 papers shown
SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jielin Qiu
Andrea Madotto
Mohammad Kachuee
Paul A. Crook
Yongjun Xu
Xin Luna Dong
Christos Faloutsos
Lei Li
Babak Damavandi
Seungwhan Moon
229
16
0
07 Mar 2024
CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments
Savitha Sam Abraham
Marjan Alirezaie
Luc de Raedt
285
1
0
05 Mar 2024
Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment
Yunxin Li
Xinyu Chen
Baotian Hu
Haoyuan Shi
Min Zhang
184
7
0
21 Feb 2024
ConVQG: Contrastive Visual Question Generation with Multimodal Guidance
Li Mi
Syrielle Montariol
J. Castillo-Navarro
Xianjie Dai
Antoine Bosselut
D. Tuia
177
7
0
20 Feb 2024
Modality-Aware Integration with Large Language Models for Knowledge-based Visual Question Answering
Hao-Heng Chen
Qinggang Zhang
Huachi Zhou
Daochen Zha
Pai Zheng
Xiao Huang
231
19
0
20 Feb 2024
AI, Meet Human: Learning Paradigms for Hybrid Decision Making Systems
Clara Punzi
Roberto Pellungrini
Mattia Setzu
F. Giannotti
D. Pedreschi
389
8
0
09 Feb 2024
Knowledge Generation for Zero-shot Knowledge-based VQA
Rui Cao
Jing Jiang
129
9
0
04 Feb 2024
GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering
Ziyu Ma
Shutao Li
Bin Sun
Jianfei Cai
Zuxiang Long
Fuyan Ma
259
8
0
04 Feb 2024
Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge
Haibi Wang
Weifeng Ge
LRM
443
9
0
19 Jan 2024
BOK-VQA: Bilingual outside Knowledge-Based Visual Question Answering via Graph Representation Pretraining
AAAI Conference on Artificial Intelligence (AAAI), 2024
Minjun Kim
Seungwoo Song
Youhan Lee
Haneol Jang
Kyungtae Lim
207
9
0
12 Jan 2024
Multi-Clue Reasoning with Memory Augmentation for Knowledge-based Visual Question Answering
Chengxiang Yin
Zhengping Che
Kun Wu
Zhiyuan Xu
Jian Tang
186
1
0
20 Dec 2023
A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering
Yunxin Li
Longyue Wang
Baotian Hu
Xinyu Chen
Wanqi Zhong
Chenyang Lyu
Wei Wang
Min Zhang
ELM
208
26
0
13 Nov 2023
Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yichi Zhang
Zhuo Chen
Yin Fang
Yanxi Lu
Fangming Li
Wen Zhang
Hua-zeng Chen
347
50
0
11 Nov 2023
From Image to Language: A Critical Analysis of Visual Question Answering (VQA) Approaches, Challenges, and Opportunities
Information Fusion (Inf. Fusion), 2023
Md Farhan Ishmam
Md Sakib Hossain Shovon
M. F. Mridha
Nilanjan Dey
402
71
0
01 Nov 2023
A Simple Baseline for Knowledge-Based Visual Question Answering
Alexandros Xenos
Themos Stafylakis
Ioannis Patras
Georgios Tzimiropoulos
348
16
0
20 Oct 2023
UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Yanyang Guo
Fangkai Jiao
Zhiqi Shen
Liqiang Nie
Mohan S. Kankanhalli
MLLM
395
13
0
17 Oct 2023
Fine-grained Late-interaction Multi-modal Retrieval for Retrieval Augmented Visual Question Answering
Neural Information Processing Systems (NeurIPS), 2023
Weizhe Lin
Jinghong Chen
Jingbiao Mei
Alexandru Coca
Bill Byrne
280
72
0
29 Sep 2023
A Survey on Interpretable Cross-modal Reasoning
Dizhan Xue
Shengsheng Qian
Zuyi Zhou
Changsheng Xu
LRM
400
5
0
05 Sep 2023
CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection
IEEE International Conference on Computer Vision (ICCV), 2023
Jiajin Tang
Ge Zheng
Jingyi Yu
Sibei Yang
ObjD
221
39
0
03 Sep 2023
Diagnosing Human-object Interaction Detectors
International Journal of Computer Vision (IJCV), 2023
Fangrui Zhu
Yiming Xie
Weidi Xie
Huaizu Jiang
218
11
0
16 Aug 2023
Robust Visual Question Answering: Datasets, Methods, and Future Challenges
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Jie Ma
Pinghui Wang
Dechen Kong
Zewei Wang
Jun Liu
Hongbin Pei
Junzhou Zhao
OOD
333
45
0
21 Jul 2023
Pre-Training Multi-Modal Dense Retrievers for Outside-Knowledge Visual Question Answering
International Conference on the Theory of Information Retrieval (ICTIR), 2023
Alireza Salemi
Mahta Rafiee
Hamed Zamani
173
13
0
28 Jun 2023
Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories
IEEE International Conference on Computer Vision (ICCV), 2023
Thomas Mensink
J. Uijlings
Lluis Castrejon
A. Goel
Felipe Cadar
Howard Zhou
Fei Sha
A. Araújo
V. Ferrari
281
82
0
15 Jun 2023
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn
Difei Gao
Lei Ji
Luowei Zhou
Kevin Lin
Joya Chen
Zihan Fan
Mike Zheng Shou
MLLM
422
108
0
14 Jun 2023
End-to-end Knowledge Retrieval with Multi-modal Queries
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Man Luo
Zhiyuan Fang
Tejas Gokhale
Yezhou Yang
Chitta Baral
VLM
226
30
0
01 Jun 2023
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xingyu Fu
Shenmin Zhang
Gukyeong Kwon
Pramuditha Perera
Henghui Zhu
...
Zhiguo Wang
Vittorio Castelli
Patrick Ng
Dan Roth
Bing Xiang
198
31
0
30 May 2023
KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhiwei Jia
P. Narayana
Arjun Reddy Akula
G. Pruthi
Haoran Su
Sugato Basu
Varun Jampani
VLM
OffRL
217
7
0
28 May 2023
i-Code Studio: A Configurable and Composable Framework for Integrative AI
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yuwei Fang
Mahmoud Khademi
Chenguang Zhu
Ziyi Yang
Reid Pryzant
...
Yao Qian
Takuya Yoshioka
Lu Yuan
Michael Zeng
Xuedong Huang
185
2
0
23 May 2023
Visual Question Answering: A Survey on Techniques and Common Trends in Recent Literature
Ana Claudia Akemi Matsuki de Faria
Felype de Castro Bastos
Jose Victor Nogueira Alves da Silva
Vitor Lopes Fabris
Valeska Uchôa
Décio Gonçalves de Aguiar Neto
C. F. G. Santos
264
27
0
18 May 2023
Combo of Thinking and Observing for Outside-Knowledge VQA
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Q. Si
Yuchen Mo
Zheng Lin
Huishan Ji
Weiping Wang
181
20
0
10 May 2023
NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge
Phillip Howard
Junlin Wang
Vasudev Lal
Gadi Singer
Yejin Choi
Swabha Swayamdipta
185
11
0
08 May 2023
Visual Reasoning: from State to Transformation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Xin Hong
Yanyan Lan
Liang Pang
Jiafeng Guo
Xueqi Cheng
LRM
175
4
0
02 May 2023
A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Alireza Salemi
Juan Altmayer Pizzorno
Hamed Zamani
132
24
0
26 Apr 2023
FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering
Findings (Findings), 2023
Weizhe Lin
Zhilin Wang
Bill Byrne
AAML
191
6
0
19 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
The Visual Computer (TVC), 2023
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
323
32
0
07 Mar 2023
VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning
Computer Vision and Pattern Recognition (CVPR), 2023
Kan Chen
Xiangqian Wu
CoGe
167
19
0
05 Mar 2023
The Contribution of Knowledge in Visiolinguistic Learning: A Survey on Tasks and Challenges
Maria Lymperaiou
Giorgos Stamou
VLM
236
5
0
04 Mar 2023
Prophet: Prompting Large Language Models with Complementary Answer Heuristics for Knowledge-based Visual Question Answering
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Zhou Yu
Xuecheng Ouyang
Zhenwei Shao
Mei Wang
Jun Yu
MLLM
450
19
0
03 Mar 2023
Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yang Chen
Hexiang Hu
Yi Luan
Haitian Sun
Soravit Changpinyo
Alan Ritter
Ming-Wei Chang
614
150
0
23 Feb 2023
Oriented Object Detection in Optical Remote Sensing Images using Deep Learning: A Survey
Artificial Intelligence Review (AIR), 2023
Kunlin Wang
Zi Wang
Zhang Li
Ang Su
Xichao Teng
Minhao Liu
Qifeng Yu
ObjD
686
28
0
21 Feb 2023
Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey
Machine Intelligence Research (MIR), 2023
Tianlin Li
Guangyao Chen
Guangwu Qian
Pengcheng Gao
Xiaoyong Wei
Yaowei Wang
Yonghong Tian
Wen Gao
AI4CE
VLM
477
272
0
20 Feb 2023
Benchmarks for Automated Commonsense Reasoning: A Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
E. Davis
ELM
LRM
299
80
0
09 Feb 2023
BinaryVQA: A Versatile Test Set to Evaluate the Out-of-Distribution Generalization of VQA Models
Ali Borji
CoGe
140
2
0
28 Jan 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Songlin Yang
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
275
54
0
12 Jan 2023
VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges
R. Zakari
Jim Wilson Owusu
Hailin Wang
Ke Qin
Zaharaddeen Karami Lawal
Yue-hong Dong
LRM
183
18
0
26 Dec 2022
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory
Computer Vision and Pattern Recognition (CVPR), 2022
Ziniu Hu
Ahmet Iscen
Chen Sun
Zirui Wang
Kai-Wei Chang
Luke Huan
Cordelia Schmid
David A. Ross
Alireza Fathi
RALM
VLM
345
139
0
10 Dec 2022
Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles
Computer Vision and Pattern Recognition (CVPR), 2022
Shuquan Ye
Yujia Xie
Dongdong Chen
Yichong Xu
Lu Yuan
Chenguang Zhu
Jing Liao
VLM
136
18
0
29 Nov 2022
A survey on knowledge-enhanced multimodal learning
Artificial Intelligence Review (Artif Intell Rev), 2022
Maria Lymperaiou
Giorgos Stamou
475
23
0
19 Nov 2022
Towards Reasoning-Aware Explainable VQA
Rakesh Vaideeswaran
Feng Gao
Abhinav Mathur
Govind Thattai
LRM
202
4
0
09 Nov 2022
VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Sahithya Ravi
Aditya Chinchure
Leonid Sigal
Renjie Liao
Vered Shwartz
150
44
0
24 Oct 2022