1606.05433
FVQA: Fact-based Visual Question Answering
17 June 2016
Peng Wang
Qi Wu
Chunhua Shen
Anton van den Hengel
A. Dick
CoGe
Papers citing "FVQA: Fact-based Visual Question Answering"
50 / 225 papers shown
UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models
Yangyang Guo
Fangkai Jiao
Zhiqi Shen
Liqiang Nie
Mohan S. Kankanhalli
MLLM
79
7
0
17 Oct 2023
Fine-grained Late-interaction Multi-modal Retrieval for Retrieval Augmented Visual Question Answering
Weizhe Lin
Jinghong Chen
Jingbiao Mei
Alexandru Coca
Bill Byrne
58
37
0
29 Sep 2023
A Survey on Interpretable Cross-modal Reasoning
Dizhan Xue
Shengsheng Qian
Zuyi Zhou
Changsheng Xu
LRM
94
4
0
05 Sep 2023
CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection
Jiajin Tang
Ge Zheng
Jingyi Yu
Sibei Yang
ObjD
75
22
0
03 Sep 2023
Diagnosing Human-object Interaction Detectors
Fangrui Zhu
Yiming Xie
Weidi Xie
Huaizu Jiang
70
8
0
16 Aug 2023
Robust Visual Question Answering: Datasets, Methods, and Future Challenges
Jie Ma
Pinghui Wang
Dechen Kong
Zewei Wang
Jun Liu
Hongbin Pei
Junzhou Zhao
OOD
126
22
0
21 Jul 2023
Pre-Training Multi-Modal Dense Retrievers for Outside-Knowledge Visual Question Answering
Alireza Salemi
Mahta Rafiee
Hamed Zamani
69
10
0
28 Jun 2023
Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories
Thomas Mensink
J. Uijlings
Lluis Castrejon
A. Goel
Felipe Cadar
Howard Zhou
Fei Sha
A. Araújo
V. Ferrari
83
44
0
15 Jun 2023
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn
Difei Gao
Lei Ji
Luowei Zhou
Kevin Lin
Joya Chen
Zihan Fan
Mike Zheng Shou
MLLM
88
76
0
14 Jun 2023
End-to-end Knowledge Retrieval with Multi-modal Queries
Man Luo
Zhiyuan Fang
Tejas Gokhale
Yezhou Yang
Chitta Baral
VLM
70
19
0
01 Jun 2023
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge
Xingyu Fu
Shenmin Zhang
Gukyeong Kwon
Pramuditha Perera
Henghui Zhu
...
Zhiguo Wang
Vittorio Castelli
Patrick Ng
Dan Roth
Bing Xiang
73
22
0
30 May 2023
KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models
Zhiwei Jia
P. Narayana
Arjun Reddy Akula
G. Pruthi
Haoran Su
Sugato Basu
Varun Jampani
VLM
OffRL
79
4
0
28 May 2023
i-Code Studio: A Configurable and Composable Framework for Integrative AI
Yuwei Fang
Mahmoud Khademi
Chenguang Zhu
Ziyi Yang
Reid Pryzant
...
Yao Qian
Takuya Yoshioka
Lu Yuan
Michael Zeng
Xuedong Huang
74
2
0
23 May 2023
Visual Question Answering: A Survey on Techniques and Common Trends in Recent Literature
Ana Claudia Akemi Matsuki de Faria
Felype de Castro Bastos
Jose Victor Nogueira Alves da Silva
Vitor Lopes Fabris
Valeska Uchôa
Décio Gonçalves de Aguiar Neto
C. F. G. Santos
68
27
0
18 May 2023
Combo of Thinking and Observing for Outside-Knowledge VQA
Q. Si
Yuchen Mo
Zheng Lin
Huishan Ji
Weiping Wang
95
14
0
10 May 2023
NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge
Phillip Howard
Junlin Wang
Vasudev Lal
Gadi Singer
Yejin Choi
Swabha Swayamdipta
91
9
0
08 May 2023
Visual Reasoning: from State to Transformation
Xin Hong
Yanyan Lan
Liang Pang
Jiafeng Guo
Xueqi Cheng
LRM
48
4
0
02 May 2023
A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering
Alireza Salemi
Juan Altmayer Pizzorno
Hamed Zamani
38
15
0
26 Apr 2023
FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering
Weizhe Lin
Zhilin Wang
Bill Byrne
AAML
110
4
0
19 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
74
21
0
07 Mar 2023
VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning
Kan Chen
Xiangqian Wu
CoGe
52
9
0
05 Mar 2023
The Contribution of Knowledge in Visiolinguistic Learning: A Survey on Tasks and Challenges
Maria Lymperaiou
Giorgos Stamou
VLM
89
4
0
04 Mar 2023
Prophet: Prompting Large Language Models with Complementary Answer Heuristics for Knowledge-based Visual Question Answering
Zhou Yu
Xuecheng Ouyang
Zhenwei Shao
Mei Wang
Jun Yu
MLLM
178
11
0
03 Mar 2023
Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?
Yang Chen
Hexiang Hu
Yi Luan
Haitian Sun
Soravit Changpinyo
Alan Ritter
Ming-Wei Chang
126
94
0
23 Feb 2023
Oriented Object Detection in Optical Remote Sensing Images using Deep Learning: A Survey
Kunlin Wang
Zi Wang
Zhang Li
Ang Su
Xichao Teng
Minhao Liu
Qifeng Yu
ObjD
205
9
0
21 Feb 2023
Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey
Tianlin Li
Guangyao Chen
Guangwu Qian
Pengcheng Gao
Xiaoyong Wei
Yaowei Wang
Yonghong Tian
Wen Gao
AI4CE
VLM
139
213
0
20 Feb 2023
Benchmarks for Automated Commonsense Reasoning: A Survey
E. Davis
ELM
LRM
90
63
0
09 Feb 2023
BinaryVQA: A Versatile Test Set to Evaluate the Out-of-Distribution Generalization of VQA Models
Ali Borji
CoGe
40
1
0
28 Jan 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Songlin Yang
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
99
41
0
12 Jan 2023
VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges
R. Zakari
Jim Wilson Owusu
Hailin Wang
Ke Qin
Zaharaddeen Karami Lawal
Yue-hong Dong
LRM
61
16
0
26 Dec 2022
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory
Ziniu Hu
Ahmet Iscen
Chen Sun
Zirui Wang
Kai-Wei Chang
Yizhou Sun
Cordelia Schmid
David A. Ross
Alireza Fathi
RALM
VLM
98
96
0
10 Dec 2022
Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles
Shuquan Ye
Yujia Xie
Dongdong Chen
Yichong Xu
Lu Yuan
Chenguang Zhu
Jing Liao
VLM
66
12
0
29 Nov 2022
A survey on knowledge-enhanced multimodal learning
Maria Lymperaiou
Giorgos Stamou
153
15
0
19 Nov 2022
Towards Reasoning-Aware Explainable VQA
Rakesh Vaideeswaran
Feng Gao
Abhinav Mathur
Govind Thattai
LRM
73
3
0
09 Nov 2022
VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge
Sahithya Ravi
Aditya Chinchure
Leonid Sigal
Renjie Liao
Vered Shwartz
66
29
0
24 Oct 2022
Entity-Focused Dense Passage Retrieval for Outside-Knowledge Visual Question Answering
Jialin Wu
Raymond J. Mooney
RALM
131
11
0
18 Oct 2022
COFAR: Commonsense and Factual Reasoning in Image Search
Prajwal Gatti
A. S. Penamakuri
Revant Teotia
Anand Mishra
Shubhashis Sengupta
Roshni Ramnani
ReLM
LRM
32
4
0
16 Oct 2022
TransAlign: Fully Automatic and Effective Entity Alignment for Knowledge Graphs
Rui Zhang
Xiaoyan Zhao
Bayu Distiawan Trisedya
Min Yang
Hong Cheng
Jianzhong Qi
51
0
0
16 Oct 2022
Retrieval Augmented Visual Question Answering with Outside Knowledge
Weizhe Lin
Bill Byrne
RALM
105
77
0
07 Oct 2022
A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective
Chaoqi Chen
Yushuang Wu
Qiyuan Dai
Hong-Yu Zhou
Mutian Xu
Sibei Yang
Xiaoguang Han
Yizhou Yu
ViT
MedIm
AI4CE
137
79
0
27 Sep 2022
CLEVR-Math: A Dataset for Compositional Language, Visual and Mathematical Reasoning
Adam Dahlgren Lindström
Savitha Sam Abraham
55
58
0
10 Aug 2022
LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection
Zhuo Chen
Yufen Huang
Jiaoyan Chen
Yuxia Geng
Yin Fang
Jeff Z. Pan
Ningyu Zhang
Wen Zhang
89
37
0
26 Jul 2022
Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem
Yudong Han
Liqiang Nie
Jianhua Yin
Jianlong Wu
Yan Yan
80
14
0
24 Jul 2022
Semantic-aware Modular Capsule Routing for Visual Question Answering
Yudong Han
Jianhua Yin
Jianlong Wu
Yin-wei Wei
Liqiang Nie
57
7
0
21 Jul 2022
A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA
Yangyang Guo
Liqiang Nie
Yongkang Wong
Yebin Liu
Zhiyong Cheng
Mohan S. Kankanhalli
119
40
0
30 Jun 2022
cViL: Cross-Lingual Training of Vision-Language Models using Knowledge Distillation
Kshitij Gupta
Devansh Gautam
R. Mamidi
VLM
70
4
0
07 Jun 2022
A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge
Dustin Schwenk
Apoorv Khandelwal
Christopher Clark
Kenneth Marino
Roozbeh Mottaghi
74
555
0
03 Jun 2022
REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering
Yuanze Lin
Yujia Xie
Dongdong Chen
Yichong Xu
Chenguang Zhu
Lu Yuan
86
74
0
02 Jun 2022
TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages
Zihan Zhao
Lu Chen
Ruisheng Cao
Hongshen Xu
Xingyu Chen
Kai Yu
83
9
0
13 May 2022
Hypergraph Transformer: Weakly-supervised Multi-hop Reasoning for Knowledge-based Visual Question Answering
Y. Heo
Eun-Sol Kim
Woo Suk Choi
Byoung-Tak Zhang
62
28
0
22 Apr 2022