2401.05736
Cited By
Cross-modal Retrieval for Knowledge-based Visual Question Answering
11 January 2024
Paul Lerner, Olivier Ferret, C. Guinaudeau
Papers citing "Cross-modal Retrieval for Knowledge-based Visual Question Answering"
3 / 3 papers shown
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
Krishna Srinivasan, K. Raman, Jiecao Chen, Michael Bendersky, Marc Najork
VLM | 181 | 307 | 0 | 02 Mar 2021

Transformers in Vision: A Survey
Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, F. Khan, M. Shah
ViT | 219 | 2,404 | 0 | 04 Jan 2021

Image-embodied Knowledge Representation Learning
Ruobing Xie, Zhiyuan Liu, Huanbo Luan, Maosong Sun
122 | 211 | 0 | 22 Sep 2016