arXiv:2404.01548
mChartQA: A universal benchmark for multimodal Chart Question Answer based on Vision-Language Alignment and Reasoning
2 April 2024
Jingxuan Wei, Nan Xu, Guiyong Chang, Yin Luo, Bihui Yu, Ruifeng Guo
Papers citing "mChartQA: A universal benchmark for multimodal Chart Question Answer based on Vision-Language Alignment and Reasoning" (2 papers)
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
Kenton Lee, Mandar Joshi, Iulia Turc, Hexiang Hu, Fangyu Liu, Julian Martin Eisenschlos, Urvashi Khandelwal, Peter Shaw, Ming-Wei Chang, Kristina Toutanova
Tags: CLIP, VLM · 07 Oct 2022

MUFASA: Multimodal Fusion Architecture Search for Electronic Health Records
Zhen Xu, David R. So, Andrew M. Dai
Tags: Mamba · 03 Feb 2021