Are Large Vision Language Models up to the Challenge of Chart Comprehension and Reasoning? An Extensive Investigation into the Capabilities and Limitations of LVLMs

1 June 2024

Papers citing "Are Large Vision Language Models up to the Challenge of Chart Comprehension and Reasoning? An Extensive Investigation into the Capabilities and Limitations of LVLMs"

2 / 2 papers shown

Title
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding Kenton Lee Mandar Joshi Iulia Turc Hexiang Hu Fangyu Liu Julian Martin Eisenschlos Urvashi Khandelwal Peter Shaw Ming-Wei Chang Kristina Toutanova CLIP VLM 148 259 0 07 Oct 2022
Unifying Vision-and-Language Tasks via Text Generation Jaemin Cho Jie Lei Hao Tan Mohit Bansal MLLM 249 518 0 04 Feb 2021