CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs

26 June 2024
Zirui Wang
Mengzhou Xia
Luxi He
Howard Chen
Yitao Liu
Richard Zhu
Kaiqu Liang
Xindi Wu
Haotian Liu
Sadhika Malladi
Alexis Chevalier
Sanjeev Arora
Danqi Chen

Papers citing "CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs"

50 / 112 papers shown
VACoT: Rethinking Visual Data Augmentation with VLMs
Zhengzhuo Xu
Chong Sun
Sinan Du
Chen Li
Jing Lyu
Chun Yuan
117
2
0
02 Dec 2025
See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models
Le Thien Phuc Nguyen
Zhuoran Yu
Samuel Low Yu Hang
Subin An
J. Lee
...
SeungEun Chung
Thanh-Huy Nguyen
JuWan Maeng
Soochahn Lee
Yong Jae Lee
AuLLM VLM
232
6
0
01 Dec 2025
ChartPoint: Guiding MLLMs with Grounding Reflection for Chart Reasoning
Zhengzhuo Xu
Sinan Du
Yiyan Qi
Siwen Lu
Chengjin Xu
Chun Yuan
Jian Guo
LRM
322
2
0
29 Nov 2025
Qwen3-VL Technical Report
Shuai Bai
Yuxuan Cai
Ruizhe Chen
Keqin Chen
Xionghui Chen
...
Jingren Zhou
F. I. S. Kevin Zhou
J. Zhou
Yuanzhi Zhu
Ke Zhu
VLM
2.2K
446
0
26 Nov 2025
CodeV: Code with Images for Faithful Visual Reasoning via Tool-Aware Policy Optimization
X. Hou
Shaoyuan Xu
Manan Biyani
Mayan Li
Jia-Wei Liu
Todd C. Hollon
Bryan Wang
188
4
0
24 Nov 2025
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
Kaichen Zhang
Keming Wu
Zuhao Yang
Kairui Hu
Bin Wang
Ziwei Liu
X. Li
Xingxuan Li
Lidong Bing
OffRL ReLM LRM VLM
304
12
0
20 Nov 2025
Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data
Yunxin Li
Xinyu Chen
Shenyuan Jiang
Haoyuan Shi
Zhenyu Liu
...
Zhenran Xu
Yicheng Ma
Meishan Zhang
Baotian Hu
Min Zhang
MLLM MoE OSLM VLM
729
9
0
16 Nov 2025
DeepEyesV2: Toward Agentic Multimodal Model
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025
Jack Hong
Chenxiao Zhao
ChengLin Zhu
Weiheng Lu
Guohai Xu
Xing Yu
181
35
0
07 Nov 2025
V-Thinker: Interactive Thinking with Images
Runqi Qiao
Qiuna Tan
Minghan Yang
Guanting Dong
Peiqing Yang
...
Lan Yang
Chong Sun
Chen Li
Honggang Zhang
MLLM LRM
496
6
0
06 Nov 2025
NVIDIA Nemotron Nano V2 VL
Nvidia
Amala Sanjay Deshmukh
Kateryna Chumachenko
Tuomas Rintamaki
Matthieu Le
...
Krzysztof Pawelec
Michael Evans
Katherine Luna
Jie Lou
Erick Galinkin
VLM
401
5
0
06 Nov 2025
ChartM$^3$: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart Comprehension
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Duo Xu
Hao Cheng
Xin Lin
Zhen Xie
Hao Wang
LRM
200
0
0
04 Nov 2025
The Ouroboros of Benchmarking: Reasoning Evaluation in an Era of Saturation
İbrahim Ethem Deveci
Duygu Ataman
ReLM ALM ELM LRM
274
1
0
03 Nov 2025
TIR-Bench: A Comprehensive Benchmark for Agentic Thinking-with-Images Reasoning
Ming Li
Jike Zhong
Shitian Zhao
H. Zhang
Shaoheng Lin
Yuxiang Lai
Chen Wei
Konstantinos Psounis
Kaipeng Zhang
EGVM LRM VLM
573
9
0
03 Nov 2025
ConnectomeBench: Can LLMs Proofread the Connectome?
Jeff Brown
Andrew Kirjner
Annika Vivekananthan
Ed Boyden
MLLM
176
1
0
31 Oct 2025
ChartAB: A Benchmark for Chart Grounding & Dense Alignment
Aniruddh Bansal
Davit Soselia
Dang Nguyen
Tianyi Zhou
237
0
0
30 Oct 2025
A Survey of AI Scientists
Guiyao Tie
P. Zhou
Lichao Sun
AI4TS
458
3
0
27 Oct 2025
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence
Qiushi Sun
Jingyang Gong
Yang Liu
Qiaosheng Chen
Lei Li
Kai Chen
Qipeng Guo
B. Kao
Fei Yuan
228
2
0
27 Oct 2025
A Coherence-Based Measure of AGI
Fares Fourati
164
0
0
23 Oct 2025
Structured and Abstractive Reasoning on Multi-modal Relational Knowledge Images
Yichi Zhang
Zhuo Chen
Lingbing Guo
Lei Liang
Wen Zhang
H. Chen
170
0
0
22 Oct 2025
UNO-Bench: A Unified Benchmark for Exploring the Compositional Law Between Uni-modal and Omni-modal in Omni Models
Chen Chen
ZeYang Hu
Fengjiao Chen
Liya Ma
Jiaxing Liu
Xiaoyu Li
Xuezhi Cao
Xunliang Cai
226
0
0
21 Oct 2025
Res-Bench: Benchmarking the Robustness of Multimodal Large Language Models to Dynamic Resolution Input
Chenxu Li
Zhicai Wang
Yuan Sheng
Xingyu Zhu
Y. Hao
Xiang Wang
AAML
273
1
0
19 Oct 2025
MultiVerse: A Multi-Turn Conversation Benchmark for Evaluating Large Vision and Language Models
Young-Jun Lee
Byung-Kwan Lee
Jianshu Zhang
Yechan Hwang
ByungSoo Ko
...
Xuankun Rong
Eojin Joo
Seung-Ho Han
Bowon Ko
Ho-Jin Choi
LRM
182
8
0
18 Oct 2025
Composition-Grounded Data Synthesis for Visual Reasoning
Xinyi Gu
Jiayuan Mao
Zhang-Wei Hong
Zhuoran Yu
Pengyuan Li
Dhiraj Joshi
Rogerio Feris
Zexue He
ReLM LRM
177
0
0
16 Oct 2025
RECODE: Reasoning Through Code Generation for Visual Question Answering
Junhong Shen
Mu Cai
Bo Hu
Ameet Talwalkar
David A. Ross
Cordelia Schmid
Alireza Fathi
ReLM LRM
198
0
0
15 Oct 2025
Beyond Seeing: Evaluating Multimodal LLMs on Tool-Enabled Image Perception, Transformation, and Reasoning
Xingang Guo
Utkarsh Tyagi
Advait Gosai
Paula Vergara
Ernesto Gabriel Hernández Montoya
...
Bin Hu
Yunzhong He
Bing Liu
Rakshith S Srinivasa
VLM LRM
365
8
0
14 Oct 2025
A Survey on Agentic Multimodal Large Language Models
Huanjin Yao
Ruifei Zhang
Jiaxing Huang
Jingyi Zhang
Yibo Wang
...
Ruolin Zhu
Yongcheng Jing
Shunyu Liu
Guanbin Li
Dacheng Tao
LM&Ro AIFin AI4TS LRM AI4CE
302
12
0
13 Oct 2025
Towards Efficient Multimodal Unified Reasoning Model via Model Merging
Qixiang Yin
Huanjin Yao
Jianghao Chen
Jiaxing Huang
Z. Zhao
Fei Su
LRM MoMe
355
1
0
10 Oct 2025
ARES: Multimodal Adaptive Reasoning via Difficulty-Aware Token-Level Entropy Shaping
Shuang Chen
Yue Guo
Yimeng Ye
Shijue Huang
Wenbo Hu
Haoxi Li
Manyuan Zhang
Jiayu Chen
Song Guo
Nanyun Peng
LRM
202
9
0
09 Oct 2025
ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models
Zhangyue Yin
Qiushi Sun
Zhiyuan Zeng
Zhiyuan Yu
Zengfeng Huang
Xuanjing Huang
Xipeng Qiu
LRM
150
0
0
07 Oct 2025
Large Language Models Achieve Gold Medal Performance at the International Olympiad on Astronomy & Astrophysics (IOAA)
Lucas Carrit Delgado Pinheiro
Ziru Chen
Bruno Caixeta Piazza
Ness B. Shroff
Yingbin Liang
Yuan-Sen Ting
Huan Sun
ALM ELM LRM
173
1
0
06 Oct 2025
ContextNav: Towards Agentic Multimodal In-Context Learning
Honghao Fu
Yuan Ouyang
Kai-Wei Chang
Yiwei Wang
Zi Huang
Yujun Cai
208
1
0
06 Oct 2025
Factuality Matters: When Image Generation and Editing Meet Structured Visuals
Le Zhuo
Songhao Han
Yuandong Pu
Boxiang Qiu
Sayak Paul
...
Yihao Liu
Jie Shao
Xi Chen
Si Liu
Hongsheng Li
EGVM
281
8
0
06 Oct 2025
RefineShot: Rethinking Cinematography Understanding with Foundational Skill Evaluation
Hang Wu
Yujun Cai
Haonan Ge
H. Chen
Ming-Hsuan Yang
Yiwei Wang
CoGe
200
3
0
02 Oct 2025
What MLLMs Learn about When they Learn about Multimodal Reasoning: Perception, Reasoning, or their Integration?
Jiwan Chung
Neel Joshi
Pratyusha Sharma
Youngjae Yu
Vibhav Vineet
LRM
233
3
0
02 Oct 2025
PixelCraft: A Multi-Agent System for High-Fidelity Visual Reasoning on Structured Images
Shuoshuo Zhang
Zijian Li
Yizhen Zhang
Jingjing Fu
Lei Song
Jiang Bian
Jun Zhang
Y. Yang
Rui Wang
LRM
192
6
0
29 Sep 2025
AstroMMBench: A Benchmark for Evaluating Multimodal Large Language Models Capabilities in Astronomy
Jinghang Shi
Xiao Yu Tang
Yang Huang
Yuyang Li
Xiaokong
Yanxia Zhang
Caizhan Yue
230
1
0
29 Sep 2025
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training
Xiang An
Yin Xie
Kaicheng Yang
Wenkang Zhang
X. Zhao
...
Ziyong Feng
Ziwei Liu
Bo Li
Jiankang Deng
MLLM VLM SyDa
444
100
0
28 Sep 2025
CoFFT: Chain of Foresight-Focus Thought for Visual Language Models
Xinyu Zhang
Yuxuan Dong
L. Zhang
Chengyou Jia
Zhuohang Dang
Basura Fernando
Jun Liu
Mike Zheng Shou
LRM
347
1
0
26 Sep 2025
Chimera: Diagnosing Shortcut Learning in Visual-Language Understanding
Ziheng Chi
Yifan Hou
Chenxi Pang
Shaobo Cui
Mubashara Akhtar
Mrinmaya Sachan
175
0
0
26 Sep 2025
CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition
Sina J. Semnani
Han Zhang
Xinyan He
Merve Tekgürler
Monica S. Lam
3DV
187
1
0
24 Sep 2025
OmniBridge: Unified Multimodal Understanding, Generation, and Retrieval via Latent Space Alignment
Teng Xiao
Zuchao Li
Lefei Zhang
302
1
0
23 Sep 2025
Losing the Plot: How VLM responses degrade on imperfect charts
P. W. Shin
Jack Sampson
Vijaykrishnan Narayanan
Andres Marquez
Mahantesh Halappanavar
136
1
0
22 Sep 2025
Visual Programmability: A Guide for Code-as-Thought in Chart Understanding
Bohao Tang
Yan Ma
Fei Zhang
Jiadi Su
Ethan Chern
Zhulin Hu
Zhixin Wang
Pengfei Liu
Ya Zhang
LRM
197
0
0
11 Sep 2025
A Comprehensive Survey on Trustworthiness in Reasoning with Large Language Models
Yanbo Wang
Yongcan Yu
Jian Liang
Ran He
HILM LRM
240
13
0
04 Sep 2025
R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning
Qi Yang
Bolin Ni
Shiming Xiang
Han Hu
Houwen Peng
Jie Jiang
LRM
316
8
0
28 Aug 2025
Do MLLMs Really Understand the Charts?
Xiao-Yu Zhang
Dongyuan Li
Liuyu Xiang
Yao Zhang
Cheng Zhong
Zhaofeng He
LRM
222
2
0
27 Aug 2025
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Weiyun Wang
Zhangwei Gao
Lixin Gu
Hengjun Pu
Long Cui
...
Bowen Zhou
Kai Chen
Yu Qiao
Wenhai Wang
Gen Luo
MLLM LRM
361
525
0
25 Aug 2025
DashboardQA: Benchmarking Multimodal Agents for Question Answering on Interactive Dashboards
Aaryaman Kartha
Ahmed Masry
Mohammed Saidul Islam
Thinh Lang
Shadikur Rahman
...
Mizanur Rahman
Mahir Ahmed
Md. Rizwan Parvez
Enamul Hoque
Shafiq Joty
110
0
0
24 Aug 2025
XFinBench: Benchmarking LLMs in Complex Financial Problem Solving and Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Zhihan Zhang
Yixin Cao
Lizi Liao
AIFin
316
8
0
20 Aug 2025
Vision-G1: Towards General Vision Language Reasoning with Multi-Domain Data Curation
Yuheng Zha
Kun Zhou
Yujia Wu
Yushu Wang
Jie Feng
Zhi Xu
Shibo Hao
Zhengzhong Liu
Eric P. Xing
Zhiting Hu
LRM VLM
252
4
0
18 Aug 2025