ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection

6 October 2024
Yibo Yan
Shen Wang
Jiahao Huo
Hang Li
Yangqiu Song
Jiamin Su
Xiong Gao
Yi-Fan Zhang
Tianlong Xu
Zhendong Chu
Aoxiao Zhong
Kun Wang
Hui Xiong
Philip S. Yu
Xuming Hu
Qingsong Wen
    LRM

Papers citing "ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection"

25 / 25 papers shown
From Perception to Reasoning: Deep Thinking Empowers Multimodal Large Language Models
Wenxin Zhu
Andong Chen
Yuchen Song
Kehai Chen
Conghui Zhu
Ziyan Chen
Tiejun Zhao
LRM
402
0
0
17 Nov 2025
FractalBench: Diagnosing Visual-Mathematical Reasoning Through Recursive Program Synthesis
Jan Ondras
Marek Šuppa
LRM
64
0
0
09 Nov 2025
DocPruner: A Storage-Efficient Framework for Multi-Vector Visual Document Retrieval via Adaptive Patch-Level Embedding Pruning
Yibo Yan
Guangwei Xu
Xin Zou
Shuliang Liu
James Kwok
Xuming Hu
172
4
0
28 Sep 2025
Enhancing Scientific Visual Question Answering via Vision-Caption aware Supervised Fine-Tuning
Janak Kapuriya
Anwar Shaikh
Arnav Goel
Medha Hira
Apoorv Singh
...
Sanjana
Vaibhav Nauriyal
Avinash Anand
Zhengkui Wang
R. Shah
72
0
0
20 Sep 2025
Can Large Multimodal Models Actively Recognize Faulty Inputs? A Systematic Evaluation Framework of Their Input Scrutiny Ability
Haiqi Yang
Jinzhe Li
Gengxu Li
Yi-Ju Chang
Yuan Wu
MLLM
89
2
0
06 Aug 2025
GM-PRM: A Generative Multimodal Process Reward Model for Multimodal Mathematical Reasoning
Jianghangfan Zhang
Yibo Yan
Kening Zheng
Xin Zou
Song Dai
Xuming Hu
LRM
212
3
0
06 Aug 2025
VER-Bench: Evaluating MLLMs on Reasoning with Fine-Grained Visual Evidence
Chenhui Qiang
Zhaoyang Wei
Xumeng Han
Zipeng Wang
Siyao Li
Xiangyuan Lan
Jianbin Jiao
Zhenjun Han
LRM
64
2
0
06 Aug 2025
Hop, Skip, and Overthink: Diagnosing Why Reasoning Models Fumble during Multi-Hop Analysis
Anushka Yadav
Isha Nalawade
Srujana Pillarichety
Yashwanth Babu
Reshmi Ghosh
Samyadeep Basu
Wenlong Zhao
Ali Nasaeh
S. Balasubramanian
Soundararajan Srinivasan
LRM
57
0
0
06 Aug 2025
Mis-prompt: Benchmarking Large Language Models for Proactive Error Handling
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jiayi Zeng
Yizhe Feng
Mengliang He
Wenhui Lei
Wei Zhang
Zeming Liu
Xiaoming Shi
Aimin Zhou
LRM
147
0
0
29 May 2025
Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities
Junyan Zhang
Yubo Gao
Yibo Yan
Jia-Chen Gu
Zhaorui Hou
...
Qi Zheng
Song Dai
Yonghua Hei
Junzhuo Li
Xuming Hu
187
1
0
27 May 2025
PhysicsArena: The First Multimodal Physics Reasoning Benchmark Exploring Variable, Process, and Solution Dimensions
Song Dai
Yibo Yan
Jiamin Su
Dongfang Zihao
Yubo Gao
...
Jia-Chen Gu
Junyan Zhang
Sicheng Tao
Zhuoran Gao
Xuming Hu
LRM
AI4CE
246
4
0
21 May 2025
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency
Zhikai Wang
Jiashuo Sun
Weinan Zhang
Zhiqiang Hu
Xin Li
F. Wang
Deli Zhao
VLM
LRM
401
6
0
24 Apr 2025
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
Wulin Xie
Yujiao Shi
Chaoyou Fu
Yang Shi
Bingyan Nie
Hongkai Chen
Zheng Zhang
Liang Wang
Tieniu Tan
344
7
0
04 Apr 2025
UniEDU: A Unified Language and Vision Assistant for Education Applications
Zhendong Chu
Jian Xie
Shen Wang
Liang Luo
Qingsong Wen
AI4Ed
340
2
0
26 Mar 2025
MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection
Yibo Yan
Shen Wang
Jiahao Huo
Philip S. Yu
Xuming Hu
Qingsong Wen
664
21
0
23 Mar 2025
LLM Agents for Education: Advances and Applications
Zhendong Chu
Shen Wang
Jian Xie
Tinghui Zhu
Yibo Yan
...
Aoxiao Zhong
Xuming Hu
Jing Liang
Philip S. Yu
Qingsong Wen
LLMAG
ELM
312
41
0
14 Mar 2025
From Correctness to Comprehension: AI Agents for Personalized Error Diagnosis in Education
Yi-Fan Zhang
Hang Li
D. Song
Shunian Chen
Tianlong Xu
Qingsong Wen
LLMAG
LRM
299
3
0
20 Feb 2025
SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Junkai Chen
Zhijie Deng
Kening Zheng
Yibo Yan
Qi Zheng
PeiJun Wu
Peijie Jiang
Qingbin Liu
Xuming Hu
MU
434
16
0
18 Feb 2025
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Ruilin Luo
Zhuofan Zheng
Yifan Wang
Xinzhe Ni
Zicheng Lin
...
Yiyao Yu
C. Shi
Ruihang Chu
Jin Zeng
Yujiu Yang
LRM
686
34
0
08 Jan 2025
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Mingyang Song
Zhaochen Su
Xiaoye Qu
Jiawei Zhou
Yu Cheng
LRM
606
63
0
06 Jan 2025
A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine
Information Fusion (Inf. Fusion), 2024
Hanguang Xiao
Feizhong Zhou
Xianglong Liu
Tianqi Liu
Zhipeng Li
Xin Liu
Xiaoxuan Huang
AILaw
LM&MA
LRM
403
74
0
31 Dec 2024
Ask-Before-Detection: Identifying and Mitigating Conformity Bias in LLM-Powered Error Detector for Math Word Problem Solutions
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Hang Li
Tianlong Xu
Kaiqi Yang
Yucheng Chu
Yanling Chen
Yichi Song
Qingsong Wen
Hui Liu
241
4
0
22 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
Shijie Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
393
48
0
03 Dec 2024
Yi: Open Foundation Models by 01.AI
01.AI
Alex Young
Bei Chen
Chao Li
...
Yue Wang
Yuxuan Cai
Zhenyu Gu
Zhiyuan Liu
Zonghong Dai
OSLM
LRM
773
754
0
07 Mar 2024
Large Language Models: A Survey
Shervin Minaee
Tomas Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
751
735
0
09 Feb 2024