ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.14624
  4. Cited By
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual
  Math Problems?

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

21 March 2024
Renrui Zhang
Dongzhi Jiang
Yichi Zhang
Haokun Lin
Ziyu Guo
Pengshuo Qiu
Aojun Zhou
Pan Lu
Kai-Wei Chang
Peng Gao
Hongsheng Li
ArXivPDFHTML

Papers citing "MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?"

42 / 42 papers shown
Title
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
D. Jiang
Ziyu Guo
Renrui Zhang
Zhuofan Zong
Hao Li
Le Zhuo
Shilin Yan
Pheng-Ann Heng
H. Li
LRM
57
0
0
01 May 2025
Fast-Slow Thinking for Large Vision-Language Model Reasoning
Fast-Slow Thinking for Large Vision-Language Model Reasoning
W. L. Xiao
Leilei Gan
Weilong Dai
Wanggui He
Ziwei Huang
...
Fangxun Shu
Zhelun Yu
Peng Zhang
Hao Jiang
Fei Wu
ReLM
LRM
AI4CE
60
0
0
25 Apr 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Z. Liu
Shenglong Ye
...
D. Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
W. Wang
MLLM
VLM
57
6
1
14 Apr 2025
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Haozhe Wang
C. Qu
Zuming Huang
Wei Chu
Fangzhen Lin
Wenhu Chen
OffRL
ReLM
SyDa
LRM
VLM
66
1
0
10 Apr 2025
OmniCaptioner: One Captioner to Rule Them All
OmniCaptioner: One Captioner to Rule Them All
Yiting Lu
Jiakang Yuan
Zhen Li
Shitian Zhao
Qi Qin
...
Lei Bai
Zhibo Chen
Peng Gao
Bo Zhang
Peng Gao
MLLM
76
0
0
09 Apr 2025
Concept-as-Tree: Synthetic Data is All You Need for VLM Personalization
Concept-as-Tree: Synthetic Data is All You Need for VLM Personalization
Ruichuan An
Kai Zeng
Ming Lu
Sihan Yang
Renrui Zhang
Huitong Ji
Qizhe Zhang
Y. Luo
Hao Liang
Wentao Zhang
54
0
0
17 Mar 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Y. Wang
Shengqiong Wu
Y. Zhang
William Yang Wang
Ziwei Liu
Jiebo Luo
Hao Fei
LRM
69
7
0
16 Mar 2025
CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning
CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning
Hao Cui
Zahra Shamsi
Gowoon Cheon
Xuejian Ma
Shutong Li
...
Eun-Ah Kim
Michael P Brenner
Viren Jain
Sameera Ponda
Subhashini Venugopalan
ELM
LRM
41
0
0
14 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
S. Zhang
58
5
0
13 Mar 2025
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search
Yiming Jia
J. Li
Xiang Yue
Bo Li
Ping Nie
Kai Zou
Wenhu Chen
LRM
70
2
0
13 Mar 2025
Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?
Kun Xiang
Zhili Liu
Zihao Jiang
Yunshuang Nie
Kaixin Cai
...
Yu-Jie Yuan
J. Han
Lanqing Hong
Hang Xu
Xiaodan Liang
ReLM
LRM
49
6
0
08 Mar 2025
LOVA3: Learning to Visual Question Answering, Asking and Assessment
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
56
8
0
21 Feb 2025
Forgotten Polygons: Multimodal Large Language Models are Shape-Blind
Forgotten Polygons: Multimodal Large Language Models are Shape-Blind
William Rudman
Michal Golovanesky
Amir Bar
Vedant Palit
Yann LeCun
Carsten Eickhoff
Ritambhara Singh
LRM
44
2
0
21 Feb 2025
LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation
Junchen Fu
Xuri Ge
Kaiwen Zheng
Ioannis Arapakis
Xin Xin
J. Jose
72
0
0
20 Feb 2025
DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images
Sami Baral
L. Lucy
Ryan Knight
Alice Ng
Luca Soldaini
Neil T. Heffernan
Kyle Lo
36
3
0
28 Jan 2025
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
Yilun Zhao
Lujing Xie
Haowei Zhang
Guo Gan
Yitao Long
...
Xiangru Tang
Zhenwen Liang
Y. Liu
Chen Zhao
Arman Cohan
43
5
0
21 Jan 2025
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Ruilin Luo
Zhuofan Zheng
Yifan Wang
Yiyao Yu
Xinzhe Ni
Zicheng Lin
Jin Zeng
Yujiu Yang
LRM
45
12
0
08 Jan 2025
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning
Beichen Zhang
Yuhong Liu
Xiaoyi Dong
Yuhang Zang
Pan Zhang
Haodong Duan
Yuhang Cao
D. Lin
J. T. Wang
LRM
ReLM
45
2
0
06 Jan 2025
Chimera: Improving Generalist Model with Domain-Specific Experts
Chimera: Improving Generalist Model with Domain-Specific Experts
Tianshuo Peng
M. Li
Hongbin Zhou
Renqiu Xia
Renrui Zhang
...
Aojun Zhou
Botian Shi
Tao Chen
Bo Zhang
Xiangyu Yue
77
4
0
08 Dec 2024
EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios
EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios
Lu Qiu
Yuying Ge
Yi Chen
Yixiao Ge
Ying Shan
Xihui Liu
LLMAG
LRM
78
5
0
05 Dec 2024
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
Di Zhang
Jingdi Lei
Junxian Li
Xunzhi Wang
Y. Liu
...
S. M. I. Simon X. Yang
Jianbo Wu
Peng Ye
Wanli Ouyang
Dongzhan Zhou
OffRL
LRM
89
6
0
27 Nov 2024
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Weiyun Wang
Zhe Chen
Wenhai Wang
Yue Cao
Yangzhou Liu
...
Jinguo Zhu
X. Zhu
Lewei Lu
Yu Qiao
Jifeng Dai
LRM
44
45
1
15 Nov 2024
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
Peng Xia
Siwei Han
Shi Qiu
Yiyang Zhou
Zhaoyang Wang
...
Chenhang Cui
Mingyu Ding
Linjie Li
Lijuan Wang
Huaxiu Yao
31
10
0
14 Oct 2024
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
Wenbo Hu
Jia-Chen Gu
Zi-Yi Dou
Mohsen Fayyaz
Pan Lu
Kai-Wei Chang
Nanyun Peng
VLM
36
4
0
10 Oct 2024
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
Kai Chen
Yunhao Gou
Runhui Huang
Zhili Liu
Daxin Tan
...
Qun Liu
Jun Yao
Lu Hou
Hang Xu
Hang Xu
AuLLM
MLLM
VLM
48
21
0
26 Sep 2024
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin
Xinyu Wei
Renrui Zhang
Le Zhuo
Shitian Zhao
...
Junlin Xie
Junlin Xie
Yu Qiao
Peng Gao
Hongsheng Li
MLLM
DiffM
42
10
0
23 Sep 2024
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models
Kaichen Zhang
Bo Li
Peiyuan Zhang
Fanyi Pu
Joshua Adrian Cahyono
...
Shuai Liu
Yuanhan Zhang
Jingkang Yang
Chunyuan Li
Ziwei Liu
79
73
0
17 Jul 2024
NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models
NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models
Pranshu Pandya
Agney S Talwarr
Vatsal Gupta
Tushar Kataria
Dan Roth
Vivek Gupta
LRM
30
2
0
15 Jul 2024
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers
Shraman Pramanick
Rama Chellappa
Subhashini Venugopalan
30
13
0
12 Jul 2024
AudioBench: A Universal Benchmark for Audio Large Language Models
AudioBench: A Universal Benchmark for Audio Large Language Models
Bin Wang
Xunlong Zou
Geyu Lin
S.
Zhuohan Liu
Wenyu Zhang
Zhengyuan Liu
AiTi Aw
Nancy F. Chen
AuLLM
ELM
LM&MA
82
17
0
23 Jun 2024
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
Zhen Huang
Zengzhi Wang
Shijie Xia
Xuefeng Li
Haoyang Zou
...
Yuxiang Zheng
Shaoting Zhang
Dahua Lin
Yu Qiao
Pengfei Liu
ELM
LRM
26
25
0
18 Jun 2024
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision
  Models
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Byung-Kwan Lee
Chae Won Kim
Beomchan Park
Yonghyun Ro
MLLM
LRM
19
17
0
24 May 2024
MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering
MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering
Avinash Anand
Janak Kapuriya
Chhavi Kirtani
Apoorv Singh
Jay Saraf
Naman Lal
Jatin Kumar
A. Shivam
Astha Verma
R. Shah
OffRL
34
9
0
19 Apr 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
116
106
0
08 Feb 2024
InternLM-XComposer2: Mastering Free-form Text-Image Composition and
  Comprehension in Vision-Language Large Model
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
...
Conghui He
Xingcheng Zhang
Yu Qiao
Dahua Lin
Jiaqi Wang
VLM
MLLM
70
89
0
29 Jan 2024
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with
  Modality Collaboration
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
Qinghao Ye
Haiyang Xu
Jiabo Ye
Mingshi Yan
Anwen Hu
Haowei Liu
Qi Qian
Ji Zhang
Fei Huang
Jingren Zhou
MLLM
VLM
114
367
0
07 Nov 2023
MiniGPT-v2: large language model as a unified interface for
  vision-language multi-task learning
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
152
280
0
14 Oct 2023
VideoLLM: Modeling Video Sequence with Large Language Models
VideoLLM: Modeling Video Sequence with Large Language Models
Guo Chen
Yin-Dong Zheng
Jiahao Wang
Jilan Xu
Yifei Huang
...
Yi Wang
Yali Wang
Yu Qiao
Tong Lu
Limin Wang
MLLM
89
51
0
22 May 2023
mPLUG-Owl: Modularization Empowers Large Language Models with
  Multimodality
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
198
883
0
27 Apr 2023
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
380
4,010
0
28 Jan 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
313
8,261
0
28 Jan 2022
1