2502.09621
Cited By
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
13 February 2025
Dongzhi Jiang
Renrui Zhang
Ziyu Guo
Yanwei Li
Yu Qi
Xinyan Chen
Liuhui Wang
Jianhan Jin
Claire Guo
Shen Yan
Bo Zhang
Chaoyou Fu
Peng Gao
Jiaming Song
MLLM
LRM
ArXiv (abs)
PDF
HTML
HuggingFace (28 upvotes)
Github
Papers citing
"MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency"
50 / 70 papers shown
When to Think and When to Look: Uncertainty-Guided Lookback
Jing Bi
Filippos Bellos
Junjia Guo
Yayuan Li
Chao Huang
...
Luchuan Song
Susan Liang
Zhongfei Zhang
MLLM
LRM
343
1
0
30 Mar 2026
Reinforcement Learning for Large Model: A Survey
Weijia Wu
Chen Gao
Joya Chen
Kevin Lin
Qingwei Meng
Yiming Zhang
Yuke Qiu
Hong Zhou
Mike Zheng Shou
408
2
0
24 Dec 2025
DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation
Dongzhi Jiang
Renrui Zhang
Haodong Li
Zhuofan Zong
Ziyu Guo
...
J. C. Ye
Rongyao Fang
Weijia Li
R. Liu
Hongsheng Li
AI4TS
VLM
LRM
201
2
0
04 Dec 2025
Probing the "Psyche" of Large Reasoning Models: Understanding Through a Human Lens
Yuxiang Chen
Zuohan Wu
Z. Wang
Xiangning Yu
Xujia Li
Linyi Yang
Mengyue Yang
Jun Wang
Lei Chen
LRM
207
0
0
30 Nov 2025
AgroCoT: A Chain-of-Thought Benchmark for Evaluating Reasoning in Vision-Language Models for Agriculture
Yibin Wen
Qingmei Li
Zi Ye
Jiarui Zhang
Jing Wu
...
Haohuan Fu
Jianxi Huang
Juepeng Zheng
LRM
205
0
0
28 Nov 2025
DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action
Zhen Fang
Zhuoyang Liu
Jiaming Liu
Hao Chen
Y. Zeng
Shiting Huang
Zehui Chen
L. Chen
Shanghang Zhang
Feng Zhao
LRM
147
4
0
27 Nov 2025
Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following
Tianyi Xiong
Yi Ge
Ming Li
Zuolong Zhang
Pranav Kulkarni
...
Yanshuo Chen
X. Wang
Renrui Zhang
Wenhu Chen
Heng Huang
ELM
277
5
0
26 Nov 2025
Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks
Cheng Yang
Haiyuan Wan
Yiran Peng
Xin Cheng
Quan Shi
...
Junchi Yu
Xinlei Yu
Xiawu Zheng
D. Zhou
Chenglin Wu
ReLM
LRM
378
9
0
19 Nov 2025
From Perception to Reasoning: Deep Thinking Empowers Multimodal Large Language Models
Wenxin Zhu
Andong Chen
Yuchen Song
Kehai Chen
Conghui Zhu
Ziyan Chen
Tiejun Zhao
LRM
529
1
0
17 Nov 2025
Diagnosing Hallucination Risk in AI Surgical Decision-Support: A Sequential Framework for Sequential Validation
Dong Chen
Yanzhe Wei
Zonglin He
Guan-Ming Kuang
Canhua Ye
Meiru An
Huili Peng
Yong Hu
Huiren Tao
Kenneth MC. Cheung
LRM
131
2
0
01 Nov 2025
MemeArena: Automating Context-Aware Unbiased Evaluation of Harmfulness Understanding for Multimodal Large Language Models
Zixin Chen
Hongzhan Lin
Kaixin Li
Ziyang Luo
Yayue Deng
Jing Ma
151
0
0
31 Oct 2025
Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark
Ziyu Guo
Xinyan Chen
Renrui Zhang
Ruichuan An
Yu Qi
Dongzhi Jiang
Xiangtai Li
M. Zhang
Jiaming Song
Pheng-Ann Heng
VGen
LRM
241
25
0
30 Oct 2025
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning
Jiawei Gu
Yunzhuo Hao
Huichen Will Wang
Linjie Li
Michael Qizhe Shieh
Yejin Choi
Ranjay Krishna
Yu Cheng
LM&Ro
LRM
440
15
0
30 Oct 2025
PRISM-Bench: A Benchmark of Puzzle-Based Visual Tasks with CoT Error Detection
Yusu Qian
Cheng Wan
Chao Jia
Yinfei Yang
Qingyu Zhao
Zhe Gan
LRM
ReLM
575
3
0
27 Oct 2025
S-Chain: Structured Visual Chain-of-Thought For Medicine
Khai Le-Duc
Duy Khuong Nguyen
Phuong T. H. Trinh
T. Nguyen
Nghiem Tuong Diep
...
P. Xie
Daniel Sonntag
James Y. Zou
Mathias Niepert
Anh Totti Nguyen
LRM
175
4
0
26 Oct 2025
NoisyGRPO: Incentivizing Multimodal CoT Reasoning via Noise Injection and Bayesian Estimation
Longtian Qiu
Shan Ning
Jiaxuan Sun
Xuming He
NoLa
OffRL
LRM
525
4
0
24 Oct 2025
What Defines Good Reasoning in LLMs? Dissecting Reasoning Steps with Multi-Aspect Evaluation
Heejin Do
Jaehui Hwang
Dongyoon Han
Seong Joon Oh
Sangdoo Yun
ELM
LRM
281
4
1
23 Oct 2025
A Survey on Agentic Multimodal Large Language Models
Huanjin Yao
Ruifei Zhang
Jiaxing Huang
Jingyi Zhang
Yibo Wang
...
Ruolin Zhu
Yongcheng Jing
Shunyu Liu
Guanbin Li
Dacheng Tao
LM&Ro
AIFin
AI4TS
LRM
AI4CE
302
12
0
13 Oct 2025
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs
Caorui Li
Yu Chen
Yiyan Ji
Jin Xu
Zhenyu Cui
...
Minghao Liu
Junran Peng
Zhaoxiang Zhang
Jiaheng Liu
AuLLM
LRM
241
16
0
12 Oct 2025
BLINK-Twice: You see, but do you observe? A Reasoning Benchmark on Visual Perception
Junyan Ye
Dongzhi Jiang
Jun-Jian He
Baichuan Zhou
Zilong Huang
Zhiyuan Yan
Jiaming Song
Conghui He
Weijia Li
ReLM
VLM
LRM
184
4
0
10 Oct 2025
AccidentBench: Benchmarking Multimodal Understanding and Reasoning in Vehicle Accidents and Beyond
Shangding Gu
Xiaohan Wang
Donghao Ying
Haoyu Zhao
Runing Yang
...
Marco Pavone
Serena Yeung-Levy
Jun Wang
Dawn Song
C. Spanos
147
2
0
30 Sep 2025
Visual CoT Makes VLMs Smarter but More Fragile
Chunxue Xu
Yiwei Wang
Yujun Cai
Bryan Hooi
Songze Li
MLLM
LRM
177
0
0
28 Sep 2025
Understanding-in-Generation: Reinforcing Generative Capability of Unified Model via Infusing Understanding into Generation
Yuanhuiyi Lyu
Chi Kit Wong
Chenfei Liao
Lutao Jiang
Xu Zheng
Zexin Lu
Linfeng Zhang
Xuming Hu
401
4
0
23 Sep 2025
From Easy to Hard: The MIR Benchmark for Progressive Interleaved Multi-Image Reasoning
Hang Du
Jiayang Zhang
Guoshun Nan
Wendi Deng
Zhenyan Chen
...
Wang Xiao
Shan Huang
Yuqi Pan
Tao Qi
Sicong Leng
VLM
275
2
0
21 Sep 2025
DreamPRM-1.5: Unlocking the Potential of Each Instance for Multimodal Process Reward Model Training
Qi Cao
P. Xie
OffRL
202
0
0
05 Sep 2025
MME-SCI: A Comprehensive and Challenging Science Benchmark for Multimodal Large Language Models
Jiacheng Ruan
Dan Jiang
Xian Gao
Ting Liu
Yuzhuo Fu
Yangyang Kang
ELM
LRM
176
2
0
19 Aug 2025
RISE: Enhancing VLM Image Annotation with Self-Supervised Reasoning
Suhang Hu
Wei Hu
Yuhang Su
Fan Zhang
ReLM
LRM
VLM
439
0
0
17 Aug 2025
Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation
Junyan Ye
Shihong Deng
Zihao Wang
Leqi Zhu
Zhenghao Hu
...
Zhiyuan Yan
Jinghua Yu
Jiaming Song
Conghui He
Weijia Li
VLM
281
61
0
13 Aug 2025
MME-Emotion: A Holistic Evaluation Benchmark for Emotional Intelligence in Multimodal Large Language Models
Fan Zhang
Minghan Li
Chong Deng
Xue Yang
Zheng Lian
...
Xian Wu
Kun Wang
Xiangang Li
Jieping Ye
Pheng-Ann Heng
AI4MH
270
5
0
11 Aug 2025
Humans Perceive Wrong Narratives from AI Reasoning Texts
Mosh Levy
Zohar Elyoseph
Yoav Goldberg
226
6
0
09 Aug 2025
ConfProBench: A Confidence Evaluation Benchmark for MLLM-Based Process Judges
Yue Zhou
Yi-Ju Chang
Yuan Wu
LRM
112
1
0
06 Aug 2025
FinMMR: Make Financial Numerical Reasoning More Multimodal, Comprehensive, and Challenging
Zichen Tang
Haihong E
Jiacheng Liu
Zhongjun Yang
Rongjin Li
...
Yiling Huang
Xinyi Hu
Qing Huang
Zijian Xie
Shiyao Peng
205
6
0
06 Aug 2025
Cognitive Chain-of-Thought: Structured Multimodal Reasoning about Social Situations
Eunkyu Park
Wesley Hanwen Deng
Gunhee Kim
Motahhare Eslami
Maarten Sap
LRM
201
3
0
27 Jul 2025
An Agentic System for Rare Disease Diagnosis with Traceable Reasoning
W. Zhao
Chaoyi Wu
Yanjie Fan
Xiaoman Zhang
Pengcheng Qiu
...
Xin Sun
Ya Zhang
Yongguo Yu
Kun Sun
Weidi Xie
LRM
201
20
0
25 Jun 2025
MSR-Align: Policy-Grounded Multimodal Alignment for Safety-Aware Reasoning in Vision-Language Models
Yinan Xia
Yilei Jiang
Yingshui Tan
Xiaoyong Zhu
Xiangyu Yue
Bo Zheng
LRM
185
3
0
24 Jun 2025
Mimicking or Reasoning: Rethinking Multi-Modal In-Context Learning in Vision-Language Models
Chengyue Huang
Yuchen Zhu
Sichen Zhu
Jingyun Xiao
Moises Andrade
Shivang Chopra
Z. Kira
ReLM
VLM
LRM
191
5
0
09 Jun 2025
Seeing is Not Reasoning: MVPBench for Graph-based Evaluation of Multi-path Visual Physical CoT
Zhuobai Dong
Junchao Yi
Ziyuan Zheng
Haochen Han
Xiangxi Zheng
Alex Jinpeng Wang
Fangming Liu
Linjie Li
ReLM
LRM
314
2
0
30 May 2025
Reinforcing Video Reasoning with Focused Thinking
Jisheng Dang
Jingze Wu
T. Wang
Xuanhui Lin
Nannan Zhu
Hongbo Chen
Wei-Shi Zheng
Meng Wang
Tat-Seng Chua
OffRL
LRM
410
18
0
30 May 2025
THINK-Bench: Evaluating Thinking Efficiency and Chain-of-Thought Quality of Large Reasoning Models
Zhiyuan Li
Yi-Ju Chang
Yuan Wu
LLMAG
LRM
286
9
0
28 May 2025
DreamPRM: Domain-Reweighted Process Reward Model for Multimodal Reasoning
Qi Cao
Ruiyi Wang
Ruiyi Zhang
Sai Ashish Somayajula
P. Xie
LRM
479
13
0
26 May 2025
Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models
Haoyuan Sun
Jiaqi Wu
Bo Xia
Yifu Luo
Yifei Zhao
Kai Qin
Xufei Lv
Tiantian Zhang
Yongzhe Chang
Xueqian Wang
OffRL
LRM
489
11
0
24 May 2025
GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains
C. Wang
Xiaojun Ye
Xiaoran Pan
Zihao Pan
Haofan Wang
Yiren Song
LRM
660
8
0
24 May 2025
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
Shilin Yan
Jiaming Han
Joey Tsai
Hongwei Xue
Rongyao Fang
Lingyi Hong
Ziyu Guo
Ray Zhang
VLM
364
10
0
22 May 2025
Bridging the Dynamic Perception Gap: Training-Free Draft Chain-of-Thought for Dynamic Multimodal Spatial Reasoning
Siqu Ou
Hongcheng Liu
Pingjie Wang
Yusheng Liao
Chuan Xuan
Yanfeng Wang
Yu Wang
LRM
229
1
0
22 May 2025
ViC-Bench: Benchmarking Visual-Interleaved Chain-of-Thought Capability in MLLMs with Free-Style Intermediate State Representations
Xuecheng Wu
Jiaxing Liu
Danlei Huang
Xiaoyu Li
Yifan Wang
...
Liya Ma
Xuezhi Cao
Junxiao Xue
Hairong Dong
Dingkang Yang
LRM
471
8
0
20 May 2025
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
Shihong Deng
Ziyu Guo
Renrui Zhang
Zhuofan Zong
Hao Li
Le Zhuo
Shilin Yan
Pheng-Ann Heng
Haoyang Li
LRM
536
121
0
01 May 2025
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
Guanghao Zhou
Panjia Qiu
Chong Chen
Jiadong Wang
Zheming Yang
Jian Xu
Minghui Qiu
OffRL
LRM
673
26
0
30 Apr 2025
GenCLS++: Pushing the Boundaries of Generative Classification in LLMs Through Comprehensive SFT and RL Studies Across Diverse Datasets
Mingqian He
Fei Zhao
Chonggang Lu
Ziqiang Liu
Yun Wang
Haofu Qian
OffRL
AI4TS
VLM
352
4
0
28 Apr 2025
TrustGeoGen: Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving
Daocheng Fu
Zijun Chen
Renqiu Xia
Qi Liu
...
Ding Wang
Junchi Yan
Botian Shi
Yu Qiao
Bo Zhang
LRM
488
4
0
22 Apr 2025
Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark
Enxin Song
Wenhao Chai
Weili Xu
Jianwen Xie
Yuxuan Liu
Gaoang Wang
529
31
0
20 Apr 2025