MMBench: Is Your Multi-modal Model an All-around Player?

European Conference on Computer Vision (ECCV), 2024
12 July 2023
Yuanzhan Liu
Haodong Duan
Yuanhan Zhang
Yue Liu
Songyang Zhang
Wangbo Zhao
Yike Yuan
Yuan Liu
Conghui He
Ziwei Liu
Kai-xiang Chen
Dahua Lin

Papers citing "MMBench: Is Your Multi-modal Model an All-around Player?"

50 / 685 papers shown
ERGO: Efficient High-Resolution Visual Understanding for Vision-Language Models
Jewon Lee
Wooksu Shin
Seungmin Yang
Ki-Ung Song
Donguk Lim
Jaeyeon Kim
Tae-Ho Kim
Bo-Kyeong Kim
LRM, VLM
103
0
0
26 Sep 2025
Discrete Guidance Matching: Exact Guidance for Discrete Flow Matching
Zhengyan Wan
Yidong Ouyang
Liyan Xie
Fang Fang
Hongyuan Zha
Guang Cheng
159
0
0
26 Sep 2025
Benchmarking MLLM-based Web Understanding: Reasoning, Robustness and Safety
Junliang Liu
Jingyu Xiao
Wenxin Tang
Wenxuan Wang
Zhixian Wang
Minrui Zhang
Shuanghe Yu
LRM
127
1
0
26 Sep 2025
OmniBridge: Unified Multimodal Understanding, Generation, and Retrieval via Latent Space Alignment
Teng Xiao
Zuchao Li
Lefei Zhang
178
1
0
23 Sep 2025
How Far are VLMs from Visual Spatial Intelligence? A Benchmark-Driven Perspective
S. Yu
Yuxin Chen
Hao Ju
Lianjie Jia
Fuxi Zhang
...
Lin Song
Lijun Wang
Yanwei Li
Y. Shan
Huchuan Lu
LRM
319
9
0
23 Sep 2025
Rule Encoding and Compliance in Large Language Models: An Information-Theoretic Analysis
Joachim Diederich
204
0
0
23 Sep 2025
Reading Images Like Texts: Sequential Image Understanding in Vision-Language Models
Yueyan Li
Chenggong Zhao
Zeyuan Zang
Caixia Yuan
Xiaojie Wang
VLM
126
0
0
23 Sep 2025
BaseReward: A Strong Baseline for Multimodal Reward Model
Yi-Fan Zhang
HaiHua Yang
Huanyu Zhang
Yang Shi
Z. Chen
...
Xu Wang
Jianfei Pan
Haotian Wang
Zhang Zhang
Liang Wang
OffRL
128
1
0
19 Sep 2025
ORIC: Benchmarking Object Recognition under Contextual Incongruity in Large Vision-Language Models
Zhaoyang Li
Z. Ling
Yuchen Zhou
Litian Gong
Erdem Bıyık
H. Su
211
0
0
19 Sep 2025
Pyramid Token Pruning for High-Resolution Large Vision-Language Models via Region, Token, and Instruction-Guided Importance
Yuxuan Liang
Xu Li
Xiaolei Chen
Yi Zheng
Haotian Chen
Bin Li
Xiangyang Xue
VLM
152
0
0
19 Sep 2025
Qianfan-VL: Domain-Enhanced Universal Vision-Language Models
Daxiang Dong
Mingming Zheng
Dong Xu
Bairong Zhuang
W. Zhang
...
Ruchang Yao
Ziye Yuan
J. Wu
Guangjun Xie
Dou Shen
VLM
95
1
0
19 Sep 2025
Towards Rationale-Answer Alignment of LVLMs via Self-Rationale Calibration
Yuanchen Wu
Ke Yan
Shouhong Ding
Ziyin Zhou
Xiaoqiang Li
LRM
102
0
0
17 Sep 2025
Diving into Mitigating Hallucinations from a Vision Perspective for Large Vision-Language Models
Weihang Wang
Xinhao Li
Ziyue Wang
Yan Pang
Jielei Zhang
Peiyi Li
Qiang Zhang
Longwen Gao
VLM
168
1
0
17 Sep 2025
SAIL-VL2 Technical Report
Weijie Yin
Yongjie Ye
Fangxun Shu
Yue Liao
Zijian Kang
...
Han Wang
Wenzhuo Liu
Xiao Liang
Shuicheng Yan
Chao Feng
LRM, VLM
293
4
0
17 Sep 2025
HERO: Rethinking Visual Token Early Dropping in High-Resolution Large Vision-Language Models
Xu Li
Yuxuan Liang
Xiaolei Chen
Yi Zheng
Haotian Chen
Bin Li
Xiangyang Xue
VLM
185
0
0
16 Sep 2025
MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe
Tianyu Yu
Zefan Wang
Chongyi Wang
Fuwei Huang
Wenshuo Ma
...
Ning Ding
Xu Han
Xingtai Lv
Zhiyuan Liu
Maosong Sun
MLLM, VLM
197
22
0
16 Sep 2025
AsyMoE: Leveraging Modal Asymmetry for Enhanced Expert Specialization in Large Vision-Language Models
Heng Zhang
Haichuan Hu
Yaomin Shen
Weihao Yu
Yilei Yuan
...
Zijian Zhang
Lubin Gan
Huihui Wei
Hao Zhang
Jin Huang
MoE
333
0
0
16 Sep 2025
The LLM Already Knows: Estimating LLM-Perceived Question Difficulty via Hidden Representations
Yubo Zhu
Dongrui Liu
Zecheng Lin
Wei Tong
Sheng Zhong
Jing Shao
122
2
0
16 Sep 2025
MVQA-68K: A Multi-dimensional and Causally-annotated Dataset with Quality Interpretability for Video Assessment
Yanyun Pu
Kehan Li
Zeyi Huang
Zhijie Zhong
Kaixiang Yang
VGen
116
0
0
15 Sep 2025
MindVL: Towards Efficient and Effective Training of Multimodal Large Language Models on Ascend NPUs
Feilong Chen
Y. Liu
Yi Huang
Hao Wang
Miren Tian
Ya-Qi Yu
Minghui Liao
Jihao Wu
MLLM, VLM
318
1
0
15 Sep 2025
Seeing is Not Understanding: A Benchmark on Perception-Cognition Disparities in Large Language Models
Haokun Li
Yazhou Zhang
Jizhi Ding
Qiuchi Li
Peng Zhang
155
0
0
14 Sep 2025
The Telephone Game: Evaluating Semantic Drift in Unified Models
Sabbir Mollah
Rohit Gupta
S. Swetha
Qingyang Liu
Ahnaf Munir
Mubarak Shah
VLM
167
1
0
04 Sep 2025
Mitigating Multimodal Hallucinations via Gradient-based Self-Reflection
Shan Wang
Maying Shen
Nadine Chang
Chuong H. Nguyen
Hongdong Li
J. Álvarez
259
0
0
03 Sep 2025
OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation
Han Li
Xinyu Peng
Y. Wang
Zelin Peng
Xin Chen
Rongxiang Weng
Jingang Wang
Xunliang Cai
Wenrui Dai
Hongkai Xiong
MLLM, OffRL
364
12
0
03 Sep 2025
VLMs-in-the-Wild: Bridging the Gap Between Academic Benchmarks and Enterprise Reality
Srihari Bandraupalli
Anupam Purwar
VLM
59
1
0
03 Sep 2025
Understanding Space Is Rocket Science -- Only Top Reasoning Models Can Solve Spatial Understanding Tasks
Nils Hoehing
Mayug Maniparambil
Ellen Rushe
Noel E. O'Connor
Anthony Ventresque
LRM
190
0
0
02 Sep 2025
Implicit Reasoning in Large Language Models: A Comprehensive Survey
Jindong Li
Yali Fu
Li Fan
Jiahong Liu
Yao Shu
Chengwei Qin
Menglin Yang
Irwin King
Rex Ying
OffRL, LRM, AI4CE
213
12
0
02 Sep 2025
Variation-aware Vision Token Dropping for Faster Large Vision-Language Models
Junjie Chen
Xuyang Liu
Zichen Wen
Yiyu Wang
Siteng Huang
Honggang Chen
MQ, VLM
78
5
0
01 Sep 2025
Kwai Keye-VL 1.5 Technical Report
Biao Yang
Bin Wen
Boyang Ding
Changyi Liu
Chenglong Chu
...
S. Wang
X. Luo
Yan Li
Yuhang Hu
Zixing Zhang
VLM
326
15
0
01 Sep 2025
Robix: A Unified Model for Robot Interaction, Reasoning and Planning
Huang Fang
Mengxi Zhang
Heng Dong
Wei Li
Z. Wang
Qifeng Zhang
Xueyun Tian
Yucheng Hu
Hang Li
LM&Ro, LRM
168
7
0
01 Sep 2025
Improving Large Vision and Language Models by Learning from a Panel of Peers
J. Hernandez
Jing Shi
Simon Jenni
Vicente Ordonez
Kushal Kafle
129
1
0
01 Sep 2025
Reinforced Visual Perception with Tools
Zetong Zhou
Dongping Chen
Zixian Ma
Zhihan Hu
Mingyang Fu
Sinan Wang
Yao Wan
Zhou Zhao
Ranjay Krishna
OffRL, VLM, LRM
155
11
0
01 Sep 2025
TrimTokenator: Towards Adaptive Visual Token Pruning for Large Multimodal Models
Hao Zhang
Mengsi Lyu
Chenrui He
Yulong Ao
Yonghua Lin
VLM
181
1
0
30 Aug 2025
R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning
Qi Yang
Bolin Ni
Shiming Xiang
Han Hu
Houwen Peng
Jie Jiang
LRM
187
5
0
28 Aug 2025
Improving Alignment in LVLMs with Debiased Self-Judgment
Sihan Yang
Chenhang Cui
Zihao Zhao
Yiyang Zhou
Weilong Yan
Ying Wei
Huaxiu Yao
209
0
0
28 Aug 2025
SUMMA: A Multimodal Large Language Model for Advertisement Summarization
Weitao Jia
Shuo Yin
Zhoufutu Wen
Han Wang
Zehui Dai
Kun Zhang
Zhenyu Li
Tao Zeng
Xiaohui Lv
130
0
0
28 Aug 2025
KRETA: A Benchmark for Korean Reading and Reasoning in Text-Rich VQA Attuned to Diverse Visual Contexts
Taebaek Hwang
Minseo Kim
Gisang Lee
Seonuk Kim
Hyunjun Eun
VLM
151
0
0
27 Aug 2025
PRISM: Robust VLM Alignment with Principled Reasoning for Integrated Safety in Multimodality
Nanxi Li
Zhengyue Zhao
Chaowei Xiao
LRM
60
0
0
26 Aug 2025
MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs
Sixun Dong
Juhua Hu
Mian Zhang
Ming Yin
Yanjie Fu
Qi Qian
111
4
0
25 Aug 2025
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Weiyun Wang
Zhangwei Gao
Lixin Gu
Hengjun Pu
Long Cui
...
Bowen Zhou
Kai Chen
Yu Qiao
Wenhai Wang
Gen Luo
MLLM, LRM
294
265
0
25 Aug 2025
VISA: Group-wise Visual Token Selection and Aggregation via Graph Summarization for Efficient MLLMs Inference
Pengfei Jiang
Hanjun Li
Linglan Zhao
Fei Chao
Ke Yan
Shouhong Ding
Rongrong Ji
116
2
0
25 Aug 2025
Language-Specific Layer Matters: Efficient Multilingual Enhancement for Large Vision-Language Models
Yuchun Fan
Yilin Wang
Yongyu Mu
Daigang Xu
Bei Li
Xiaocheng Feng
Tong Xiao
Jingbo Zhu
96
0
0
25 Aug 2025
AVAM: Universal Training-free Adaptive Visual Anchoring Embedded into Multimodal Large Language Model for Multi-image Question Answering
Kang Zeng
Guojin Zhong
Jintao Cheng
Jin Yuan
Zhiyong Li
135
0
0
25 Aug 2025
Scene-Aware Vectorized Memory Multi-Agent Framework with Cross-Modal Differentiated Quantization VLMs for Visually Impaired Assistance
Xiangxiang Wang
Xuanyu Wang
YiJia Luo
Yongbin Yu
Manping Fan
Jingtao Zhang
Liyong Ren
118
1
0
25 Aug 2025
Explain Before You Answer: A Survey on Compositional Visual Reasoning
Fucai Ke
Joy Hsu
Zhixi Cai
Zixian Ma
Xin Zheng
...
P. D. Haghighi
Gholamreza Haffari
Ranjay Krishna
Jiajun Wu
H. Rezatofighi
ReLM, CoGe, LRM
355
8
0
24 Aug 2025
Towards Open World Detection: A Survey
Andrei-Stefan Bulzan
Cosmin Cernazanu-Glavan
ObjD, VLM
215
0
0
22 Aug 2025
Unveiling Trust in Multimodal Large Language Models: Evaluation, Analysis, and Mitigation
Yichi Zhang
Yao Huang
Yifan Wang
Yitong Sun
Chang-rui Liu
...
Xiao Yang
Xingxing Wei
Hang Su
Yinpeng Dong
Jun Zhu
158
1
0
21 Aug 2025
Directed-Tokens: A Robust Multi-Modality Alignment Approach to Large Language-Vision Models
Thanh-Dat Truong
Huu-Thien Tran
Tran Thai Son
Bhiksha Raj
Khoa Luu
292
1
0
19 Aug 2025
Holistic Evaluation of Multimodal LLMs on Spatial Intelligence
Zhongang Cai
Yubo Wang
Qingping Sun
Ruisi Wang
Chenyang Gu
...
Quan-ding Wang
Dahua Lin
Lei Yang
Dahua Lin
L. Yang
ELM
259
0
0
18 Aug 2025
RadarQA: Multi-modal Quality Analysis of Weather Radar Forecasts
Xuming He
Zhiyuan You
Junchao Gong
Couhua Liu
Xiaoyu Yue
Peiqin Zhuang
Wenlong Zhang
Wenlong Zhang
92
3
0
17 Aug 2025