Communities
Connect sessions
AI calendar
Organizations
Join Slack
Contact Sales
Search
Open menu
Home
Papers
2307.06281
Cited By
v1
v2
v3
v4 (latest)
MMBench: Is Your Multi-modal Model an All-around Player?
European Conference on Computer Vision (ECCV), 2023
12 July 2023
Yuanzhan Liu
Haodong Duan
Yuanhan Zhang
Yue Liu
Songyang Zhang
Wangbo Zhao
Yike Yuan
Yuan Liu
Conghui He
Ziwei Liu
Kai-xiang Chen
Dahua Lin
Re-assign community
ArXiv (abs)
PDF
HTML
HuggingFace (5 upvotes)
Papers citing
"MMBench: Is Your Multi-modal Model an All-around Player?"
50 / 654 papers shown
Title
HBridge: H-Shape Bridging of Heterogeneous Experts for Unified Multimodal Understanding and Generation
Xiang Wang
Zhifei Zhang
Chentao Song
Zhe Lin
Yuqian Zhou
...
Haitian Zheng
Jason Kuen
Yuehuan Wang
Changxin Gao
Nong Sang
MoE
69
0
0
25 Nov 2025
Parallel Vision Token Scheduling for Fast and Accurate Multimodal LMMs Inference
Wengyi Zhan
Mingbao Lin
Zhihang Lin
Rongrong Ji
MLLM
VLM
LRM
152
0
0
24 Nov 2025
Robot-Powered Data Flywheels: Deploying Robots in the Wild for Continual Data Collection and Foundation Model Adaptation
J. Grannen
Michelle Pan
Kenneth Llontop
Cherie Ho
Mark Zolotas
Jeannette Bohg
Dorsa Sadigh
LM&Ro
202
0
0
24 Nov 2025
ConsistCompose: Unified Multimodal Layout Control for Image Composition
Xuanke Shi
B. Li
Xiaoyang Han
Zhongang Cai
Lei Yang
Dahua Lin
Quan-ding Wang
MLLM
206
0
0
23 Nov 2025
AnyExperts: On-Demand Expert Allocation for Multimodal Language Models with Mixture of Expert
Yuting Gao
Wang Lan
Hengyuan Zhao
Linjiang Huang
Si Liu
Q. Guo
MoE
96
0
0
23 Nov 2025
RoadBench: Benchmarking MLLMs on Fine-Grained Spatial Understanding and Reasoning under Urban Road Scenarios
Jun Zhang
Jie Feng
Long Chen
Junhui Wang
Zhicheng Liu
Depeng Jin
Yong Li
LRM
48
0
0
22 Nov 2025
FastMMoE: Accelerating Multimodal Large Language Models through Dynamic Expert Activation and Routing-Aware Token Pruning
Guoyang Xia
Yifeng Ding
Fengfa Li
Lei Ren
Wei Chen
Fangxiang Feng
Xiaojie Wang
MoE
VLM
72
0
0
22 Nov 2025
Learning to Think Fast and Slow for Visual Language Models
Chenyu Lin
Cheng Chi
Jinlin Wu
Sharon Li
Kaiyang Zhou
ReLM
VLM
161
0
0
20 Nov 2025
VLA-Pruner: Temporal-Aware Dual-Level Visual Token Pruning for Efficient Vision-Language-Action Inference
Ziyan Liu
Y. Chen
Hongyi Cai
Tao Lin
Shuo Yang
Zheng Liu
Bo Zhao
VLM
215
0
0
20 Nov 2025
MoDES: Accelerating Mixture-of-Experts Multimodal Large Language Models via Dynamic Expert Skipping
Yushi Huang
Z. Wang
Zhihang Yuan
Yifu Ding
Ruihao Gong
Jinyang Guo
Xianglong Liu
Jun Zhang
MoE
VLM
156
0
0
19 Nov 2025
Multimodal Evaluation of Russian-language Architectures
Artem Chervyakov
Ulyana Isaeva
Anton A. Emelyanov
Artem Safin
Maria Tikhonova
...
Ilseyar Alimova
Ilseyar Alimova
A. Kapitanov
Alena Fenogenova
Alena Fenogenova
186
1
0
19 Nov 2025
A Comprehensive Study on Visual Token Redundancy for Discrete Diffusion-based Multimodal Large Language Models
Duo Li
Zuhao Yang
Xiaoqin Zhang
Ling Shao
Shijian Lu
VLM
113
1
0
19 Nov 2025
First Frame Is the Place to Go for Video Content Customization
Jingxi Chen
Z. Li
Zhichao Liu
Guangyao Shi
Xiyang Wu
Fuxiao Liu
Cornelia Fermüller
Brandon Yushan Feng
Yiannis Aloimonos
DiffM
VGen
141
0
0
19 Nov 2025
When to Think and When to Look: Uncertainty-Guided Lookback
Jing Bi
Filippos Bellos
Junjia Guo
Yayuan Li
Chao Huang
...
Tang
Luchuan Song
Susan Liang
Zhongfei
Zhang
LRM
197
0
0
19 Nov 2025
FlexiCup: Wireless Multimodal Suction Cup with Dual-Zone Vision-Tactile Sensing
Junhao Gong
Shoujie Li
Kit-Wa Sou
Changqing Guo
Hourong Huang
...
Yifan Xie
Chenxin Liang
Chuqiao Lyu
Xiaojun Liang
Wenbo Ding
100
1
0
18 Nov 2025
Orion: A Unified Visual Agent for Multimodal Perception, Advanced Visual Reasoning and Execution
N Dinesh Reddy
Dylan Snyder
Lona Kiragu
Mirajul Mohin
Shahrear Bin Amin
Sudeep Pillai
28
0
0
18 Nov 2025
CreBench: Human-Aligned Creativity Evaluation from Idea to Process to Product
Kaiwen Xue
Chenglong Li
Zhonghong Ou
Guoxin Zhang
Kaoyan Lu
...
Xinyu Liu
Qunlin Chen
Weiwei Qin
Yiran Shen
Jiayi Cen
68
0
0
17 Nov 2025
Explore How to Inject Beneficial Noise in MLLMs
Ruishu Zhu
Sida Huang
Ziheng Jiao
Hongyuan Zhang
80
2
0
17 Nov 2025
Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data
Yunxin Li
Xinyu Chen
Shenyuan Jiang
Haoyuan Shi
Zhenyu Liu
...
Zhenran Xu
Yicheng Ma
Meishan Zhang
Baotian Hu
Min Zhang
MLLM
MoE
OSLM
VLM
451
0
0
16 Nov 2025
RedVTP: Training-Free Acceleration of Diffusion Vision-Language Models Inference via Masked Token-Guided Visual Token Pruning
Jingqi Xu
Jingxi Lu
Chenghao Li
Sreetama Sarkar
Souvik Kundu
Peter A. Beerel
VLM
148
0
0
16 Nov 2025
BridgeEQA: Virtual Embodied Agents for Real Bridge Inspections
Subin Varghese
Joshua Gao
Asad Ur Rahman
Vedhus Hoskere
100
0
0
16 Nov 2025
D
3
^{3}
3
ToM: Decider-Guided Dynamic Token Merging for Accelerating Diffusion MLLMs
Shuochen Chang
Xiaofeng Zhang
Qingyang Liu
Li Niu
56
0
0
15 Nov 2025
TopoPerception: A Shortcut-Free Evaluation of Global Visual Perception in Large Vision-Language Models
Wenhao Zhou
Hao Zheng
R. Zhao
MLLM
VLM
LRM
132
0
0
14 Nov 2025
MACEval: A Multi-Agent Continual Evaluation Network for Large Models
Z. Chen
Yuze Sun
Yuan Tian
Wenjun Zhang
Guangtao Zhai
ALM
ELM
116
0
0
12 Nov 2025
Learning with Preserving for Continual Multitask Learning
H. Wang
Siwoo Bae
Zirong Chen
Meiyi Ma
CLL
140
0
0
11 Nov 2025
RPTS: Tree-Structured Reasoning Process Scoring for Faithful Multimodal Evaluation
Haofeng Wang
Yu Zhang
LRM
48
0
0
10 Nov 2025
Unveiling Modality Bias: Automated Sample-Specific Analysis for Multimodal Misinformation Benchmarks
Hehai Lin
Hui Liu
S. Cao
Jing Li
Haoliang Li
Wenya Wang
140
0
0
08 Nov 2025
Visual Spatial Tuning
Rui Yang
Ziyu Zhu
Yanwei Li
Jingjia Huang
Shen Yan
...
Xiangtai Li
S. Li
Wenqian Wang
Yi Lin
Hengshuang Zhao
VLM
285
4
0
07 Nov 2025
Cambrian-S: Towards Spatial Supersensing in Video
Shusheng Yang
J. Yang
Pinzhi Huang
Ellis L Brown
Zihao Yang
...
Daohan Lu
Rob Fergus
Yann LeCun
Li Fei-Fei
Saining Xie
76
8
0
06 Nov 2025
Benchmark Designers Should "Train on the Test Set" to Expose Exploitable Non-Visual Shortcuts
Ellis L Brown
Jihan Yang
Shusheng Yang
Rob Fergus
Saining Xie
VLM
190
4
0
06 Nov 2025
IndicVisionBench: Benchmarking Cultural and Multilingual Understanding in VLMs
Ali Faraz
Akash
Shaharukh Khan
Raja Kolla
Akshat Patidar
Suranjan Goswami
Abhinav Ravi
Chandra Khatri
Shubham Agarwal
VLM
112
0
0
06 Nov 2025
NVIDIA Nemotron Nano V2 VL
Nvidia
Amala Sanjay Deshmukh
Kateryna Chumachenko
Tuomas Rintamaki
Matthieu Le
...
Krzysztof Pawelec
Michael Evans
Katherine Luna
Jie Lou
Erick Galinkin
VLM
236
1
0
06 Nov 2025
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
Jingqi Tong
Yurong Mou
Hangcheng Li
Mingzhe Li
Y. Yang
...
Y. Zheng
Xinchi Chen
Jun Zhao
Xuanjing Huang
Xipeng Qiu
VGen
LRM
281
6
0
06 Nov 2025
MME-CC: A Challenging Multi-Modal Evaluation Benchmark of Cognitive Capacity
Kaiyuan Zhang
Chenghao Yang
Zhoufutu Wen
Sihang Yuan
Q. Wang
...
Ge Zhang
Yi Lin
Guang Shi
Chaoyou Fu
Wenhao Huang
LRM
164
0
0
05 Nov 2025
Contamination Detection for VLMs using Multi-Modal Semantic Perturbation
J. Park
Mu Cai
Feng Yao
Jingbo Shang
Soochahn Lee
Yong Jae Lee
AAML
VLM
56
0
0
05 Nov 2025
QG-CoC: Question-Guided Chain-of-Captions for Large Multimodal Models
Kuei-Chun Kao
Hsu Tzu-Yin
Yunqi Hong
Ruochen Wang
Cho-Jui Hsieh
LRM
56
0
0
05 Nov 2025
Can Visual Input Be Compressed? A Visual Token Compression Benchmark for Large Multimodal Models
Tianfan Peng
Yuntao Du
Pengzhou Ji
Shijie Dong
Kailin Jiang
...
Jinhe Bi
Qian Li
Wei Du
Feng Xiao
Lizhen Cui
VLM
172
0
0
04 Nov 2025
CoCoVa: Chain of Continuous Vision-Language Thought for Latent Space Reasoning
Jizheng Ma
Xiaofei Zhou
Yanlong Song
Han Yan
VLM
LRM
129
0
0
04 Nov 2025
Dynamic Reflections: Probing Video Representations with Text Alignment
Tyler Zhu
Tengda Han
Leonidas Guibas
Viorica Patraucean
M. Ovsjanikov
VGen
209
0
0
04 Nov 2025
SAIL-RL: Guiding MLLMs in When and How to Think via Dual-Reward RL Tuning
Fangxun Shu
Yongjie Ye
Yue Liao
Zijian Kang
Weijie Yin
Jiacong Wang
Xiao Liang
Shuicheng Yan
Chao Feng
OffRL
ReLM
LRM
189
1
0
04 Nov 2025
Dynamic Routing Between Experts: A Data-Efficient Approach to Continual Learning in Vision-Language Models
Jay Mohta
Kenan E. Ak
Dimitrios Dimitriadis
Yan Xu
Mingwei Shen
CLL
VLM
222
0
0
03 Nov 2025
ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation
Yongyuan Liang
Wei Chow
Feng Li
Ziqiao Ma
Xiyao Wang
Jiageng Mao
Jiuhai Chen
Jiatao Gu
Y. Wang
Furong Huang
LRM
160
0
0
03 Nov 2025
TIR-Bench: A Comprehensive Benchmark for Agentic Thinking-with-Images Reasoning
Ming Li
Jike Zhong
Shitian Zhao
H. Zhang
Shaoheng Lin
Yuxiang Lai
Chen Wei
Konstantinos Psounis
Kaipeng Zhang
EGVM
LRM
VLM
380
2
0
03 Nov 2025
Rethinking Facial Expression Recognition in the Era of Multimodal Large Language Models: Benchmark, Datasets, and Beyond
Fan Zhang
Haoxuan Li
Shengju Qian
Xin Wang
Zheng Lian
...
Yuan Gao
Qiankun Li
Yefeng Zheng
Zhouchen Lin
Pheng-Ann Heng
LRM
88
0
0
01 Nov 2025
Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning
Yuhong Liu
Beichen Zhang
Yuhang Zang
Yuhang Cao
Long Xing
Xiaoyi Dong
Haodong Duan
Dahua Lin
J. Wang
LRM
97
2
0
31 Oct 2025
MM-OPERA: Benchmarking Open-ended Association Reasoning for Large Vision-Language Models
Zimeng Huang
Jinxin Ke
Xiaoxuan Fan
Yufeng Yang
Yang Liu
...
Junteng Dai
Haoyi Jiang
Y. Zhou
Keze Wang
Z. Chen
LRM
VLM
231
0
0
30 Oct 2025
STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence
Zihan Liu
Zhikang Niu
Qiuyang Xiao
Zhisheng Zheng
Ruoqi Yuan
...
Jianze Liang
Xie Chen
Leilei Sun
Dahua Lin
Jiaqi Wang
AuLLM
LRM
355
2
0
28 Oct 2025
BLM
1
_1
1
: A Boundless Large Model for Cross-Space, Cross-Task, and Cross-Embodiment Learning
Wentao Tan
Bowen Wang
Heng Zhi
Chenyu Liu
Z. Li
...
Chen Xu
Zhibin Wang
Tianshi Wang
Lei Zhu
Heng Tao Shen
LM&Ro
79
0
0
28 Oct 2025
Revisiting Multimodal Positional Encoding in Vision-Language Models
Jie Huang
Xuejing Liu
Sibo Song
Ruibing Hou
Hong Chang
Junyang Lin
S. Bai
112
1
0
27 Oct 2025
LightFusion: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation
Zeyu Wang
Z. Chen
Chenhui Gou
Feng Li
Chaorui Deng
...
Kunchang Li
Weihao Yu
Haoqin Tu
Haoqi Fan
Cihang Xie
202
0
0
27 Oct 2025
1
2
3
4
...
12
13
14
Next