Title |
---|
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves? Han Bao, Yue Huang, Yanbo Wang, Jiayi Ye, Xiangqi Wang, Xiuying Chen, Mohamed Elhoseiny, Xiangliang Zhang |
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? Yi-Fan Zhang, Huanyu Zhang, Haochen Tian, Chaoyou Fu, Shuangqing Zhang, ..., Qingsong Wen, Zhang Zhang, L. Wang, Rong Jin, Tieniu Tan |
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI. Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, ..., Yuxiang Zheng, Shaoting Zhang, Dahua Lin, Yu Qiao, Pengfei Liu |
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation. Cheng Yang, Chufan Shi, Yaxin Liu, Bo Shui, Junjie Wang, ..., Yuxiang Zhang, Gongye Liu, Xiaomei Nie, Deng Cai, Yujiu Yang |
LVBench: An Extreme Long Video Understanding Benchmark. Weihan Wang, Zehai He, Wenyi Hong, Yean Cheng, Xiaohan Zhang, ..., Shiyu Huang, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang |