Title |
---|
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves? Han Bao, Yue Huang, Yanbo Wang, Jiayi Ye, Xiangqi Wang, Xiuying Chen, Mohamed Elhoseiny, Xiangliang Zhang |
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? Yi-Fan Zhang, Huanyu Zhang, Haochen Tian, Chaoyou Fu, Shuangqing Zhang, ... Qingsong Wen, Zhang Zhang, L. Wang, Rong Jin, Tieniu Tan |