Title |
---|
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception. Run Luo, Yunshui Li, Longze Chen, Wanwei He, Ting-En Lin, ..., Zikai Song, Xiaobo Xia, Tongliang Liu, Min Yang, Binyuan Hui |
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences. Xiyao Wang, Yuhang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, ..., Taixi Lu, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, Furong Huang |