Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2505.20426
Cited By
MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness
26 May 2025
Yunlong Tang
Pinxin Liu
Mingqian Feng
Zhangyun Tan
Rui Mao
Chao Huang
Jing Bi
Yunzhong Xiao
Susan Liang
Hang Hua
Ali Vosoughi
Luchuan Song
Zeliang Zhang
Chenliang Xu
LRM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness"
7 / 7 papers shown
Title
Intentional Gesture: Deliver Your Intentions with Gestures for Speech
Pinxin Liu
Haiyang Liu
Luchuan Song
Chenliang Xu
SLR
61
1
0
21 May 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Ziwei Liu
Shenglong Ye
...
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
Wei Wang
MLLM
VLM
219
132
1
14 Apr 2025
Towards Visual Text Grounding of Multimodal Large Language Model
Ming Li
Ruiyi Zhang
Jian Chen
Jiuxiang Gu
Yufan Zhou
Franck Dernoncourt
Wanrong Zhu
Dinesh Manocha
Tong Sun
100
3
0
07 Apr 2025
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
Yunlong Tang
Jing Bi
Chao Huang
Susan Liang
Daiki Shimada
...
Jinxi He
Liu He
Zeliang Zhang
Jiebo Luo
Chenliang Xu
98
1
0
07 Apr 2025
Qwen2.5-VL Technical Report
S. Bai
Keqin Chen
Xuejing Liu
Jialin Wang
Wenbin Ge
...
Zesen Cheng
Hang Zhang
Zhibo Yang
Haiyang Xu
Junyang Lin
VLM
422
699
0
20 Feb 2025
KinMo: Kinematic-aware Human Motion Understanding and Generation
Pengfei Zhang
Pinxin Liu
Hyeongwoo Kim
Pablo Garrido
Bindita Chaudhuri
180
2
0
23 Nov 2024
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples
Baiqi Li
Zhiqiu Lin
Wenxuan Peng
Jean de Dieu Nyandwi
Daniel Jiang
Zixian Ma
Simran Khanuja
Ranjay Krishna
Graham Neubig
Deva Ramanan
AAML
CoGe
VLM
212
31
0
18 Oct 2024
1