
Title | Venue |
|---|---|
Can Vision Language Models Understand Mimed Actions? | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
TUNA: Comprehensive Fine-grained Temporal Understanding Evaluation on Dense Dynamic Videos | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
Video-Bench: Human-Aligned Video Generation Benchmark | Computer Vision and Pattern Recognition (CVPR), 2025 |
Urban Computing in the Era of Large Language Models | ACM Transactions on Intelligent Systems and Technology (TIST), 2025 |
CoSpace: Benchmarking Continuous Space Perception Ability for Vision-Language Models | Computer Vision and Pattern Recognition (CVPR), 2025 |