
- LangVision-LoRA-NAS: Neural Architecture Search for Variable LoRA Rank in Vision Language Models. International Conference on Information Photonics (ICIP), 2025
- Finding Needles in Images: Can Multimodal LLMs Locate Fine Details? Annual Meeting of the Association for Computational Linguistics (ACL), 2025
- MAGE: Multimodal Alignment and Generation Enhancement via Bridging Visual and Semantic Spaces. International Joint Conference on Artificial Intelligence (IJCAI), 2025
- Docopilot: Improving Multimodal Models for Document-Level Understanding. Computer Vision and Pattern Recognition (CVPR), 2025
- WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts. Annual Meeting of the Association for Computational Linguistics (ACL), 2025