
Title |
|---|
Object Detection with Multimodal Large Vision-Language Models: An In-depth Review. Information Fusion (Inf. Fusion), 2025 |
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos. Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
Directional Gradient Projection for Robust Fine-Tuning of Foundation Models. International Conference on Learning Representations (ICLR), 2025 |
Patent Figure Classification using Large Vision-language Models. European Conference on Information Retrieval (ECIR), 2025 |
VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge. Computer Vision and Pattern Recognition (CVPR), 2024. Vishwesh Nath, Wenqi Li, Dong Yang, Andriy Myronenko, Mingxin Zheng, ... Holger Roth, Daguang Xu, Baris Turkbey |