Finding Needles in Images: Can Multimodal LLMs Locate Fine Details? | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
PerSRV: Personalized Sticker Retrieval with Vision-Language Model | The Web Conference (WWW), 2024 |
Leveraging Distillation Techniques for Document Understanding: A Case Study with FLAN-T5 | Jahrestagung der Gesellschaft für Informatik (GI Jahrestagung), 2024 |
Handwritten and Printed Text Segmentation: A Signature Case Study | IEEE International Conference on Computer Vision (ICCV), 2023 |
Towards Complex Document Understanding By Discrete Reasoning | ACM Multimedia (ACM MM), 2022 |
DeepCPCFG: Deep Learning and Context Free Grammars for End-to-End Information Extraction | IEEE International Conference on Document Analysis and Recognition (ICDAR), 2021 |