Object Detection with Multimodal Large Vision-Language Models: An In-depth ReviewInformation Fusion (Inf. Fusion), 2025 |
New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM CollaborationIEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025 |
LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language ModelsComputer Vision and Pattern Recognition (CVPR), 2025 |
FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression ComprehensionConference on Empirical Methods in Natural Language Processing (EMNLP), 2024 |
Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object
DetectionIEEE Transactions on Image Processing (IEEE TIP), 2023 |