
Title |
|---|
VideoGEM: Training-free Action Grounding in Videos. Computer Vision and Pattern Recognition (CVPR), 2025 |
Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models. ACM Multimedia (ACM MM), 2023 |
DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning. AAAI Conference on Artificial Intelligence (AAAI), 2022 |