CART: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling. Annual Meeting of the Association for Computational Linguistics (ACL), 2024 |
Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition. Computer Vision and Pattern Recognition (CVPR), 2024 |
Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives. Annual Meeting of the Association for Computational Linguistics (ACL), 2024 |