
VHASR: A Multimodal Speech Recognition System With Vision Hotwords
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Papers citing "VHASR: A Multimodal Speech Recognition System With Vision Hotwords"
| Title |
|---|
| Locate-and-Focus: Enhancing Terminology Translation in Speech Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |

