Multimodal Speech Recognition for Language-Guided Embodied Agents. Interspeech, 2023
AVATAR: Unconstrained Audiovisual Speech Recognition. Interspeech, 2022
Fine-Grained Grounding for Multimodal Speech Recognition. Findings, 2020
Looking Enhances Listening: Recovering Missing Speech Using Images. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020