Visually grounded few-shot word learning in low-resource settings. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2023.
Visually grounded few-shot word acquisition with fewer shots. Interspeech, 2023.
Towards visually prompted keyword localisation for zero-resource spoken languages. IEEE Spoken Language Technology Workshop (SLT), 2022.
YFACC: A Yorùbá speech-image dataset for cross-lingual keyword localisation through visual grounding. IEEE Spoken Language Technology Workshop (SLT), 2022.
Keyword localisation in untranscribed speech using visually grounded speech models. IEEE Journal of Selected Topics in Signal Processing (JSTSP), 2022.
Direct multimodal few-shot learning of speech and images. Interspeech, 2020.