Visually grounded cross-lingual keyword spotting in speech

13 June 2018

Papers citing "Visually grounded cross-lingual keyword spotting in speech"

Title
Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples | H. Ryu, Arda Senocak, In So Kweon, Joon Son Chung | VLM | 30 8 0 | 30 Mar 2023
Towards visually prompted keyword localisation for zero-resource spoken languages | Leanne Nortje, Herman Kamper | 29 6 0 | 12 Oct 2022
Self-Supervised Speech Representation Learning: A Review | Abdel-rahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob Drachmann Havtorn, Joakim Edin, ..., Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe | SSL, AI4TS | 137 354 0 | 21 May 2022
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling | Puyuan Peng, David Harwath | SSL | 43 26 0 | 07 Feb 2022
Keyword localisation in untranscribed speech using visually grounded speech models | Kayode Olaleye, Dan Oneaţă, Herman Kamper | 32 7 0 | 02 Feb 2022
Cascaded Multilingual Audio-Visual Learning from Videos | Andrew Rouditchenko, Angie Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, ..., Yikang Shen, Rogerio Feris, Brian Kingsbury, M. Picheny, James R. Glass | 137 8 0 | 08 Nov 2021
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos | Andrew Rouditchenko, Angie Boggust, David Harwath, Brian Chen, D. Joshi, ..., Rogerio Feris, Brian Kingsbury, M. Picheny, Antonio Torralba, James R. Glass | SSL | 22 141 0 | 16 Jun 2020
End-to-End Automatic Speech Translation of Audiobooks | Alexandre Berard, Laurent Besacier, A. Kocabiyikoglu, Olivier Pietquin | 83 190 0 | 12 Feb 2018