Transfer Learning from Audio-Visual Grounding to Speech Recognition

9 July 2019

Papers citing "Transfer Learning from Audio-Visual Grounding to Speech Recognition"


Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System. Khazar Khorrami, María Andrea Cruz Blandón, Tuomas Virtanen, Okko Rasanen. [SSL] 05 Jun 2023.
Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations. Dan Oneaţă, H. Cucu. 27 Apr 2022.
Word Discovery in Visually Grounded, Self-Supervised Speech Models. Puyuan Peng, David Harwath. [SSL] 28 Mar 2022.
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling. Puyuan Peng, David Harwath. [SSL] 07 Feb 2022.
Multiresolution and Multimodal Speech Recognition with Transformers. Georgios Paraskevopoulos, Srinivas Parthasarathy, Aparna Khare, Shiva Sundaram. 29 Apr 2020.