Large-scale unsupervised audio pre-training for video-to-speech
synthesis

Large-scale unsupervised audio pre-training for video-to-speech synthesis

27 June 2023

Triantafyllos Kefalas

Yannis Panagakis

Papers citing "Large-scale unsupervised audio pre-training for video-to-speech synthesis"

6 / 6 papers shown

Title
Audio-visual video-to-speech synthesis with synthesized input audio Triantafyllos Kefalas Yannis Panagakis M. Pantic VGen DiffM 20 1 0 31 Jul 2023
Visual Speech Recognition for Multiple Languages in the Wild Pingchuan Ma Stavros Petridis M. Pantic VLM 112 144 0 26 Feb 2022
End-to-end Audio-visual Speech Recognition with Conformers Pingchuan Ma Stavros Petridis M. Pantic 79 224 0 12 Feb 2021
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency Ruohan Gao Kristen Grauman CVBM 185 198 0 08 Jan 2021
VoxCeleb2: Deep Speaker Recognition Joon Son Chung Arsha Nagrani Andrew Zisserman 214 2,233 0 14 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis Ye Jia Yu Zhang Ron J. Weiss Quan Wang Jonathan Shen ... Z. Chen Patrick Nguyen Ruoming Pang Ignacio López Moreno Yonghui Wu 207 819 0 12 Jun 2018