2307.16584
Audio-visual video-to-speech synthesis with synthesized input audio
31 July 2023
Triantafyllos Kefalas
Yannis Panagakis
M. Pantic
VGen
DiffM
Papers citing
"Audio-visual video-to-speech synthesis with synthesized input audio"
7 / 7 papers shown
Title
DeepAudio-V1: Towards Multi-Modal Multi-Stage End-to-End Video to Speech and Audio Generation
Haomin Zhang
Chang Liu
Junjie Zheng
Zihao Chen
Chaofan Ding
Xinhan Di
DiffM
VGen
83
0
0
28 Mar 2025
Visual Speech Recognition for Multiple Languages in the Wild
Pingchuan Ma
Stavros Petridis
M. Pantic
VLM
112
95
0
26 Feb 2022
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
M. Pantic
79
221
0
12 Feb 2021
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
185
196
0
08 Jan 2021
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
214
2,224
0
14 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Z. Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
199
817
0
12 Jun 2018
Image-to-Image Translation with Conditional Adversarial Networks
Phillip Isola
Jun-Yan Zhu
Tinghui Zhou
Alexei A. Efros
SSeg
212
19,191
0
21 Nov 2016