2306.03258
Cited By
LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading
5 June 2023
Yochai Yemini
Aviv Shamsian
Lior Bracha
Sharon Gannot
Ethan Fetaya
DiffM
Papers citing "LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading"
14 / 14 papers shown
Title
Towards Film-Making Production Dialogue, Narration, Monologue Adaptive Moving Dubbing Benchmarks
Chaoyi Wang
Junjie Zheng
Zihao Chen
Shiyu Xia
Chaofan Ding
Xiaohao Zhang
Xi Tao
Xiaoming He
Xinhan Di
AuLLM
114
0
0
30 Apr 2025
AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation
J. Choi
Ji-Hoon Kim
Kim Sung-Bin
Tae-Hyun Oh
Joon Son Chung
DiffM
49
0
0
29 Apr 2025
Supervising 3D Talking Head Avatars with Analysis-by-Audio-Synthesis
Radek Daněček
Carolin Schmitt
Senya Polikovsky
Michael J. Black
34
0
0
18 Apr 2025
From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech
Ji-Hoon Kim
Jeongsoo Choi
Jaehun Kim
Chaeyoung Jung
Joon Son Chung
CVBM
48
1
0
21 Mar 2025
Shushing! Let's Imagine an Authentic Speech from the Silent Video
Jiaxin Ye
Hongming Shan
DiffM
VGen
66
1
0
19 Mar 2025
NaturalL2S: End-to-End High-quality Multispeaker Lip-to-Speech Synthesis with Differential Digital Signal Processing
Yifan Liang
Fangkun Liu
Andong Li
Xiaodong Li
C. Zheng
47
1
0
17 Feb 2025
DiFiC: Your Diffusion Model Holds the Secret to Fine-Grained Clustering
Ruohong Yang
Peng Hu
Xi Peng
Xiting Liu
Yunfan Li
34
0
0
25 Dec 2024
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing
Gaoxiang Cong
Jiadong Pan
Liang-Sheng Li
Yuankai Qi
Yuxin Peng
A. Hengel
Jian Yang
Qingming Huang
90
6
0
12 Dec 2024
Diffusion-based Unsupervised Audio-visual Speech Enhancement
Jean-Eudes Ayilo
Mostafa Sadeghi
Romain Serizel
Xavier Alameda-Pineda
DiffM
20
0
0
04 Oct 2024
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
Deepanway Ghosal
Navonil Majumder
Ambuj Mehrish
Soujanya Poria
140
143
0
24 Apr 2023
Visual Speech Recognition for Multiple Languages in the Wild
Pingchuan Ma
Stavros Petridis
M. Pantic
VLM
112
144
0
26 Feb 2022
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
185
198
0
08 Jan 2021
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
220
239
0
25 Sep 2019
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
216
2,233
0
14 Jun 2018