2204.02090
VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices
Interspeech (Interspeech), 2022
5 April 2022
V. S. Kadandale
Juan F. Montesinos
G. Haro
Papers citing
"VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices"
11 / 11 papers shown
SyncLipMAE: Contrastive Masked Pretraining for Audio-Visual Talking-Face Representation
Zeyu Ling
Xiaodong Gu
Jiangnan Tang
Changqing Zou
CLIP
190
0
0
11 Oct 2025
Mask-Free Audio-driven Talking Face Generation for Enhanced Visual Quality and Identity Preservation
Dogucan Yaman
Fevziye Irem Eyiokur
Leonard Barmann
H. K. Ekenel
Alexander H. Waibel
CVBM
246
1
0
28 Jul 2025
SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Marco Comunità
R. F. Gramaccioni
Emilian Postolache
Emanuele Rodolà
Danilo Comminiello
Joshua D. Reiss
DiffM
291
31
0
23 Oct 2023
GestSync: Determining who is speaking without a talking head
British Machine Vision Conference (BMVC), 2023
Sindhu B. Hegde
Andrew Zisserman
210
2
0
08 Oct 2023
Speech inpainting: Context-based speech synthesis guided by video
Interspeech (Interspeech), 2023
Juan F. Montesinos
Daniel Michelsanti
G. Haro
Zheng-Hua Tan
Jesper Jensen
319
6
0
01 Jun 2023
Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models
Antoni Bigata Casademunt
Rodrigo Mira
Nikita Drobyshev
Konstantinos Vougioukas
Stavros Petridis
Maja Pantic
DiffM
288
2
0
15 May 2023
ModEFormer: Modality-Preserving Embedding for Audio-Video Synchronization using Transformers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Akash Gupta
Rohun Tripathi
Won-Kap Jang
283
9
0
21 Mar 2023
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors
IEEE International Conference on Computer Vision (ICCV), 2022
Zhentao Yu
Zixin Yin
Deyu Zhou
Duomin Wang
Finn Wong
Baoyuan Wang
DiffM
248
60
0
07 Dec 2022
Multimodal Transformer Distillation for Audio-Visual Synchronization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xuan-Bo Chen
Haibin Wu
Chung-Che Wang
Hung-yi Lee
J. Jang
200
8
0
27 Oct 2022
Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors
British Machine Vision Conference (BMVC), 2022
Vladimir E. Iashin
Weidi Xie
Esa Rahtu
Andrew Zisserman
182
34
0
13 Oct 2022
Deep Learning for Visual Speech Analysis: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Changchong Sheng
Gangyao Kuang
L. Bai
Chen Hou
Yike Guo
Xin Xu
M. Pietikäinen
Tianpeng Liu
VLM
373
56
0
22 May 2022