
Pronunciation Deviation Analysis Through Voice Cloning and Acoustic Comparison

Andrew Valdivia
Yueming Zhang
Hailu Xu
Amir Ghasemkhani
Xin Qin
Main: 8 pages
5 figures
Bibliography: 1 page
2 tables
Abstract

This paper presents a novel approach for detecting mispronunciations by analyzing deviations between a user's original speech and their voice-cloned counterpart with corrected pronunciation. We hypothesize that regions with maximal acoustic deviation between the original and cloned utterances indicate potential mispronunciations. Our method leverages recent advances in voice cloning to generate a synthetic version of the user's voice with proper pronunciation, then performs frame-by-frame comparisons to identify problematic segments. Experimental results demonstrate the effectiveness of this approach in pinpointing specific pronunciation errors without requiring predefined phonetic rules or extensive training data for each target language.
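
The frame-by-frame comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the acoustic features, the alignment method, or the decision rule, so everything here is an assumption: the two utterances are taken to be already converted to per-frame feature vectors and time-aligned to the same length, deviation is measured as Euclidean distance, and segments are flagged with a fixed threshold. The function names `frame_deviation` and `deviating_segments` are hypothetical and not taken from the paper.

```python
import numpy as np


def frame_deviation(original, cloned):
    """Per-frame Euclidean distance between two (frames, dims) feature arrays.

    Assumption: both arrays are already time-aligned and have the same shape.
    """
    original = np.asarray(original, dtype=float)
    cloned = np.asarray(cloned, dtype=float)
    if original.shape != cloned.shape:
        raise ValueError("feature arrays must be pre-aligned to the same shape")
    return np.linalg.norm(original - cloned, axis=1)


def deviating_segments(deviation, threshold):
    """Return (start, end) frame ranges, end exclusive, where deviation > threshold."""
    above = np.asarray(deviation) > threshold
    segments = []
    start = None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i  # a high-deviation run begins
        elif not flag and start is not None:
            segments.append((start, i))  # the run ended at the previous frame
            start = None
    if start is not None:
        segments.append((start, len(above)))  # run extends to the last frame
    return segments


# Synthetic example: 10 frames of 3-dimensional features, differing in frames 4-6.
original = np.zeros((10, 3))
cloned = np.zeros((10, 3))
cloned[4:7] = 1.0
print(deviating_segments(frame_deviation(original, cloned), threshold=1.0))
```

In this toy input the flagged range covers the frames where the two feature sequences differ. With real speech, the threshold and any smoothing of the deviation curve would need to be tuned, and the flagged frame ranges would be mapped back to time or to phoneme boundaries.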
