Discrete Optimal Transport and Voice Conversion
Main: 3 pages
7 figures
Bibliography: 1 page
1 table
Abstract
We propose kDOT, a discrete optimal transport (OT) framework for voice conversion (VC) operating in a pretrained speech embedding space. In contrast to the averaging strategies used in kNN-VC and SinkVC, and the independence assumption adopted in MKL, our method employs the barycentric projection of the discrete OT plan to construct a transport map between source and target speaker embedding distributions.
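To make the barycentric projection concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the kDOT implementation: the function names, the squared Euclidean cost, the uniform weights, and the toy 2-D "embeddings" are all hypothetical, and the brute-force solver only handles tiny clouds of equal size, where an optimal plan can be found among scaled permutation matrices. Given a transport plan P between source points x_i and target points y_j, the projection maps each x_i to the plan-weighted mean of the targets, T(x_i) = sum_j P_ij y_j / sum_j P_ij.

```python
from itertools import permutations


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors (assumed cost)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def exact_uniform_ot_plan(sources, targets):
    """Brute-force exact OT plan for two equal-sized clouds with uniform weights.

    Assumption for this sketch: with n points per side and weight 1/n each,
    an optimal plan exists that is a permutation matrix scaled by 1/n, so
    searching all permutations is exact (but only feasible for tiny n).
    """
    n = len(sources)
    if len(targets) != n:
        raise ValueError("this toy solver needs equally sized point clouds")
    cost = [[sq_dist(x, y) for y in targets] for x in sources]
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    plan = [[0.0] * n for _ in range(n)]
    for i, j in enumerate(best):
        plan[i][j] = 1.0 / n
    return plan


def barycentric_projection(plan, targets):
    """Map source i to the plan-weighted mean of the target points.

    Works for any coupling matrix whose rows have positive mass,
    not only for the permutation plans produced above.
    """
    dim = len(targets[0])
    mapped = []
    for row in plan:
        mass = sum(row)  # row marginal, i.e. the weight of source i
        mapped.append(
            [sum(w * y[d] for w, y in zip(row, targets)) / mass for d in range(dim)]
        )
    return mapped


# Toy 2-D stand-ins for source and target speaker embeddings (hypothetical data).
sources = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
targets = [[0.1, 1.2], [0.2, 0.1], [1.1, -0.1]]

plan = exact_uniform_ot_plan(sources, targets)
converted = barycentric_projection(plan, targets)

# A hand-written coupling that splits the mass of the first source point,
# to show that the projection averages targets when a row is not one-hot.
split_plan = [[0.25, 0.25], [0.0, 0.5]]
split_targets = [[0.0, 0.0], [2.0, 2.0]]
split_converted = barycentric_projection(split_plan, split_targets)
```

In the first example each source point is matched to exactly one target, so the projection returns that target. In the second, the first source sends half of its mass to each target and lands on their midpoint. For realistic embedding sets a dedicated OT solver would replace the permutation search; the projection step stays the same.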
