A New Fusion Strategy for Spoofing Aware Speaker Verification
Voice spoofing attacks can degrade the performance of automatic speaker verification (ASV) systems. Most existing work develops standalone spoofing countermeasure (CM) systems, and relatively little addresses integrated spoofing-aware speaker verification (SASV) systems. To encourage such integration, the organizers of the recent SASV challenge released official protocols and baselines. Building on these baselines, we assume that the ASV and CM subsystems are conditionally independent and propose a new fusion strategy, based on a probabilistic framework, for both inference and training. These strategies reduce the SASV equal error rate (EER) from 19.31% for the baseline to 1.58% on the official evaluation trials of the SASV challenge. We verify the effectiveness of each proposed component through ablation studies and provide further insight through score distribution analysis.
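As a rough illustration of what a conditional-independence assumption implies for score fusion, the sketch below multiplies an ASV probability by a CM probability to get a single SASV score. It is not taken from the paper: the function names, the linear mapping from cosine similarity to a probability, and the sigmoid applied to the CM output are all assumptions made for this example, and the abstract does not specify the actual calibration or training procedure.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here to turn a CM logit into a probability."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))


def cosine_to_prob(cos_score):
    """Hypothetical calibration: map cosine similarity in [-1, 1] to [0, 1]."""
    return (np.asarray(cos_score, dtype=float) + 1.0) / 2.0


def product_fusion(asv_cos, cm_logit):
    """Fuse ASV and CM outputs under an assumed conditional independence.

    If "same speaker" and "bona fide" are treated as conditionally
    independent given the trial, the probability that both hold is the
    product of the two subsystem probabilities.
    """
    p_target = cosine_to_prob(asv_cos)   # P(same speaker | trial), assumed mapping
    p_bonafide = sigmoid(cm_logit)       # P(bona fide | trial), assumed mapping
    return p_target * p_bonafide


# Three made-up trials: (ASV cosine score, CM logit).
target_score = product_fusion(0.8, 4.0)       # target speaker, bona fide speech
nontarget_score = product_fusion(-0.2, 4.0)   # different speaker, bona fide speech
spoof_score = product_fusion(0.8, -4.0)       # speaker-like but flagged as spoofed

print(target_score, nontarget_score, spoof_score)
```

With a product rule of this kind, a trial scores high only when both subsystems are confident, so either a speaker mismatch or a detected spoof is enough to pull the fused score down.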