People are poorly equipped to detect AI-powered voice clones

Scientific Reports (Sci Rep), 2024

28 January 2025

Main: 8 Pages

2 Figures

Bibliography: 1 Page

4 Tables

Appendix: 2 Pages

Abstract

As generative artificial intelligence (AI) continues its ballistic trajectory, everything from text to audio, image, and video generation continues to improve at mimicking human-generated content. Through a series of perceptual studies, we report on the realism of AI-generated voices in terms of identity matching and naturalness. We find that human participants cannot consistently identify recordings of AI-generated voices. Specifically, participants perceived the identity of an AI-generated voice to be the same as its real counterpart approximately 80% of the time, and correctly identified a voice as AI-generated only about 60% of the time.
