Global Performance Disparities Between English-Language Accents in Automatic Speech Recognition

Past research has identified discriminatory automatic speech recognition (ASR) performance as a function of the racial group and nationality of the speaker. In this paper, we expand the discussion beyond bias as a function of the individual national origin of the speaker to look for bias as a function of the geopolitical orientation of their nation of origin. We audit some of the most popular English-language ASR services using a large, global data set of speech from The Speech Accent Archive, which includes over 2,700 speakers of English born in 171 different countries. We show that, even when controlling for multiple linguistic covariates, ASR service performance has a statistically significant relationship to the political alignment of the speaker's birth country with respect to the United States' geopolitical power. This holds for all ASR services tested. We discuss this bias in the context of the historical use of language to maintain global and political power.