We present a unified benchmark for mispronunciation detection in Modern Standard Arabic (MSA), using Qur'anic recitation as a case study. Our approach lays the groundwork for advancing Arabic pronunciation assessment by providing a comprehensive pipeline that spans data processing, the development of a specialized phoneme set tailored to the nuances of MSA pronunciation, and the creation of the first publicly available test set for this task, which we term the Qur'anic Mispronunciation Benchmark (QuranMB.v1). We also evaluate several baseline models to provide initial performance insights, highlighting both the promise and the challenges inherent in assessing MSA pronunciation. By establishing this standardized framework, we aim to foster further research and development in pronunciation assessment within Arabic language technology and related applications.
@article{kheir2025_2506.07722,
  title={Towards a Unified Benchmark for Arabic Pronunciation Assessment: Quranic Recitation as Case Study},
  author={Yassine El Kheir and Omnia Ibrahim and Amit Meghanani and Nada Almarwani and Hawau Olamide Toyin and Sadeen Alharbi and Modar Alfadly and Lamya Alkanhal and Ibrahim Selim and Shehab Elbatal and Salima Mdhaffar and Thomas Hain and Yasser Hifny and Mostafa Shahin and Ahmed Ali},
  journal={arXiv preprint arXiv:2506.07722},
  year={2025}
}