MultiAiTutor: Child-Friendly Educational Multilingual Speech Generation Tutor with LLMs

12 August 2025

Xiaoxue Gao

Huayun Zhang

Nancy F. Chen

Main: 4 pages, 5 figures, 1 table; Bibliography: 1 page

Abstract

Generative speech models have shown significant potential for personalizing teacher-student interactions, with practical applications in children's language learning. However, high-quality, child-friendly speech generation remains challenging, particularly for low-resource languages spanning diverse linguistic and cultural contexts. In this paper, we propose MultiAiTutor, an educational multilingual generative AI tutor with child-friendly designs that leverages an LLM architecture for speech generation tailored to educational purposes. MultiAiTutor integrates age-appropriate multilingual speech generation to support young children's language learning through culturally relevant image-description tasks in three low-resource languages: Singaporean-accented Mandarin, Malay, and Tamil. Both objective metrics and subjective evaluations show that MultiAiTutor outperforms baseline methods.
