On the robustness of ChatGPT in teaching Korean Mathematics

17 February 2025

Abstract

ChatGPT, an Artificial Intelligence model, has the potential to revolutionize education. However, its effectiveness in solving non-English questions remains uncertain. This study evaluates ChatGPT's robustness using 586 Korean mathematics questions. ChatGPT achieves 66.72% accuracy, correctly answering 391 out of 586 questions. We also assess its ability to rate mathematics questions based on eleven criteria and perform a topic analysis. Our findings show that ChatGPT's ratings align with educational theory and test-taker perspectives. While ChatGPT performs well in question classification, it struggles with non-English contexts, highlighting areas for improvement. Future research should address linguistic biases and enhance accuracy across diverse languages. Domain-specific optimizations and multilingual training could improve ChatGPT's role in personalized education.
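The reported accuracy follows directly from the two counts in the abstract. A minimal sketch of that arithmetic, where the helper name `accuracy_percent` is hypothetical and not taken from the paper:

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return the share of correct answers as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total * 100


# Counts quoted in the abstract: 391 correct answers out of 586 questions.
accuracy = accuracy_percent(391, 586)
print(f"{accuracy:.2f}%")  # prints "66.72%"
```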


@article{nguyen2025_2502.11915,
  title={On the robustness of ChatGPT in teaching Korean Mathematics},
  author={Phuong-Nam Nguyen and Quang Nguyen-The and An Vu-Minh and Diep-Anh Nguyen and Xuan-Lam Pham},
  journal={arXiv preprint arXiv:2502.11915},
  year={2025}
}
