While closed-source Large Language Models (LLMs) demonstrate strong mathematical problem-solving abilities, open-source models still face challenges with such tasks. To bridge this gap, we propose a data augmentation approach and introduce PersonaMathQA, a dataset derived from MATH and GSM8K, on which we train the PersonaMath models. Our approach consists of two stages: the first stage focuses on learning from Persona Diversification, and the second stage emphasizes learning from Reflection. In the first stage, we regenerate detailed chain-of-thought (CoT) solutions as instructions using a closed-source LLM and introduce a persona-driven data augmentation technique. This technique innovatively classifies personas based on occupations, significantly enhancing the dataset's diversity and quality. In the second stage, we incorporate reflection to fully leverage more challenging and valuable questions. Evaluation of our PersonaMath models on MATH and GSM8K reveals that the PersonaMath-7B model (based on Qwen2.5-7B) achieves an accuracy of 61.2% on MATH and 87.8% on GSM8K, surpassing all baseline methods and achieving state-of-the-art performance. Notably, our dataset contains only 128.9K data points-merely 32.6% of MetaMathQA and 49.5% of MathInstruct-yet our model outperforms these baselines, demonstrating the high quality and diversity of our dataset, which enables more efficient model training. We open-source the PersonaMathQA dataset, PersonaMath models, and our code for public usage.
View on arXiv@article{luo2025_2410.01504, title={ PersonaMath: Boosting Mathematical Reasoning via Persona-Driven Data Augmentation }, author={ Jing Luo and Longze Chen and Run Luo and Liang Zhu and Chang Ao and Jiaming Li and Yukun Chen and Xin Cheng and Wen Yang and Jiayuan Su and Ahmadreza Argha and Hamid Alinejad-Rokny and Chengming Li and Shiwen Ni and Min Yang }, journal={arXiv preprint arXiv:2410.01504}, year={ 2025 } }