Pedagogy-R1: Pedagogically-Aligned Reasoning Model with Balanced Educational Benchmark

24 May 2025

Main: 10 pages, 5 figures, 4 tables; Bibliography: 5 pages

Abstract

Recent advances in large reasoning models (LRMs) show strong performance in structured domains such as mathematics and programming; however, they often lack pedagogical coherence and realistic teaching behaviors. To bridge this gap, we introduce Pedagogy-R1, a framework that adapts LRMs for classroom use through three innovations: (1) a distillation-based pipeline that filters and refines model outputs for instruction-tuning, (2) the Well-balanced Educational Benchmark (WBEB), which evaluates performance across subject knowledge, pedagogical knowledge, tracing, essay scoring, and teacher decision-making, and (3) a Chain-of-Pedagogy (CoP) prompting strategy for generating and eliciting teacher-style reasoning. Our mixed-method evaluation combines quantitative metrics with qualitative analysis, providing the first systematic assessment of LRMs' pedagogical strengths and limitations.

@article{lee2025_2505.18467,
  title={Pedagogy-R1: Pedagogically-Aligned Reasoning Model with Balanced Educational Benchmark},
  author={Unggi Lee and Jaeyong Lee and Jiyeong Bae and Yeil Jeong and Junbo Koh and Gyeonggeon Lee and Gunho Lee and Taekyung Ahn and Hyeoncheol Kim},
  journal={arXiv preprint arXiv:2505.18467},
  year={2025}
}
