KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Small Models Struggle to Learn from Strong Reasoners. Annual Meeting of the Association for Computational Linguistics (ACL), 2025