STORM-BORN: A Challenging Mathematical Derivations Dataset Curated via a Human-in-the-Loop Multi-Agent Framework. Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments. Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding? Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |