Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
STORM-BORN: A Challenging Mathematical Derivations Dataset Curated via a Human-in-the-Loop Multi-Agent Framework | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
Hierarchical Attention Generates Better Proofs | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
Rosetta-PL: Propositional Logic as a Benchmark for Large Language Model Reasoning | North American Chapter of the Association for Computational Linguistics (NAACL), 2025 |
Local Look-Ahead Guidance via Verifier-in-the-Loop for Automated Theorem Proving | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
From Hypothesis to Publication: A Comprehensive Survey of AI-Driven Research Support Systems | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025 |