arXiv:2505.19676

Large Language Models' Reasoning Stalls: An Investigation into the Capabilities of Frontier Models

26 May 2025
Lachlan McGinness
Peter Baumgartner
Length: main text 11 pages, bibliography 5 pages, appendix 6 pages; 8 figures, 3 tables
Abstract

We study empirical methods for examining the capability of Large Language Models (LLMs) to use Automated Theorem Prover (ATP) reasoning strategies. We evaluate the performance of state-of-the-art models from December 2023 and August 2024 on PRONTOQA steamroller reasoning problems. To this end, we develop methods for assessing LLM response accuracy and correct-answer correlation.
