ResearchTrend.AI
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad

27 March 2025
Ivo Petrov
Jasper Dekoninck
Lyuben Baltadzhiev
Maria Drencheva
Kristian Minchev
Mislav Balunović
Nikola Jovanović
Martin Vechev
    LRM
    ELM
arXiv: 2503.21934

Papers citing "Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad"

6 / 6 papers shown
RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning
Liam Boyle
Nicolas Baumann
Paviththiren Sivasothilingam
Michele Magno
Luca Benini
LM&Ro
LRM
37
0
0
06 May 2025
Phi-4-reasoning Technical Report
Marah Abdin
Sahaj Agarwal
Ahmed Hassan Awadallah
Vidhisha Balachandran
Harkirat Singh Behl
...
Vaishnavi Shrivastava
Vibhav Vineet
Yue Wu
Safoora Yousefi
Guoqing Zheng
ReLM
LRM
77
0
0
30 Apr 2025
PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models
Shi Qiu
Shaoyang Guo
Zhuo-Yang Song
Y. Sun
Zeyu Cai
...
Changkun Shao
Qing-Hong Cao
Ming-xing Luo
Muhan Zhang
Hua Xing Zhu
AIMat
LRM
24
0
0
22 Apr 2025
AGI Is Coming... Right After AI Learns to Play Wordle
Sarath Shekkizhar
Romain Cosentino
LLMAG
35
0
0
21 Apr 2025
Has the Creativity of Large-Language Models peaked? An analysis of inter- and intra-LLM variability
Jennifer Haase
P. Hanel
Sebastian Pokutta
ALM
LRM
60
0
0
10 Apr 2025
A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility
Andreas Hochlehnert
Hardik Bhatnagar
Vishaal Udandarao
Samuel Albanie
Ameya Prabhu
Matthias Bethge
ReLM
ALM
LRM
66
4
0
09 Apr 2025