2503.21934
Cited By
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
27 March 2025
Ivo Petrov
Jasper Dekoninck
Lyuben Baltadzhiev
Maria Drencheva
Kristian Minchev
Mislav Balunović
Nikola Jovanović
Martin Vechev
LRM
ELM
Papers citing
"Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad"
6 / 6 papers shown
Title
RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning
Liam Boyle
Nicolas Baumann
Paviththiren Sivasothilingam
Michele Magno
Luca Benini
LM&Ro
LRM
37
0
0
06 May 2025
Phi-4-reasoning Technical Report
Marah Abdin
Sahaj Agarwal
Ahmed Hassan Awadallah
Vidhisha Balachandran
Harkirat Singh Behl
...
Vaishnavi Shrivastava
Vibhav Vineet
Yue Wu
Safoora Yousefi
Guoqing Zheng
ReLM
LRM
77
0
0
30 Apr 2025
PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models
Shi Qiu
Shaoyang Guo
Zhuo-Yang Song
Y. Sun
Zeyu Cai
...
Changkun Shao
Qing-Hong Cao
Ming-xing Luo
Muhan Zhang
Hua Xing Zhu
AIMat
LRM
24
0
0
22 Apr 2025
AGI Is Coming... Right After AI Learns to Play Wordle
Sarath Shekkizhar
Romain Cosentino
LLMAG
35
0
0
21 Apr 2025
Has the Creativity of Large-Language Models peaked? An analysis of inter- and intra-LLM variability
Jennifer Haase
P. Hanel
Sebastian Pokutta
ALM
LRM
60
0
0
10 Apr 2025
A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility
Andreas Hochlehnert
Hardik Bhatnagar
Vishaal Udandarao
Samuel Albanie
Ameya Prabhu
Matthias Bethge
ReLM
ALM
LRM
66
4
0
09 Apr 2025