Putnam-AXIOM: A Functional and Static Benchmark for Measuring Higher Level Mathematical Reasoning in LLMs
Versions: v1, v2 (latest)

5 August 2025
Aryan Gulati
Brando Miranda
Eric Chen
Emily Xia
Kai Fronsdal
Bruno Dumont
Elyas Obbad
Sanmi Koyejo
AIMat, ReLM, LRM
arXiv: 2508.08292

Papers citing "Putnam-AXIOM: A Functional and Static Benchmark for Measuring Higher Level Mathematical Reasoning in LLMs"

4 / 4 papers shown
Putnam-like dataset summary: LLMs as mathematical competition contestants
Bartosz Bieganowski
Daniel Strzelecki
Robert Skiba
Mateusz Topolewski
AIMat
38
0
0
29 Sep 2025
Throttling Web Agents Using Reasoning Gates
A. Kumar
Jaechul Roh
A. Naseh
Amir Houmansadr
Eugene Bagdasarian
LRM
56
0
0
01 Sep 2025
An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems
Yuren Hao
Xiang Wan
Chengxiang Zhai
LRM
28
2
0
12 Aug 2025
Hide and Seek with LLMs: An Adversarial Game for Sneaky Error Generation and Self-Improving Diagnosis
Rui Zou
Mengqi Wei
Yutao Zhu
J. Wen
Xin Zhao
Jing Chen
LRM
34
0
0
05 Aug 2025