Putnam-like dataset summary: LLMs as mathematical competition contestants

29 September 2025

Bartosz Bieganowski

Main: 10 pages, 12 figures, 4 tables. Bibliography: 2 pages. Appendix: 1 page.

Abstract

This paper summarizes the results of the Putnam-like benchmark published by Google DeepMind. The dataset consists of 96 original problems in the spirit of the Putnam Competition, together with 576 solutions produced by LLMs. We analyse the models' performance on these problems to assess their ability to solve problems from mathematical contests.
