Gemini 2.5 Pro Capable of Winning Gold at IMO 2025
Main: 16 pages · 1 figure · Bibliography: 1 page · Appendix: 51 pages
Abstract
The International Mathematical Olympiad (IMO) poses uniquely challenging problems that require deep insight, creativity, and formal reasoning. While Large Language Models (LLMs) perform well on mathematical benchmarks such as AIME, they struggle with Olympiad-level tasks. We evaluate Google's Gemini 2.5 Pro on the newly released IMO 2025 problems, which avoids data contamination. Using a self-verification pipeline with careful prompt design, the model correctly solves 5 of the 6 problems. This result underscores the importance of developing optimal strategies to harness the full potential of powerful LLMs for complex reasoning tasks.
