Title
Putnam-like dataset summary: LLMs as mathematical competition contestants. Bartosz Bieganowski, Daniel Strzelecki, Robert Skiba, Mateusz Topolewski. AIMat. 38 0 0. 29 Sep 2025.
Throttling Web Agents Using Reasoning Gates. A. Kumar, Jaechul Roh, A. Naseh, Amir Houmansadr, Eugene Bagdasarian. LRM. 56 0 0. 01 Sep 2025.
An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems. Yuren Hao, Xiang Wan, Chengxiang Zhai. LRM. 28 2 0. 12 Aug 2025.
Hide and Seek with LLMs: An Adversarial Game for Sneaky Error Generation and Self-Improving Diagnosis. Rui Zou, Mengqi Wei, Yutao Zhu, J. Wen, Xin Zhao, Jing Chen. LRM. 34 0 0. 05 Aug 2025.