Fluid Language Model Benchmarking

14 September 2025

Papers citing "Fluid Language Model Benchmarking"

Title
Cognitive Foundations for Reasoning and Their Manifestation in LLMs
Priyanka Kargupta, Shuyue Stella Li, Haocheng Wang, Jinu Lee, Shan Chen, ..., Thomas L. Griffiths, Max Kleiman-Weiner, Jiawei Han, Asli Celikyilmaz, Yulia Tsvetkov
LRM 190 2 0
20 Nov 2025

On the Measure of a Model: From Intelligence to Generality
Ruchira Dhar, Ninell Oldenburg, Anders Soegaard
ELM 125 0 0
14 Nov 2025

DISCO: Diversifying Sample Condensation for Efficient Model Evaluation
Alexander Rubinstein, Benjamin Raible, Martin Gubri, Seong Joon Oh
ELM 359 0 1
09 Oct 2025

RADAR: Reasoning-Ability and Difficulty-Aware Routing for Reasoning LLMs
Nigel Fernandez, Branislav Kveton, Ryan Rossi, Andrew Lan, Zichao Wang
LRM 202 0 0
29 Sep 2025

JE-IRT: A Geometric Lens on LLM Abilities through Joint Embedding Item Response Theory
Louie Hong Yao, Nicholas Jarvis, Tiffany Zhan, Saptarshi Ghosh, Linfeng Liu, Tianyu Jiang
92 0 0
26 Sep 2025