2508.05469
Let's Measure Information Step-by-Step: LLM-Based Evaluation Beyond Vibes
7 August 2025
Zachary Robertson
Sanmi Koyejo
Papers citing "Let's Measure Information Step-by-Step: LLM-Based Evaluation Beyond Vibes"
2 / 2 papers shown
Title
CLINB: A Climate Intelligence Benchmark for Foundational Models
Michelle Chen Huebscher, Katharine Mach, Aleksandar Stanić, Markus Leippold, Ben Gaiarin, ..., Massimiliano Ciaramita, Joeri Rogelj, Christian Buck, Lierni Sestorain Saralegui, Reto Knutti
HILM
ELM
29 Oct 2025
Identity-Link IRT for Label-Free LLM Evaluation: Preserving Additivity in TVD-MI Scores
Zachary Robertson
16 Oct 2025