2511.04703
Measuring what Matters: Construct Validity in Large Language Model Benchmarks
3 November 2025
Andrew M. Bean
Ryan Kearns
Angelika Romanou
Franziska Sofia Hafner
Harry Mayne
Jan Batzner
Negar Foroutan
Chris Schmitz
Karolina Korgul
Hunar Batra
Oishi Deb
Emma Beharry
Cornelius Emde
Thomas Foster
Anna Gausen
María Grandury
Simeng Han
Valentin Hofmann
Lujain Ibrahim
Hazel Kim
Hannah Rose Kirk
Fangru Lin
Gabrielle Kaili-May Liu
Lennart Luettgau
Jabez Magomere
Jonathan Rystrøm
Anna Sotnikova
Yushi Yang
Yilun Zhao
Adel Bibi
Antoine Bosselut
Ronald Clark
Arman Cohan
Jakob N. Foerster
Y. Gal
Scott A. Hale
Inioluwa Deborah Raji
Christopher Summerfield
Philip Torr
Cozmin Ududec
Luc Rocher
Adam Mahdi
Papers citing "Measuring what Matters: Construct Validity in Large Language Model Benchmarks"
No papers found