arXiv: 2409.01790
Training on the Benchmark Is Not All You Need
3 September 2024
Shiwen Ni
Xiangtao Kong
Chengming Li
Xiping Hu
Ruifeng Xu
Jia Zhu
Min Yang
Papers citing "Training on the Benchmark Is Not All You Need"
4 / 4 papers shown
Title
AnesBench: Multi-Dimensional Evaluation of LLM Reasoning in Anesthesiology
Xiang Feng
Wentao Jiang
Zengmao Wang
Yong Luo
Pingbo Xu
Baosheng Yu
Hua Jin
Bo Du
Jing Zhang
ELM
LRM
38
0
0
03 Apr 2025
The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination
Yifan Sun
Han Wang
Dongbai Li
Gang Wang
Huan Zhang
AAML
48
0
0
20 Mar 2025
Unbiased Evaluation of Large Language Models from a Causal Perspective
Meilin Chen
Jian Tian
Liang Ma
Di Xie
Weijie Chen
Jiang Zhu
ALM
ELM
49
0
0
10 Feb 2025
L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?
Zecheng Tang
Keyan Zhou
Juntao Li
Baibei Ji
Jianye Hou
Min Zhang
33
1
0
03 Oct 2024