2405.00823
Cited By
WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting
1 May 2024
Olly Styles, Sam Miller, Patricio Cerda-Mardini, T. Guha, Victor Sanchez, Bertie Vidgen
LLMAG
Papers citing "WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting"
1 / 1 papers shown
Title
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao, Shibo Hong, X. Li, Jiahao Ying, Yubo Ma, ..., Juanzi Li, Aixin Sun, Xuanjing Huang, Tat-Seng Chua, Yu Jiang
ALM
ELM
84
0
0
26 Apr 2025