Title |
---|
![]() LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs Yujun Zhou Jingdong Yang Kehan Guo Pin-Yu Chen Tian Gao ...Tian Gao Werner Geyer Nuno Moniz Nitesh V Chawla Xiangliang Zhang |
![]() The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models Seungone Kim Juyoung Suk Ji Yong Cho Shayne Longpre Chaeeun Kim ...Sean Welleck Graham Neubig Moontae Lee Kyungjae Lee Minjoon Seo |