Title |
---|
CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring the (Lack of) Cultural Knowledge of LLMs. Yu Ying Chiu, Liwei Jiang, Bill Yuchen Lin, Chan Young Park, Shuyue Stella Li, ... Mehar Bhatia, Maria Antoniak, Yulia Tsvetkov, Vered Shwartz, Yejin Choi |
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models. Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, ... Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo |