Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
Post Turing: Mapping the landscape of LLM Evaluation
IEEE Games Entertainment Media Conference (IEEE GEM), 2023