Improving Model Evaluation using SMART Filtering of Benchmark Datasets. North American Chapter of the Association for Computational Linguistics (NAACL), 2024 |
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models. North American Chapter of the Association for Computational Linguistics (NAACL), 2024. Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, ... Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo |
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 |
Understanding Cross-Lingual Alignment -- A Survey. Annual Meeting of the Association for Computational Linguistics (ACL), 2024 |
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning. Annual Meeting of the Association for Computational Linguistics (ACL), 2024 |
Cheetah: Natural Language Generation for 517 African Languages. Annual Meeting of the Association for Computational Linguistics (ACL), 2024 |
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages. International Joint Conference on Natural Language Processing (IJCNLP), 2023 |
Dolphin: A Challenging and Diverse Benchmark for Arabic NLG. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
Evaluation for Change. Annual Meeting of the Association for Computational Linguistics (ACL), 2022 |
NusaCrowd: Open Source Initiative for Indonesian NLP Resources. Annual Meeting of the Association for Computational Linguistics (ACL), 2022 |
CiteBench: A benchmark for Scientific Citation Text Generation. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 |
Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation. Annual Meeting of the Association for Computational Linguistics (ACL), 2022 |
A Major Obstacle for NLP Research: Let's Talk about Time Allocation! Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 |
Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora. IEEE Games Entertainment Media Conference (GEM), 2022 |
CLSE: Corpus of Linguistically Significant Entities. IEEE Games Entertainment Media Conference (GEM), 2022 |
Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 |
Petals: Collaborative Inference and Fine-tuning of Large Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2022 |
RealTime QA: What's the Answer Right Now? Neural Information Processing Systems (NeurIPS), 2022 |