GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
22 June 2022
Sebastian Gehrmann
Abhik Bhattacharjee
Abinaya Mahendiran
Alex Jinpeng Wang
Alexandros Papangelis
Aman Madaan
Angelina McMillan-Major
Anna Shvets
Ashish Upadhyay
Bingsheng Yao
Bryan Wilie
Chandra Bhagavatula
Chaobin You
Craig Thomson
Cristina Garbacea
Dakuo Wang
Daniel Deutsch
Deyi Xiong
Di Jin
Dimitra Gkatzia
Dragomir R. Radev
Elizabeth Clark
Esin Durmus
Faisal Ladhak
Filip Ginter
Genta Indra Winata
Hendrik Strobelt
Hiroaki Hayashi
Jekaterina Novikova
Jenna Kanerva
Jenny Chim
Jiawei Zhou
Jordan Clive
Joshua Maynez
João Sedoc
Juraj Juraska
Kaustubh D. Dhole
Khyathi Chandu
Laura Perez-Beltrachini
Leonardo F. R. Ribeiro
Lewis Tunstall
Li Zhang
Mahima Pushkarna
Mathias Creutz
Michael White
Mihir Kale
Moussa Kamal Eddine
Nico Daheim
Nishant Subramani
Ondrej Dusek
Paul Pu Liang
Pawan Sasanka Ammanamanchi
Qinqin Zhu
Ratish Puduppully
Reno Kriz
Rifat Shahriyar
Ronald Cardenas
Saad Mahamood
Salomey Osei
Samuel Cahyawijaya
S. Štajner
Sébastien Montella
Shailza Jolly
Simon Mille
Tahmid Hasan
Shangda Wu
Tosin Adewumi
Vikas Raunak
Vipul Raheja
Vitaly Nikolaev
V. Tsai
Yacine Jernite
Yi Xu
Yisi Sang
Yixin Liu
Yufang Hou

Papers citing "GEMv2: Multilingual NLG Benchmarking in a Single Line of Code"

35 / 35 papers shown
Survey of NLU Benchmarks Diagnosing Linguistic Phenomena: Why not Standardize Diagnostics Benchmarks?
Khloud Al Jallad
Nada Ghneim
Ghaida Rebdawi
LM&MA
ELM
288
0
0
27 Jul 2025
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Vipul Gupta
Candace Ross
David Pantoja
R. Passonneau
Megan Ung
Adina Williams
798
17
0
26 Oct 2024
LexSumm and LexT5: Benchmarking and Modeling Legal Summarization Tasks in English
T. Y. S. S. Santosh
Cornelius Weiss
Matthias Grabmair
AILaw
ELM
563
14
0
12 Oct 2024
Teaching LLMs to Abstain across Languages via Multilingual Feedback
Shangbin Feng
Weijia Shi
Yike Wang
Wenxuan Ding
Orevaoghene Ahia
Shuyue Stella Li
Vidhisha Balachandran
Sunayana Sitaram
Yulia Tsvetkov
668
13
0
22 Jun 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Seungone Kim
Juyoung Suk
Ji Yong Cho
Shayne Longpre
Chaeeun Kim
...
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
ELM
ALM
LM&MA
551
80
0
09 Jun 2024
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Seungone Kim
Juyoung Suk
Shayne Longpre
Bill Yuchen Lin
Jamin Shin
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
MoMe
ALM
ELM
470
391
0
02 May 2024
InspectorRAGet: An Introspection Platform for RAG Evaluation
Kshitij P. Fadnis
Siva Sankalp Patel
O. Boni
Yannis Katsis
Sara Rosenthal
Benjamin Sznajder
Marina Danilevsky
170
6
0
26 Apr 2024
Understanding Cross-Lingual Alignment -- A Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Katharina Hämmerl
Jindřich Libovický
Kangyang Luo
369
37
0
09 Apr 2024
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Shivalika Singh
Freddie Vargus
Daniel D'souza
Börje F. Karlsson
Abinaya Mahendiran
...
Max Bartolo
Julia Kreutzer
Ahmet Üstün
Marzieh Fadaee
Sara Hooker
437
195
0
09 Feb 2024
Cheetah: Natural Language Generation for 517 African Languages
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Ife Adebara
AbdelRahim Elmadany
Muhammad Abdul-Mageed
408
15
0
02 Jan 2024
Evaluating General-Purpose AI with Psychometrics
Xiting Wang
Liming Jiang
Jose Hernandez-Orallo
David Stillwell
Luning Sun
Fang Luo
Xing Xie
AI4MH
ELM
281
24
0
25 Oct 2023
Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation
M. Boubdir
Edward Kim
Beyza Ermis
Marzieh Fadaee
Sara Hooker
ALM
359
22
0
22 Oct 2023
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Samuel Cahyawijaya
Holy Lovenia
Fajri Koto
Dea Adhista
Emmanuel Dave
...
Genta Indra Winata
David Moeljadi
Alham Fikri Aji
Ayu Purwarianti
Pascale Fung
355
18
0
19 Sep 2023
Dolphin: A Challenging and Diverse Benchmark for Arabic NLG
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
El Moatez Billah Nagoudi
AbdelRahim Elmadany
Ahmed Oumar El-Shangiti
Muhammad Abdul-Mageed
LM&MA
377
29
0
24 May 2023
InstructAlign: High-and-Low Resource Language Alignment via Continual Crosslingual Instruction Tuning
Samuel Cahyawijaya
Holy Lovenia
Tiezheng Yu
Willy Chung
Pascale Fung
ALM
272
22
0
23 May 2023
Cross-Lingual Supervision improves Large Language Models Pre-training
Andrea Schioppa
Xavier Garcia
Orhan Firat
LRM
229
13
0
19 May 2023
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Anas Himmi
Ekhine Irurozki
Nathan Noiry
Nathan Huet
Pierre Colombo
396
13
0
17 May 2023
A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Nitay Calderon
Subhabrata Mukherjee
Roi Reichart
Amir Kantor
362
24
0
03 May 2023
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
Patrick Fernandes
Aman Madaan
Emmy Liu
António Farinhas
Pedro Henrique Martins
...
José G. C. de Souza
Shuyan Zhou
Tongshuang Wu
Graham Neubig
Marcely Zanon Boito
ALM
385
72
0
01 May 2023
Evaluation for Change
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Rishi Bommasani
ELM
282
0
0
20 Dec 2022
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Abigail Z. Jacobs
LM&MA
ALM
439
121
0
19 Dec 2022
NusaCrowd: Open Source Initiative for Indonesian NLP Resources
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Samuel Cahyawijaya
Holy Lovenia
Alham Fikri Aji
Genta Indra Winata
Bryan Wilie
...
Timothy Baldwin
Sebastian Ruder
Herry Sujaini
S. Sakti
Ayu Purwarianti
555
72
0
19 Dec 2022
CiteBench: A benchmark for Scientific Citation Text Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Martin Funkquist
Ilia Kuznetsov
Yufang Hou
Iryna Gurevych
322
26
0
19 Dec 2022
Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Yixin Liu
Alexander R. Fabbri
Pengfei Liu
Yilun Zhao
Linyong Nan
...
Simeng Han
Shafiq Joty
Chien-Sheng Wu
Caiming Xiong
Dragomir R. Radev
ALM
402
164
0
15 Dec 2022
A Major Obstacle for NLP Research: Let's Talk about Time Allocation!
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Katharina Kann
Shiran Dudy
Arya D. McCarthy
283
2
0
30 Nov 2022
Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora
IEEE Games Entertainment Media Conference (GEM), 2022
George Kour
Samuel Ackerman
Orna Raz
E. Farchi
Boaz Carmeli
Ateret Anaby-Tavor
215
14
0
29 Nov 2022
Operationalizing Specifications, In Addition to Test Sets for Evaluating Constrained Generative Models
Vikas Raunak
Matt Post
Arul Menezes
EGVM
288
1
0
19 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
1.0K
2,879
0
09 Nov 2022
CLSE: Corpus of Linguistically Significant Entities
IEEE Games Entertainment Media Conference (GEM), 2022
A. Chuklin
Justin Zhao
Mihir Kale
329
3
0
04 Nov 2022
Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Vikas Raunak
Arul Menezes
193
16
0
24 Oct 2022
BanglaParaphrase: A High-Quality Bangla Paraphrase Dataset
Ajwad Akil
Najrin Sultana
Abhik Bhattacharjee
Rifat Shahriyar
292
25
0
11 Oct 2022
Petals: Collaborative Inference and Fine-tuning of Large Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Alexander Borzunov
Dmitry Baranchuk
Tim Dettmers
Max Ryabinin
Younes Belkada
Artem Chumachenko
Pavel Samygin
Colin Raffel
VLM
277
110
0
02 Sep 2022
RealTime QA: What's the Answer Right Now?
Neural Information Processing Systems (NeurIPS), 2022
Jungo Kasai
Keisuke Sakaguchi
Yoichi Takahashi
Ronan Le Bras
Akari Asai
Xinyan Velocity Yu
Dragomir R. Radev
Noah A. Smith
Yejin Choi
Kentaro Inui
KELM
531
277
0
27 Jul 2022
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole
Varun Gangal
Sebastian Gehrmann
Aadesh Gupta
Zhenhao Li
...
Tianbao Xie
Usama Yaseen
Michael A. Yee
J. Zhang
Yue Zhang
491
99
0
06 Dec 2021
Control Prefixes for Parameter-Efficient Text Generation
Jordan Clive
Kris Cao
Marek Rei
330
36
0
15 Oct 2021