GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

22 June 2022
Sebastian Gehrmann
Abhik Bhattacharjee
Abinaya Mahendiran
Alex Wang
Alexandros Papangelis
Aman Madaan
Angelina McMillan-Major
Anna Shvets
Ashish Upadhyay
Bingsheng Yao
Bryan Wilie
Chandra Bhagavatula
Chaobin You
Craig Thomson
Cristina Garbacea
Dakuo Wang
Daniel Deutsch
Deyi Xiong
Di Jin
Dimitra Gkatzia
Dragomir R. Radev
Elizabeth Clark
Esin Durmus
Faisal Ladhak
Filip Ginter
Genta Indra Winata
Hendrik Strobelt
Hiroaki Hayashi
Jekaterina Novikova
Jenna Kanerva
Jenny Chim
Jiawei Zhou
Jordan Clive
Joshua Maynez
João Sedoc
Juraj Juraska
Kaustubh D. Dhole
Khyathi Raghavi Chandu
Laura Perez-Beltrachini
Leonardo F. R. Ribeiro
Lewis Tunstall
Li Zhang
Mahima Pushkarna
Mathias Creutz
Michael White
Mihir Kale
Moussa Kamal Eddine
Nico Daheim
Nishant Subramani
Ondrej Dusek
Paul Pu Liang
Pawan Sasanka Ammanamanchi
Qinqin Zhu
Ratish Puduppully
Reno Kriz
Rifat Shahriyar
Ronald Cardenas
Saad Mahamood
Salomey Osei
Samuel Cahyawijaya
S. Štajner
Sébastien Montella
Shailza Jolly
Simon Mille
Tahmid Hasan
Tianhao Shen
Tosin P. Adewumi
Vikas Raunak
Vipul Raheja
Vitaly Nikolaev
V. Tsai
Yacine Jernite
Yi Xu
Yisi Sang
Yixin Liu
Yufang Hou

Papers citing "GEMv2: Multilingual NLG Benchmarking in a Single Line of Code"

19 / 19 papers shown
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
Vipul Gupta
Candace Ross
David Pantoja
R. Passonneau
Megan Ung
Adina Williams
52
1
0
26 Oct 2024
Teaching LLMs to Abstain across Languages via Multilingual Feedback
Shangbin Feng
Weijia Shi
Yike Wang
Wenxuan Ding
Orevaoghene Ahia
Shuyue Stella Li
Vidhisha Balachandran
Sunayana Sitaram
Yulia Tsvetkov
65
4
0
22 Jun 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
Seungone Kim
Juyoung Suk
Ji Yong Cho
Shayne Longpre
Chaeeun Kim
...
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
ELM
ALM
LM&MA
97
29
0
09 Jun 2024
Dolphin: A Challenging and Diverse Benchmark for Arabic NLG
El Moatez Billah Nagoudi
AbdelRahim Elmadany
Ahmed Oumar El-Shangiti
Muhammad Abdul-Mageed
LM&MA
30
17
0
24 May 2023
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks
Anas Himmi
Ekhine Irurozki
Nathan Noiry
Stéphan Clémençon
Pierre Colombo
19
5
0
17 May 2023
Evaluation for Change
Rishi Bommasani
ELM
24
0
0
20 Dec 2022
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Percy Liang
LM&MA
ALM
46
99
0
19 Dec 2022
Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora
George Kour
Samuel Ackerman
Orna Raz
E. Farchi
Boaz Carmeli
Ateret Anaby-Tavor
28
10
0
29 Nov 2022
Operationalizing Specifications, In Addition to Test Sets for Evaluating Constrained Generative Models
Vikas Raunak
Matt Post
Arul Menezes
EGVM
27
0
0
19 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
68
2,301
0
09 Nov 2022
Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks
Vikas Raunak
Arul Menezes
30
13
0
24 Oct 2022
SQuALITY: Building a Long-Document Summarization Dataset the Hard Way
Alex Wang
Richard Yuanzhe Pang
Angelica Chen
Jason Phang
Samuel R. Bowman
72
44
0
23 May 2022
Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Lavinia Dunagan
Jacob Morrison
Alexander R. Fabbri
Yejin Choi
Noah A. Smith
49
39
0
08 Dec 2021
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole
Varun Gangal
Sebastian Gehrmann
Aadesh Gupta
Zhenhao Li
...
Tianbao Xie
Usama Yaseen
Michael A. Yee
Jing Zhang
Yue Zhang
169
86
0
06 Dec 2021
BiSECT: Learning to Split and Rephrase Sentences with Bitexts
Joongwon Kim
Mounica Maddela
Reno Kriz
Wei-ping Xu
Chris Callison-Burch
54
25
0
10 Sep 2021
Data-to-text Generation with Macro Planning
Ratish Puduppully
Mirella Lapata
53
73
0
04 Feb 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin P. Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei-ping Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
246
283
0
02 Feb 2021
BARThez: a Skilled Pretrained French Sequence-to-Sequence Model
Moussa Kamal Eddine
A. Tixier
Michalis Vazirgiannis
BDL
101
64
0
23 Oct 2020
RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich Semantic Annotations for Task-Oriented Dialogue Modeling
Jun Quan
Shian Zhang
Qian Cao
Zi-pu Li
Deyi Xiong
35
51
0
17 Oct 2020