The Human Evaluation Datasheet 1.0: A Template for Recording Details of Human Evaluation Experiments in NLP
Anastasia Shimorina, Anya Belz
17 March 2021 · arXiv:2103.09710
Papers citing "The Human Evaluation Datasheet 1.0: A Template for Recording Details of Human Evaluation Experiments in NLP" (7 papers shown)
Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP
Anya Belz, Craig Thomson, Ehud Reiter, Gavin Abercrombie, J. Alonso-Moral, ..., Antonio Toral, Xiao-Yi Wan, Leo Wanner, Lewis J. Watson, Diyi Yang
02 May 2023

Evaluating NLG systems: A brief introduction
Emiel van Miltenburg
29 Mar 2023

MAUVE Scores for Generative Models: Theory and Practice
Krishna Pillutla, Lang Liu, John Thickstun, Sean Welleck, Swabha Swayamdipta, Rowan Zellers, Sewoong Oh, Yejin Choi, Zaïd Harchaoui
30 Dec 2022

Quantified Reproducibility Assessment of NLP Results
Anya Belz, Maja Popović, Simon Mille
12 Apr 2022

The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann, Tosin P. Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, ..., Nishant Subramani, Wei-ping Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou
02 Feb 2021

MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
Krishna Pillutla, Swabha Swayamdipta, Rowan Zellers, John Thickstun, Sean Welleck, Yejin Choi, Zaïd Harchaoui
02 Feb 2021

With Little Power Comes Great Responsibility
Dallas Card, Peter Henderson, Urvashi Khandelwal, Robin Jia, Kyle Mahowald, Dan Jurafsky
13 Oct 2020