2107.00061
All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text
30 June 2021
Elizabeth Clark
Tal August
Sofia Serrano
Nikita Haduong
Suchin Gururangan
Noah A. Smith
DeLMO
Papers citing
"All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text"
50 / 220 papers shown
Language Model Behavior: A Comprehensive Survey
Tyler A. Chang
Benjamin Bergen
VLM
LRM
LM&MA
27
102
0
20 Mar 2023
NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models
Yiran Ye
Thai Le
Dongwon Lee
AAML
DeLMO
25
3
0
18 Mar 2023
Mapping the Design Space of Interactions in Human-AI Text Co-creation Tasks
Zijian Ding
Joel Chan
22
18
0
11 Mar 2023
Fluid Transformers and Creative Analogies: Exploring Large Language Models' Capacity for Augmenting Cross-Domain Analogical Creativity
Zijian Ding
Arvind Srinivasan
Stephen MacNeil
Joel Chan
26
35
0
27 Feb 2023
The Science of Detecting LLM-Generated Texts
Ruixiang Tang
Yu-Neng Chuang
Xia Hu
DeLMO
31
167
0
04 Feb 2023
Creating a Large Language Model of a Philosopher
Eric Schwitzgebel
David Schwitzgebel
A. Strasser
DeLMO
AI4CE
19
59
0
02 Feb 2023
LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization
Kalpesh Krishna
Erin Bransom
Bailey Kuehl
Mohit Iyyer
Pradeep Dasigi
Arman Cohan
Kyle Lo
14
89
0
30 Jan 2023
AI vs. Human -- Differentiation Analysis of Scientific Content Generation
Yongqiang Ma
Jiawei Liu
Fan Yi
Qikai Cheng
Yong Huang
Wei Lu
Xiaozhong Liu
DeLMO
4
56
0
24 Jan 2023
The Next Chapter: A Study of Large Language Models in Storytelling
Zhuohan Xie
Trevor Cohn
Jey Han Lau
28
42
0
24 Jan 2023
AI model GPT-3 (dis)informs us better than humans
Giovanni Spitale
Nikola Biller-Andorno
Federico Germani
DeLMO
11
147
0
23 Jan 2023
MAUVE Scores for Generative Models: Theory and Practice
Krishna Pillutla
Lang Liu
John Thickstun
Sean Welleck
Swabha Swayamdipta
Rowan Zellers
Sewoong Oh
Yejin Choi
Zaïd Harchaoui
EGVM
23
21
0
30 Dec 2022
Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Generated Text
Liam Dugan
Daphne Ippolito
Arun Kirubarajan
Sherry Shi
Chris Callison-Burch
DeLMO
27
62
0
24 Dec 2022
IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation metrics for Indian Languages
Ananya B. Sai
Vignesh Nagarajan
Tanay Dixit
Raj Dabre
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
39
21
0
20 Dec 2022
Foundation models in brief: A historical, socio-technical focus
Johannes Schneider
VLM
21
9
0
17 Dec 2022
Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation
Yixin Liu
Alexander R. Fabbri
Pengfei Liu
Yilun Zhao
Linyong Nan
...
Simeng Han
Shafiq R. Joty
Chien-Sheng Wu
Caiming Xiong
Dragomir R. Radev
ALM
10
132
0
15 Dec 2022
Evaluation of Synthetic Datasets for Conversational Recommender Systems
Harsh Lara
Manoj Kumar Tiwari
SyDa
18
6
0
12 Dec 2022
Economic Systems in Metaverse: Basics, State of the Art, and Challenges
Huawei Huang
Qinnan Zhang
Taotao Li
Qinglin Yang
Zhaokang Yin
Junhao Wu
Zehui Xiong
Jianming Zhu
Jiajing Wu
Zibin Zheng
AILaw
32
27
0
12 Dec 2022
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
Maarten Sap
Ronan Le Bras
Daniel Fried
Yejin Choi
19
205
0
24 Oct 2022
Mutual Information Alleviates Hallucinations in Abstractive Summarization
Liam van der Poel
Ryan Cotterell
Clara Meister
HILM
6
56
0
24 Oct 2022
On the Effectiveness of Automated Metrics for Text Generation Systems
Pius von Daniken
Jan Deriu
Don Tuggener
Mark Cieliebak
10
3
0
24 Oct 2022
Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective
Adaku Uchendu
Thai Le
Dongwon Lee
DeLMO
19
40
0
19 Oct 2022
Controllable Fake Document Infilling for Cyber Deception
Yibo Hu
Yu Lin
Eric Parolin
Latif Khan
Kevin W. Hamlen
16
8
0
18 Oct 2022
Taxonomy of Abstractive Dialogue Summarization: Scenarios, Approaches and Future Directions
Qi Jia
Yizhu Liu
Siyu Ren
Kenny Q. Zhu
24
6
0
18 Oct 2022
StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning
Hong Chen
D. Vo
Hiroya Takamura
Yusuke Miyao
Hideki Nakayama
17
20
0
16 Oct 2022
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
Evan Crothers
Nathalie Japkowicz
H. Viktor
DeLMO
25
107
0
13 Oct 2022
How Large Language Models are Transforming Machine-Paraphrased Plagiarism
Jan Philip Wahle
Terry Ruas
Frederic Kirstein
Bela Gipp
11
31
0
07 Oct 2022
News Summarization and Evaluation in the Era of GPT-3
Tanya Goyal
Junyi Jessy Li
Greg Durrett
ELM
26
385
0
26 Sep 2022
Fact-Saboteurs: A Taxonomy of Evidence Manipulation Attacks against Fact-Verification Systems
Sahar Abdelnabi
Mario Fritz
AAML
192
5
0
07 Sep 2022
SynSciPass: detecting appropriate uses of scientific text generation
Domenic Rosati
DeLMO
51
17
0
07 Sep 2022
Deception for Cyber Defence: Challenges and Opportunities
David Liebowitz
Surya Nepal
Kristen Moore
Cody James Christopher
S. Kanhere
David D. Nguyen
Roelien C. Timmer
Michael Longland
Keerth Rathakumar
29
10
0
15 Aug 2022
LaMPost: Design and Evaluation of an AI-assisted Email Writing Prototype for Adults with Dyslexia
Steven M. Goodman
Erin Buehler
Patrick Clary
Andy Coenen
Aaron Donsbach
...
Lei Shi
Rachel Sweeney
Phil Weaver
Ann Yuan
Meredith Ringel Morris
14
57
0
05 Jul 2022
Human heuristics for AI-generated language are flawed
Maurice Jakesch
Jeffrey T. Hancock
Mor Naaman
DeLMO
14
177
0
15 Jun 2022
Findings of the The RuATD Shared Task 2022 on Artificial Text Detection in Russian
T. Shamardina
Vladislav Mikhailov
Daniil Chernianskii
Alena Fenogenova
Marat Saidov
A. Valeeva
Tatiana Shavrina
I. Smurov
E. Tutubalina
Ekaterina Artemova
DeLMO
16
30
0
03 Jun 2022
Quark: Controllable Text Generation with Reinforced Unlearning
Ximing Lu
Sean Welleck
Jack Hessel
Liwei Jiang
Lianhui Qin
Peter West
Prithviraj Ammanabrolu
Yejin Choi
MU
47
206
0
26 May 2022
The Authenticity Gap in Human Evaluation
Kawin Ethayarajh
Dan Jurafsky
79
24
0
24 May 2022
Lack of Fluency is Hurting Your Translation Model
J. Yoo
Jaewoo Kang
13
0
0
24 May 2022
RankGen: Improving Text Generation with Large Ranking Models
Kalpesh Krishna
Yapei Chang
John Wieting
Mohit Iyyer
AIMat
16
68
0
19 May 2022
SNaC: Coherence Error Detection for Narrative Summarization
Tanya Goyal
Junyi Jessy Li
Greg Durrett
24
27
0
19 May 2022
Twist Decoding: Diverse Generators Guide Each Other
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Hao Peng
Ximing Lu
Dragomir R. Radev
Yejin Choi
Noah A. Smith
SyDa
19
4
0
19 May 2022
Deconstructing NLG Evaluation: Evaluation Practices, Assumptions, and Their Implications
Kaitlyn Zhou
Su Lin Blodgett
Adam Trischler
Hal Daumé
Kaheer Suleman
Alexandra Olteanu
ELM
94
26
0
13 May 2022
When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it
Sebastian Schuster
Tal Linzen
11
25
0
06 May 2022
An End-to-End Dialogue Summarization System for Sales Calls
Abedelkadir Asi
Song Wang
Roy Eisenstadt
Dean Geckt
Yarin Kuper
Yi Mao
Royi Ronen
20
16
0
27 Apr 2022
Event Transition Planning for Open-ended Text Generation
Qintong Li
Pijian Li
Wei Bi
Z. Ren
Yuxuan Lai
Lingpeng Kong
15
12
0
20 Apr 2022
A Call for Clarity in Beam Search: How It Works and When It Stops
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Dragomir R. Radev
Yejin Choi
Noah A. Smith
24
6
0
11 Apr 2022
On the probability-quality paradox in language generation
Clara Meister
Gian Wiher
Tiago Pimentel
Ryan Cotterell
28
14
0
31 Mar 2022
On Decoding Strategies for Neural Text Generators
Gian Wiher
Clara Meister
Ryan Cotterell
9
64
0
29 Mar 2022
Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges
Shikib Mehri
Jinho Choi
L. F. D’Haro
Jan Deriu
M. Eskénazi
...
David Traum
Yi-Ting Yeh
Zhou Yu
Yizhe Zhang
Chen Zhang
28
21
0
18 Mar 2022
ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection
Thomas Hartvigsen
Saadia Gabriel
Hamid Palangi
Maarten Sap
Dipankar Ray
Ece Kamar
11
344
0
17 Mar 2022
Do Language Models Plagiarize?
Jooyoung Lee
Thai Le
Jinghui Chen
Dongwon Lee
20
73
0
15 Mar 2022
Probing BERT's priors with serial reproduction chains
Takateru Yamakoshi
Thomas L. Griffiths
Robert D. Hawkins
18
12
0
24 Feb 2022