ResearchTrend.AI

Near-Negative Distinction: Giving a Second Life to Human Evaluation Datasets

13 May 2022
Philippe Laban
Chien-Sheng Wu
Wenhao Liu
Caiming Xiong
arXiv: 2205.06871

Papers citing "Near-Negative Distinction: Giving a Second Life to Human Evaluation Datasets"

7 / 7 papers shown
Title
The Jungle of Generative Drug Discovery: Traps, Treasures, and Ways Out
Rıza Özçelik
F. Grisoni
36
0
0
24 Dec 2024
LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond
Philippe Laban
Wojciech Kryściński
Divyansh Agarwal
Alexander R. Fabbri
Caiming Xiong
Shafiq R. Joty
Chien-Sheng Wu
ALM
HILM
24
30
0
23 May 2023
Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Lavinia Dunagan
Jacob Morrison
Alexander R. Fabbri
Yejin Choi
Noah A. Smith
49
39
0
08 Dec 2021
Training Dynamics for Text Summarization Models
Tanya Goyal
Jiacheng Xu
J. Li
Greg Durrett
57
28
0
15 Oct 2021
Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics
Artidoro Pagnoni
Vidhisha Balachandran
Yulia Tsvetkov
HILM
215
305
0
27 Apr 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin P. Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei-ping Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
238
284
0
02 Feb 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018