A Survey of Parameters Associated with the Quality of Benchmarks in NLP

14 October 2022
Swaroop Mishra
Anjana Arunkumar
Chris Bryan
Chitta Baral

Papers citing "A Survey of Parameters Associated with the Quality of Benchmarks in NLP"

8 / 8 papers shown
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
209
1,101
0
20 Sep 2022
Competency Problems: On Finding and Removing Artifacts in Language Data
Matt Gardner
William Merrill
Jesse Dodge
Matthew E. Peters
Alexis Ross
Sameer Singh
Noah A. Smith
161
107
0
17 Apr 2021
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets
Mor Geva
Yoav Goldberg
Jonathan Berant
235
319
0
21 Aug 2019
Language GANs Falling Short
Massimo Caccia
Lucas Page-Caccia
W. Fedus
Hugo Larochelle
Joelle Pineau
Laurent Charlin
117
214
0
06 Nov 2018
Hypothesis Only Baselines in Natural Language Inference
Adam Poliak
Jason Naradowsky
Aparajita Haldar
Rachel Rudinger
Benjamin Van Durme
187
576
0
02 May 2018
Split and Rephrase: Better Evaluation and a Stronger Baseline
Roee Aharoni
Yoav Goldberg
MoE
215
45
0
02 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
Teaching Machines to Read and Comprehend
Karl Moritz Hermann
Tomáš Kočiský
Edward Grefenstette
L. Espeholt
W. Kay
Mustafa Suleyman
Phil Blunsom
170
3,508
0
10 Jun 2015