A Survey of Parameters Associated with the Quality of Benchmarks in NLP

14 October 2022

Papers citing "A Survey of Parameters Associated with the Quality of Benchmarks in NLP"

8 / 8 papers shown

Title
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering Pan Lu Swaroop Mishra Tony Xia Liang Qiu Kai-Wei Chang Song-Chun Zhu Oyvind Tafjord Peter Clark A. Kalyan ELM ReLM LRM 209 1,101 0 20 Sep 2022
Competency Problems: On Finding and Removing Artifacts in Language Data Matt Gardner William Merrill Jesse Dodge Matthew E. Peters Alexis Ross Sameer Singh Noah A. Smith 161 107 0 17 Apr 2021
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets Mor Geva Yoav Goldberg Jonathan Berant 235 319 0 21 Aug 2019
Language GANs Falling Short Massimo Caccia Lucas Page-Caccia W. Fedus Hugo Larochelle Joelle Pineau Laurent Charlin 117 214 0 06 Nov 2018
Hypothesis Only Baselines in Natural Language Inference Adam Poliak Jason Naradowsky Aparajita Haldar Rachel Rudinger Benjamin Van Durme 187 576 0 02 May 2018
Split and Rephrase: Better Evaluation and a Stronger Baseline Roee Aharoni Yoav Goldberg MoE 215 45 0 02 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Alex Jinpeng Wang Amanpreet Singh Julian Michael Felix Hill Omer Levy Samuel R. Bowman ELM 294 6,943 0 20 Apr 2018
Teaching Machines to Read and Comprehend Karl Moritz Hermann Tomás Kociský Edward Grefenstette L. Espeholt W. Kay Mustafa Suleyman Phil Blunsom 170 3,508 0 10 Jun 2015