A Framework for Evaluation of Machine Reading Comprehension Gold Standards

10 March 2020

Viktor Schlegel

Marco Valentino

Riza Batista-Navarro

Papers citing "A Framework for Evaluation of Machine Reading Comprehension Gold Standards"

Title | Authors | Tags | Counts | Date
MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark | Shengkun Ma, Hao Peng, Lei Hou, Juanzi Li | ELM | 136 0 0 | 10 Mar 2025
Pay Attention to Real World Perturbations! Natural Robustness Evaluation in Machine Reading Comprehension | Yulong Wu, Viktor Schlegel, Riza Batista-Navarro | AAML | 76 0 0 | 23 Feb 2025
Seemingly Plausible Distractors in Multi-Hop Reasoning: Are Large Language Models Attentive Readers? | Neeladri Bhuiya, Viktor Schlegel, Stefan Winkler | LRM | 69 7 0 | 08 Sep 2024
Investigating a Benchmark for Training-set free Evaluation of Linguistic Capabilities in Machine Reading Comprehension | Viktor Schlegel, Goran Nenadic, Riza Batista-Navarro | ELM | 56 0 0 | 09 Aug 2024
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets | Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo | ALM | 134 108 0 | 20 Jul 2023
On Degrees of Freedom in Defining and Testing Natural Language Understanding | Saku Sugawara, S. Tsugita | ELM | 77 1 0 | 24 May 2023
It Takes Two to Tango: Navigating Conceptualizations of NLP Tasks and Measurements of Performance | Arjun Subramonian, Xingdi Yuan, Hal Daumé, Su Lin Blodgett | - | 93 18 0 | 15 May 2023
Can Transformers Reason in Fragments of Natural Language? | Viktor Schlegel, Kamen V. Pavlov, Ian Pratt-Hartmann | LRM, ReLM | 77 7 0 | 10 Nov 2022
Machine Reading, Fast and Slow: When Do Models "Understand" Language? | Sagnik Ray Choudhury, Anna Rogers, Isabelle Augenstein | LRM | 67 18 0 | 15 Sep 2022
A Survey on Measuring and Mitigating Reasoning Shortcuts in Machine Reading Comprehension | Xanh Ho, Johannes Mario Meissner, Saku Sugawara, Akiko Aizawa | OffRL | 92 4 0 | 05 Sep 2022
WLASL-LEX: a Dataset for Recognising Phonological Properties in American Sign Language | Federico Tavella, Viktor Schlegel, Marta Romeo, Aphrodite Galata, Angelo Cangelosi | - | 88 10 0 | 11 Mar 2022
Feeding What You Need by Understanding What You Learned | Xiaoqiang Wang, Bang Liu, Fangli Xu, Bowei Long, Siliang Tang, Lingfei Wu | - | 81 6 0 | 05 Mar 2022
QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension | Anna Rogers, Matt Gardner, Isabelle Augenstein | - | 135 168 0 | 27 Jul 2021
Comparing Test Sets with Item Response Theory | Clara Vania, Phu Mon Htut, William Huang, Dhara Mungra, Richard Yuanzhe Pang, Jason Phang, Haokun Liu, Kyunghyun Cho, Sam Bowman | - | 74 43 0 | 01 Jun 2021
Do Natural Language Explanations Represent Valid Logical Arguments? Verifying Entailment in Explainable NLI Gold Standards | Marco Valentino, Ian Pratt-Hartman, André Freitas | XAI, LRM | 80 11 0 | 05 May 2021
Semantics Altering Modifications for Evaluating Comprehension in Machine Reading | Viktor Schlegel, Goran Nenadic, Riza Batista-Navarro | - | 71 18 0 | 07 Dec 2020
A Survey on Explainability in Machine Reading Comprehension | Mokanarangan Thayaparan, Marco Valentino, André Freitas | FaML | 108 49 0 | 01 Oct 2020
Beyond Leaderboards: A survey of methods for revealing weaknesses in Natural Language Inference data and models | Viktor Schlegel, Goran Nenadic, Riza Batista-Navarro | ELM | 84 18 0 | 29 May 2020
Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond | Zhuosheng Zhang, Hai Zhao, Rui Wang | - | 115 63 0 | 13 May 2020