2002.00293
Cited By
Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension
2 February 2020
Max Bartolo
A. Roberts
Johannes Welbl
Sebastian Riedel
Pontus Stenetorp
AAML
Papers citing
"Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension"
44 / 44 papers shown
Title
Green AI: Exploring Carbon Footprints, Mitigation Strategies, and Trade Offs in Large Language Model Training
Vivian Liu
Yiqiao Yin
40
11
0
01 Apr 2024
Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation
Jessica Quaye
Alicia Parrish
Oana Inel
Charvi Rastogi
Hannah Rose Kirk
...
Nathan Clement
Rafael Mosquera
Juan Ciro
Vijay Janapa Reddi
Lora Aroyo
31
7
0
14 Feb 2024
How the Advent of Ubiquitous Large Language Models both Stymie and Turbocharge Dynamic Adversarial Question Generation
Yoo Yeon Sung
Ishani Mondal
Jordan L. Boyd-Graber
28
0
0
20 Jan 2024
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Ruida Wang
Wangchunshu Zhou
Mrinmaya Sachan
24
32
0
20 Oct 2023
Mind the instructions: a holistic evaluation of consistency and interactions in prompt-based learning
Lucas Weber
Elia Bruni
Dieuwke Hupkes
30
24
0
20 Oct 2023
Teaching Smaller Language Models To Generalise To Unseen Compositional Questions
Tim Hartill
N. Tan
Michael Witbrock
Patricia J. Riddle
ReLM
KELM
LRM
27
2
0
02 Aug 2023
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations
Lifan Yuan
Yangyi Chen
Ganqu Cui
Hongcheng Gao
Fangyuan Zou
Xingyi Cheng
Heng Ji
Zhiyuan Liu
Maosong Sun
34
73
0
07 Jun 2023
Entailment as Robust Self-Learner
Jiaxin Ge
Hongyin Luo
Yoon Kim
James R. Glass
39
3
0
26 May 2023
Out-of-Distribution Generalization in Text Classification: Past, Present, and Future
Linyi Yang
Y. Song
Xuan Ren
Chenyang Lyu
Yidong Wang
Lingqiao Liu
Jindong Wang
Jennifer Foster
Yue Zhang
OOD
32
2
0
23 May 2023
Can NLP Models Correctly Reason Over Contexts that Break the Common Assumptions?
Neeraj Varshney
Mihir Parmar
Nisarg Patel
Divij Handa
Sayantan Sarkar
Man Luo
Chitta Baral
LRM
28
4
0
20 May 2023
A Matter of Annotation: An Empirical Study on In Situ and Self-Recall Activity Annotations from Wearable Sensors
Alexander Hoelzemann
Kristof Van Laerhoven
11
6
0
15 May 2023
Think Twice: Measuring the Efficiency of Eliminating Prediction Shortcuts of Question Answering Models
Lukáš Mikula
Michal Štefánik
Marek Petrovič
Petr Sojka
33
3
0
11 May 2023
Assessing Language Model Deployment with Risk Cards
Leon Derczynski
Hannah Rose Kirk
Vidhisha Balachandran
Sachin Kumar
Yulia Tsvetkov
M. Leiser
Saif Mohammad
22
42
0
31 Mar 2023
Revealing Weaknesses of Vietnamese Language Models Through Unanswerable Questions in Machine Reading Comprehension
Son Quoc Tran
Phong Nguyen-Thuan Do
Kiet Van Nguyen
N. Nguyen
39
0
0
16 Mar 2023
Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning?
Chengwei Qin
Q. Li
Ruochen Zhao
Shafiq R. Joty
VLM
LRM
23
15
0
16 Feb 2023
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Percy Liang
LM&MA
ALM
56
98
0
19 Dec 2022
Discovering Language Model Behaviors with Model-Written Evaluations
Ethan Perez
Sam Ringer
Kamilė Lukošiūtė
Karina Nguyen
Edwin Chen
...
Danny Hernandez
Deep Ganguli
Evan Hubinger
Nicholas Schiefer
Jared Kaplan
ALM
20
358
0
19 Dec 2022
RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question
Alireza Mohammadshahi
Thomas Scialom
Majid Yazdani
Pouya Yanki
Angela Fan
James Henderson
Marzieh Saeidi
26
20
0
02 Nov 2022
IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension
Rifki Afina Putri
Alice H. Oh
28
9
0
25 Oct 2022
Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence
Hung-Ting Chen
Michael J.Q. Zhang
Eunsol Choi
RALM
HILM
36
92
0
25 Oct 2022
CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation
Tanay Dixit
Bhargavi Paranjape
Hannaneh Hajishirzi
Luke Zettlemoyer
SyDa
140
23
0
10 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
114
93
0
06 Oct 2022
Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt
Seonghyeon Ye
Joel Jang
Doyoung Kim
Yongrae Jo
Minjoon Seo
VLM
29
2
0
06 Oct 2022
Possible Stories: Evaluating Situated Commonsense Reasoning under Multiple Possible Scenarios
Mana Ashida
Saku Sugawara
59
6
0
16 Sep 2022
Collecting high-quality adversarial data for machine reading comprehension tasks with humans and models in the loop
Damian Y. Romero Diaz
M. Aniol
John M. Culnan
27
0
0
28 Jun 2022
Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts
Qinyuan Ye
Juan Zha
Xiang Ren
MoE
18
12
0
25 May 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang
Swaroop Mishra
Pegah Alipoormolabashi
Yeganeh Kordi
Amirreza Mirzaei
...
Chitta Baral
Yejin Choi
Noah A. Smith
Hannaneh Hajishirzi
Daniel Khashabi
ELM
29
781
0
16 Apr 2022
What Makes Reading Comprehension Questions Difficult?
Saku Sugawara
Nikita Nangia
Alex Warstadt
Sam Bowman
ELM
RALM
20
13
0
12 Mar 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation
Jiacheng Ye
Jiahui Gao
Qintong Li
Hang Xu
Jiangtao Feng
Zhiyong Wu
Tao Yu
Lingpeng Kong
SyDa
43
211
0
16 Feb 2022
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
Alisa Liu
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
41
211
0
16 Jan 2022
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification
Alon Talmor
Ori Yoran
Ronan Le Bras
Chandrasekhar Bhagavatula
Yoav Goldberg
Yejin Choi
Jonathan Berant
ELM
19
140
0
14 Jan 2022
Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants
Max Bartolo
Tristan Thrush
Sebastian Riedel
Pontus Stenetorp
Robin Jia
Douwe Kiela
21
33
0
16 Dec 2021
QuALITY: Question Answering with Long Input Texts, Yes!
Richard Yuanzhe Pang
Alicia Parrish
Nitish Joshi
Nikita Nangia
Jason Phang
...
Vishakh Padmakumar
Johnny Ma
Jana Thompson
He He
Sam Bowman
RALM
25
141
0
16 Dec 2021
Measure and Improve Robustness in NLP Models: A Survey
Xuezhi Wang
Haohan Wang
Diyi Yang
139
130
0
15 Dec 2021
Adversarially Constructed Evaluation Sets Are More Challenging, but May Not Be Fair
Jason Phang
Angelica Chen
William Huang
Samuel R. Bowman
AAML
28
13
0
16 Nov 2021
Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models
Boxin Wang
Chejian Xu
Shuohang Wang
Zhe Gan
Yu Cheng
Jianfeng Gao
Ahmed Hassan Awadallah
B. Li
VLM
ELM
AAML
22
214
0
04 Nov 2021
Retrieval-guided Counterfactual Generation for QA
Bhargavi Paranjape
Matthew Lamm
Ian Tenney
25
31
0
14 Oct 2021
Distantly-Supervised Evidence Retrieval Enables Question Answering without Evidence Annotation
Chen Zhao
Chenyan Xiong
Jordan L. Boyd-Graber
Hal Daumé
RALM
21
8
0
10 Oct 2021
On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study
Divyansh Kaushik
Douwe Kiela
Zachary Chase Lipton
Wen-tau Yih
AAML
11
36
0
02 Jun 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
211
179
0
18 Apr 2021
Cooperative Self-training of Machine Reading Comprehension
Hongyin Luo
Shang-Wen Li
Ming Gao
Seunghak Yu
James R. Glass
SyDa
RALM
15
11
0
12 Mar 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
250
673
0
06 Jan 2021
DynaSent: A Dynamic Benchmark for Sentiment Analysis
Christopher Potts
Zhengxuan Wu
Atticus Geiger
Douwe Kiela
230
77
0
30 Dec 2020
Elastic weight consolidation for better bias inoculation
James Thorne
Andreas Vlachos
17
11
0
29 Apr 2020