BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions

24 May 2019
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
arXiv: 1905.10044

Papers citing "BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions"

50 / 1,036 papers shown
Corpus-Level Evaluation for Event QA: The IndiaPoliceEvents Corpus Covering the 2002 Gujarat Violence
Andrew Halterman
Katherine A. Keith
Sheikh Muhammad Sarwar
Brendan T. O'Connor
14
27
0
27 May 2021
True Few-Shot Learning with Language Models
Ethan Perez
Douwe Kiela
Kyunghyun Cho
13
428
0
24 May 2021
KLUE: Korean Language Understanding Evaluation
Sungjoon Park
Jihyung Moon
Sungdong Kim
Won Ik Cho
Jiyoon Han
...
Seonghyun Kim
Lucy Park
Alice H. Oh
Jung-Woo Ha
Kyunghyun Cho
ELM
VLM
16
191
0
20 May 2021
Enhancing Transformers with Gradient Boosted Decision Trees for NLI Fine-Tuning
Benjamin Minixhofer
Milan Gritta
Ignacio Iacobacci
AI4CE
6
5
0
08 May 2021
Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality
Adithya V Ganesan
Matthew Matero
Aravind Reddy Ravula
Huy-Hien Vu
H. A. Schwartz
17
35
0
07 May 2021
Entailment as Few-Shot Learner
Sinong Wang
Han Fang
Madian Khabsa
Hanzi Mao
Hao Ma
30
183
0
29 Apr 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
209
179
0
18 Apr 2021
Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus
Jesse Dodge
Maarten Sap
Ana Marasović
William Agnew
Gabriel Ilharco
Dirk Groeneveld
Margaret Mitchell
Matt Gardner
AILaw
23
424
0
18 Apr 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,844
0
18 Apr 2021
Competency Problems: On Finding and Removing Artifacts in Language Data
Matt Gardner
William Merrill
Jesse Dodge
Matthew E. Peters
Alexis Ross
Sameer Singh
Noah A. Smith
161
107
0
17 Apr 2021
Surface Form Competition: Why the Highest Probability Answer Isn't Always Right
Ari Holtzman
Peter West
Vered Shwartz
Yejin Choi
Luke Zettlemoyer
LRM
20
230
0
16 Apr 2021
What to Pre-Train on? Efficient Intermediate Task Selection
Clifton A. Poth
Jonas Pfeiffer
Andreas Rucklé
Iryna Gurevych
6
94
0
16 Apr 2021
Multivalent Entailment Graphs for Question Answering
Nick McKenna
Liane Guillou
Mohammad Javad Hosseini
Sander Bijl de Vroe
Mark Johnson
Mark Steedman
NAI
16
14
0
16 Apr 2021
Sequence tagging for biomedical extractive question answering
Wonjin Yoon
Richard Jackson
Aron Lagerberg
Jaewoo Kang
MedIm
8
26
0
15 Apr 2021
Does Putting a Linguist in the Loop Improve NLU Data Collection?
Alicia Parrish
William Huang
Omar Agha
Soo-hwan Lee
Nikita Nangia
Alex Warstadt
Karmanya Aggarwal
Emily Allaway
Tal Linzen
Samuel R. Bowman
17
40
0
15 Apr 2021
TWEAC: Transformer with Extendable QA Agent Classifiers
Gregor Geigle
Nils Reimers
Andreas Rucklé
Iryna Gurevych
ViT
11
22
0
14 Apr 2021
Structural analysis of an all-purpose question answering model
Vincent Micheli
Quentin Heinrich
François Fleuret
Wacim Belblidia
10
3
0
13 Apr 2021
MultiModalQA: Complex Question Answering over Text, Tables and Images
Alon Talmor
Ori Yoran
Amnon Catav
Dan Lahav
Yizhong Wang
Akari Asai
Gabriel Ilharco
Hannaneh Hajishirzi
Jonathan Berant
LMTD
17
148
0
13 Apr 2021
SpartQA: A Textual Question Answering Benchmark for Spatial Reasoning
Roshanak Mirzaee
Hossein Rajaby Faghihi
Qiang Ning
Parisa Kordjamshidi
10
76
0
12 Apr 2021
Achieving Model Robustness through Discrete Adversarial Training
Maor Ivgi
Jonathan Berant
AAML
14
27
0
11 Apr 2021
Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections
Ruiqi Zhong
Kristy Lee
Zheng-Wei Zhang
Dan Klein
20
166
0
10 Apr 2021
Connecting Attributions and QA Model Behavior on Realistic Counterfactuals
Xi Ye
Rohan Nair
Greg Durrett
16
24
0
09 Apr 2021
AmbiFC: Fact-Checking Ambiguous Claims with Evidence
Max Glockner
Ieva Staliunaite
James Thorne
Gisela Vallejo
Andreas Vlachos
Iryna Gurevych
29
21
0
01 Apr 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark
Nicholas Lourie
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
LRM
22
137
0
24 Mar 2021
Improving and Simplifying Pattern Exploiting Training
Derek Tam
Rakesh R Menon
Mohit Bansal
Shashank Srivastava
Colin Raffel
13
149
0
22 Mar 2021
GPT Understands, Too
Xiao Liu
Yanan Zheng
Zhengxiao Du
Ming Ding
Yujie Qian
Zhilin Yang
Jie Tang
VLM
40
1,145
0
18 Mar 2021
How Many Data Points is a Prompt Worth?
Teven Le Scao
Alexander M. Rush
VLM
49
296
0
15 Mar 2021
DOCENT: Learning Self-Supervised Entity Representations from Large Document Collections
Yury Zemlyanskiy
Sudeep Gandhe
Ruining He
Bhargav Kanagal
Anirudh Ravula
Juraj Gottweis
Fei Sha
Ilya Eckstein
SSL
26
11
0
26 Feb 2021
Muppet: Massive Multi-task Representations with Pre-Finetuning
Armen Aghajanyan
Anchit Gupta
Akshat Shrivastava
Xilun Chen
Luke Zettlemoyer
Sonal Gupta
22
266
0
26 Jan 2021
English Machine Reading Comprehension Datasets: A Survey
Daria Dzendzik
Carl Vogel
Jennifer Foster
RALM
AIMat
25
49
0
25 Jan 2021
Unanswerable Questions about Images and Texts
E. Davis
37
12
0
25 Jan 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
245
672
0
06 Jan 2021
Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering
Fengbin Zhu
Wenqiang Lei
Chao Wang
Jianming Zheng
Soujanya Poria
Tat-Seng Chua
RALM
213
251
0
04 Jan 2021
FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation
Kushal Lakhotia
Bhargavi Paranjape
Asish Ghoshal
Wen-tau Yih
Yashar Mehdad
Srini Iyer
17
27
0
31 Dec 2020
Learning from Mistakes: Using Mis-predictions as Harm Alerts in Language Pre-Training
Chen Xing
Wenhao Liu
Caiming Xiong
17
0
0
16 Dec 2020
Reference Knowledgeable Network for Machine Reading Comprehension
Yilin Zhao
Zhuosheng Zhang
Hai Zhao
10
5
0
07 Dec 2020
Learning from Task Descriptions
Orion Weller
Nicholas Lourie
Matt Gardner
Matthew E. Peters
43
89
0
16 Nov 2020
When Do You Need Billions of Words of Pretraining Data?
Yian Zhang
Alex Warstadt
Haau-Sing Li
Samuel R. Bowman
21
136
0
10 Nov 2020
RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark
Tatiana Shavrina
Alena Fenogenova
Anton A. Emelyanov
Denis Shevelev
Ekaterina Artemova
Valentin Malykh
Vladislav Mikhailov
Maria Tikhonova
Andrey Chertok
Andrey Evlampiev
VLM
ELM
25
81
0
29 Oct 2020
Measuring Association Between Labels and Free-Text Rationales
Sarah Wiegreffe
Ana Marasović
Noah A. Smith
274
170
0
24 Oct 2020
Optimal Subarchitecture Extraction For BERT
Adrian de Wynter
Daniel J. Perry
MQ
43
18
0
20 Oct 2020
Evaluating and Characterizing Human Rationales
Samuel Carton
Anirudh Rathore
Chenhao Tan
8
48
0
09 Oct 2020
MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics
Anthony Chen
Gabriel Stanovsky
Sameer Singh
Matt Gardner
19
50
0
07 Oct 2020
"I'd rather just go to bed": Understanding Indirect Answers
Annie Louis
Dan Roth
Filip Radlinski
6
43
0
07 Oct 2020
A Review on Fact Extraction and Verification
Giannis Bekoulis
Christina Papagiannopoulou
Nikos Deligiannis
25
41
0
06 Oct 2020
DaNetQA: a yes/no Question Answering Dataset for the Russian Language
T. Glushkova
Alexey Machnev
Alena Fenogenova
Tatiana Shavrina
Ekaterina Artemova
D. Ignatov
11
10
0
06 Oct 2020
Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start
Wenpeng Yin
Nazneen Rajani
Dragomir R. Radev
R. Socher
Caiming Xiong
21
68
0
06 Oct 2020
Easy, Reproducible and Quality-Controlled Data Collection with Crowdaq
Qiang Ning
Hao Wu
Pradeep Dasigi
Dheeru Dua
Matt Gardner
Robert L Logan IV
Ana Marasović
Zhenjin Nie
28
16
0
06 Oct 2020
Which *BERT? A Survey Organizing Contextualized Encoders
Patrick Xia
Shijie Wu
Benjamin Van Durme
26
50
0
02 Oct 2020
Understanding tables with intermediate pre-training
Julian Martin Eisenschlos
Syrine Krichene
Thomas Müller
LMTD
13
119
0
01 Oct 2020