2009.10795
Cited By
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
22 September 2020
Swabha Swayamdipta
Roy Schwartz
Nicholas Lourie
Yizhong Wang
Hannaneh Hajishirzi
Noah A. Smith
Yejin Choi
Papers citing
"Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics"
36 / 86 papers shown
Title
TiDAL: Learning Training Dynamics for Active Learning
Seong Min Kye
Kwanghee Choi
Hyeongmin Byun
Buru Chang
21
13
0
13 Oct 2022
CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation
Tanay Dixit
Bhargavi Paranjape
Hannaneh Hajishirzi
Luke Zettlemoyer
SyDa
135
23
0
10 Oct 2022
PROD: Progressive Distillation for Dense Retrieval
Zhenghao Lin
Yeyun Gong
Xiao Liu
Hang Zhang
Chen Lin
...
Jian Jiao
Jing Lu
Daxin Jiang
Rangan Majumder
Nan Duan
17
27
0
27 Sep 2022
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Shoaib Ahmed Siddiqui
Nitarshan Rajkumar
Tegan Maharaj
David M. Krueger
Sara Hooker
30
27
0
20 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
28
109
0
31 Aug 2022
The Value of Out-of-Distribution Data
Ashwin De Silva
Rahul Ramesh
Carey E. Priebe
Pratik Chaudhari
Joshua T. Vogelstein
OODD
14
11
0
23 Aug 2022
Evaluating and Crafting Datasets Effective for Deep Learning With Data Maps
Jay Bishnu
Andrew Gondoputro
11
1
0
22 Aug 2022
An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs
Jiarui Zhang
Filip Ilievski
Kaixin Ma
Jonathan M Francis
A. Oltramari
SSL
16
5
0
21 May 2022
ALLSH: Active Learning Guided by Local Sensitivity and Hardness
Shujian Zhang
Chengyue Gong
Xingchao Liu
Pengcheng He
Weizhu Chen
Mingyuan Zhou
25
26
0
10 May 2022
A Data Cartography based MixUp for Pre-trained Language Models
Seohong Park
Cornelia Caragea
11
6
0
06 May 2022
Adapting and Evaluating Influence-Estimation Methods for Gradient-Boosted Decision Trees
Jonathan Brophy
Zayd Hammoudeh
Daniel Lowd
TDI
8
22
0
30 Apr 2022
On the Limitations of Dataset Balancing: The Lost Battle Against Spurious Correlations
Roy Schwartz
Gabriel Stanovsky
19
24
0
27 Apr 2022
Adaptor: Objective-Centric Adaptation Framework for Language Models
Michal Štefánik
Vít Novotný
Nikola Groverová
Petr Sojka
20
10
0
08 Mar 2022
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts
Weixin Liang
James Y. Zou
OOD
26
81
0
14 Feb 2022
FORML: Learning to Reweight Data for Fairness
Bobby Yan
Skyler Seto
N. Apostoloff
FaML
6
11
0
03 Feb 2022
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
Alisa Liu
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
30
212
0
16 Jan 2022
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification
Alon Talmor
Ori Yoran
Ronan Le Bras
Chandrasekhar Bhagavatula
Yoav Goldberg
Yejin Choi
Jonathan Berant
ELM
16
140
0
14 Jan 2022
On the Impact of Hard Adversarial Instances on Overfitting in Adversarial Training
Chen Liu
Zhichao Huang
Mathieu Salzmann
Tong Zhang
Sabine Süsstrunk
AAML
13
13
0
14 Dec 2021
Dataset Geography: Mapping Language Data to Language Users
Fahim Faisal
Yinkai Wang
Antonios Anastasopoulos
54
23
0
07 Dec 2021
Multi-View Active Learning for Short Text Classification in User-Generated Data
Payam Karisani
Negin Karisani
Li Xiong
VLM
11
4
0
05 Dec 2021
Understanding Out-of-distribution: A Perspective of Data Dynamics
Dyah Adila
Dongyeop Kang
18
12
0
29 Nov 2021
Clean or Annotate: How to Spend a Limited Data Collection Budget
Derek Chen
Zhou Yu
Samuel R. Bowman
27
13
0
15 Oct 2021
Online Multi-horizon Transaction Metric Estimation with Multi-modal Learning in Payment Networks
Chin-Chia Michael Yeh
Zhongfang Zhuang
Junpeng Wang
Yan Zheng
J. Ebrahimi
Ryan Mercer
Liang Wang
Wei Zhang
AI4TS
16
4
0
21 Sep 2021
Training Dynamic based data filtering may not work for NLP datasets
Arka Talukdar
Monika Dagar
Prachi Gupta
Varun G. Menon
NoLa
27
3
0
19 Sep 2021
The Grammar-Learning Trajectories of Neural Language Models
Leshem Choshen
Guy Hacohen
D. Weinshall
Omri Abend
12
28
0
13 Sep 2021
Assessing the Quality of the Datasets by Identifying Mislabeled Samples
Vaibhav Pulastya
Gaurav Nuti
Yash Kumar Atri
Tanmoy Chakraborty
NoLa
25
5
0
10 Sep 2021
Cartography Active Learning
Mike Zhang
Barbara Plank
19
37
0
09 Sep 2021
CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge
Yasumasa Onoe
Michael J.Q. Zhang
Eunsol Choi
Greg Durrett
HILM
15
85
0
03 Sep 2021
Contrastive Explanations for Model Interpretability
Alon Jacovi
Swabha Swayamdipta
Shauli Ravfogel
Yanai Elazar
Yejin Choi
Yoav Goldberg
22
94
0
02 Mar 2021
Latent Adversarial Debiasing: Mitigating Collider Bias in Deep Neural Networks
L. N. Darlow
Stanisław Jastrzębski
Amos Storkey
38
24
0
19 Nov 2020
ANLIzing the Adversarial Natural Language Inference Dataset
Adina Williams
Tristan Thrush
Douwe Kiela
AAML
166
45
0
24 Oct 2020
How Can We Accelerate Progress Towards Human-like Linguistic Generalization?
Tal Linzen
216
188
0
03 May 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,424
0
23 Jan 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
268
5,652
0
05 Dec 2016
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Y. Gal
Zoubin Ghahramani
UQCV
BDL
247
9,109
0
06 Jun 2015