A Framework for Deprecating Datasets: Standardizing Documentation, Identification, and Communication

18 October 2021

Papers citing "A Framework for Deprecating Datasets: Standardizing Documentation, Identification, and Communication"

Title
On the Readiness of Scientific Data for a Fair and Transparent Use in Machine Learning | Joan Giner-Miguelez, Abel Gómez, Jordi Cabot | 11 | 0 | 0 | 18 Jan 2024
Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements | Leandro von Werra, Lewis Tunstall, A. Thakur, A. Luccioni, Tristan Thrush, ..., Julien Chaumond, Margaret Mitchell, Alexander M. Rush, Thomas Wolf, Douwe Kiela [ELM] | 12 | 24 | 0 | 30 Sep 2022
Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers | Kenny Peng, Arunesh Mathur, Arvind Narayanan | 97 | 92 | 0 | 06 Aug 2021
Deduplicating Training Data Makes Language Models Better | Katherine Lee, Daphne Ippolito, A. Nystrom, Chiyuan Zhang, Douglas Eck, Chris Callison-Burch, Nicholas Carlini [SyDa] | 234 | 447 | 0 | 14 Jul 2021
Extracting Training Data from Large Language Models | Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, ..., Tom B. Brown, D. Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel [MLAU, SILM] | 264 | 1,798 | 0 | 14 Dec 2020
Machine Unlearning: Linear Filtration for Logit-based Classifiers | Thomas Baumhauer, Pascal Schöttle, Matthias Zeppelzauer [MU] | 99 | 109 | 0 | 07 Feb 2020