2112.01716
Cited By
Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research
3 December 2021
Bernard Koch
Emily L. Denton
A. Hanna
J. Foster
Papers citing
"Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research"
28 / 78 papers shown
Title
Turning the Tables: Biased, Imbalanced, Dynamic Tabular Datasets for ML Evaluation
Sérgio Jesus
José P. Pombal
Duarte M. Alves
André F. Cruz
Pedro Saleiro
Rita P. Ribeiro
João Gama
P. Bizarro
33
32
0
24 Nov 2022
Fighting FIRe with FIRE: Assessing the Validity of Text-to-Video Retrieval Benchmarks
Pedro Rodriguez
Mahmoud Azab
Becka Silvert
Renato Sanchez
Linzy Labson
Hardik Shah
Seungwhan Moon
33
1
0
10 Oct 2022
The Lifecycle of "Facts": A Survey of Social Bias in Knowledge Graphs
Angelie Kraft
Ricardo Usbeck
KELM
18
9
0
07 Oct 2022
HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions
Lingjiao Chen
Zhihua Jin
Sabri Eyuboglu
Christopher Ré
Matei A. Zaharia
James Y. Zou
37
9
0
18 Sep 2022
Bugs in the Data: How ImageNet Misrepresents Biodiversity
A. Luccioni
David Rolnick
19
43
0
24 Aug 2022
Detecting Environmental Violations with Satellite Imagery in Near Real Time: Land Application under the Clean Water Act
Ben Chugg
Nicolas Rothbacher
A. Feng
Xiaoqi Long
Daniel E. Ho
11
2
0
18 Aug 2022
On the role of benchmarking data sets and simulations in method comparison studies
Sarah Friedrich
T. Friede
25
24
0
02 Aug 2022
A Case for Dataset Specific Profiling
Seth Ockerman
John Wu
Christopher Stewart
14
0
0
01 Aug 2022
DataPerf: Benchmarks for Data-Centric AI Development
Mark Mazumder
Colby R. Banbury
Xiaozhe Yao
Bojan Karlaš
W. G. Rojas
...
Carole-Jean Wu
Cody Coleman
Andrew Y. Ng
Peter Mattson
Vijay Janapa Reddi
VLM
33
101
0
20 Jul 2022
The 1st Data Science for Pavements Challenge
Ashkan Behzadian
Tanner Muturi
Tianjie Zhang
Hongseok Kim
A. Mullins
...
D. Mensching
Robert Spragg
M. Corrigan
Jack Youtchef
Eshan Dave
14
7
0
10 Jun 2022
The Algorithmic Imprint
Upol Ehsan
Ranjit Singh
Jacob Metcalf
Mark O. Riedl
FaML
26
31
0
03 Jun 2022
Near-Negative Distinction: Giving a Second Life to Human Evaluation Datasets
Philippe Laban
Chien-Sheng Wu
Wenhao Liu
Caiming Xiong
33
5
0
13 May 2022
Evaluation Gaps in Machine Learning Practice
Ben Hutchinson
Negar Rostamzadeh
Christina Greer
Katherine A. Heller
Vinodkumar Prabhakaran
ELM
20
56
0
11 May 2022
AdaCap: Adaptive Capacity control for Feed-Forward Neural Networks
Katia Méziani
Karim Lounici
Benjamin Riu
6
0
0
09 May 2022
Handling and Presenting Harmful Text in NLP Research
Hannah Rose Kirk
Abeba Birhane
Bertie Vidgen
Leon Derczynski
13
47
0
29 Apr 2022
Metaethical Perspectives on 'Benchmarking' AI Ethics
Travis LaCroix
A. Luccioni
25
7
0
11 Apr 2022
Mapping global dynamics of benchmark creation and saturation in artificial intelligence
Simon Ott
A. Barbosa-Silva
Kathrin Blagec
J. Brauner
Matthias Samwald
24
36
0
09 Mar 2022
Language technology practitioners as language managers: arbitrating data bias and predictive bias in ASR
Nina Markl
S. McNulty
17
9
0
25 Feb 2022
Increasing Depth of Neural Networks for Life-long Learning
Jędrzej Kozal
Michał Woźniak
CLL
15
8
0
22 Feb 2022
Visual Ground Truth Construction as Faceted Classification
Fausto Giunchiglia
Mayukh Bagchi
Xiaolei Diao
13
5
0
17 Feb 2022
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text
Sebastian Gehrmann
Elizabeth Clark
Thibault Sellam
ELM
AI4CE
58
181
0
14 Feb 2022
Fair ranking: a critical review, challenges, and future directions
Gourab K. Patro
Lorenzo Porcaro
Laura Mitchell
Qiuyue Zhang
Meike Zehlike
Nikhil Garg
13
51
0
29 Jan 2022
A Non-Expert's Introduction to Data Ethics for Mathematicians
M. A. Porter
FaML
14
3
0
18 Jan 2022
Benchmark datasets driving artificial intelligence development fail to capture the needs of medical professionals
Kathrin Blagec
J. Kraiger
Wolfgang Frühwirt
Matthias Samwald
AI4MH
22
26
0
18 Jan 2022
A Framework for Deprecating Datasets: Standardizing Documentation, Identification, and Communication
A. Luccioni
Frances Corry
H. Sridharan
Mike Ananny
J. Schultz
Kate Crawford
38
29
0
18 Oct 2021
Multi-Task Attentive Residual Networks for Argument Mining
Andrea Galassi
Marco Lippi
Paolo Torroni
HAI
9
23
0
24 Feb 2021
Do Question Answering Modeling Improvements Hold Across Benchmarks?
Nelson F. Liu
Tony Lee
Robin Jia
Percy Liang
12
13
0
01 Feb 2021
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
262
10,344
0
12 Dec 2018