Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research

3 December 2021

Papers citing "Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research"

28 / 78 papers shown

Turning the Tables: Biased, Imbalanced, Dynamic Tabular Datasets for ML Evaluation Sérgio Jesus José P. Pombal Duarte M. Alves André F. Cruz Pedro Saleiro Rita P. Ribeiro João Gama P. Bizarro 33 32 0 24 Nov 2022
Fighting FIRe with FIRE: Assessing the Validity of Text-to-Video Retrieval Benchmarks Pedro Rodriguez Mahmoud Azab Becka Silvert Renato Sanchez Linzy Labson Hardik Shah Seungwhan Moon 33 1 0 10 Oct 2022
The Lifecycle of "Facts": A Survey of Social Bias in Knowledge Graphs Angelie Kraft Ricardo Usbeck KELM 18 9 0 07 Oct 2022
HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions Lingjiao Chen Zhihua Jin Sabri Eyuboglu Christopher Ré Matei A. Zaharia James Y. Zou 37 9 0 18 Sep 2022
Bugs in the Data: How ImageNet Misrepresents Biodiversity A. Luccioni David Rolnick 19 43 0 24 Aug 2022
Detecting Environmental Violations with Satellite Imagery in Near Real Time: Land Application under the Clean Water Act Ben Chugg Nicolas Rothbacher A. Feng Xiaoqi Long Daniel E. Ho 11 2 0 18 Aug 2022
On the role of benchmarking data sets and simulations in method comparison studies Sarah Friedrich T. Friede 25 24 0 02 Aug 2022
A Case for Dataset Specific Profiling Seth Ockerman John Wu Christopher Stewart 14 0 0 01 Aug 2022
DataPerf: Benchmarks for Data-Centric AI Development Mark Mazumder Colby R. Banbury Xiaozhe Yao Bojan Karlaš W. G. Rojas ... Carole-Jean Wu Cody Coleman Andrew Y. Ng Peter Mattson Vijay Janapa Reddi VLM 33 101 0 20 Jul 2022
The 1st Data Science for Pavements Challenge Ashkan Behzadian Tanner Muturi Tianjie Zhang Hongseok Kim A. Mullins ... D. Mensching Spragg Robert M. Corrigan Jack Youtchef Dave Eshan 14 7 0 10 Jun 2022
The Algorithmic Imprint Upol Ehsan Ranjit Singh Jacob Metcalf Mark O. Riedl FaML 26 31 0 03 Jun 2022
Near-Negative Distinction: Giving a Second Life to Human Evaluation Datasets Philippe Laban Chien-Sheng Wu Wenhao Liu Caiming Xiong 33 5 0 13 May 2022
Evaluation Gaps in Machine Learning Practice Ben Hutchinson Negar Rostamzadeh Christina Greer Katherine A. Heller Vinodkumar Prabhakaran ELM 20 56 0 11 May 2022
AdaCap: Adaptive Capacity control for Feed-Forward Neural Networks Katia Méziani Karim Lounici Benjamin Riu 6 0 0 09 May 2022
Handling and Presenting Harmful Text in NLP Research Hannah Rose Kirk Abeba Birhane Bertie Vidgen Leon Derczynski 13 47 0 29 Apr 2022
Metaethical Perspectives on 'Benchmarking' AI Ethics Travis LaCroix A. Luccioni 25 7 0 11 Apr 2022
Mapping global dynamics of benchmark creation and saturation in artificial intelligence Simon Ott A. Barbosa-Silva Kathrin Blagec J. Brauner Matthias Samwald 24 36 0 09 Mar 2022
Language technology practitioners as language managers: arbitrating data bias and predictive bias in ASR Nina Markl S. McNulty 17 9 0 25 Feb 2022
Increasing Depth of Neural Networks for Life-long Learning Jędrzej Kozal Michał Woźniak CLL 15 8 0 22 Feb 2022
Visual Ground Truth Construction as Faceted Classification Fausto Giunchiglia Mayukh Bagchi Xiaolei Diao 13 5 0 17 Feb 2022
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text Sebastian Gehrmann Elizabeth Clark Thibault Sellam ELM AI4CE 58 181 0 14 Feb 2022
Fair ranking: a critical review, challenges, and future directions Gourab K. Patro Lorenzo Porcaro Laura Mitchell Qiuyue Zhang Meike Zehlike Nikhil Garg 13 51 0 29 Jan 2022
A Non-Expert's Introduction to Data Ethics for Mathematicians M. A. Porter FaML 14 3 0 18 Jan 2022
Benchmark datasets driving artificial intelligence development fail to capture the needs of medical professionals Kathrin Blagec J. Kraiger Wolfgang Frühwirt Matthias Samwald AI4MH 22 26 0 18 Jan 2022
A Framework for Deprecating Datasets: Standardizing Documentation, Identification, and Communication A. Luccioni Frances Corry H. Sridharan Mike Ananny J. Schultz Kate Crawford 38 29 0 18 Oct 2021
Multi-Task Attentive Residual Networks for Argument Mining Andrea Galassi Marco Lippi Paolo Torroni HAI 9 23 0 24 Feb 2021
Do Question Answering Modeling Improvements Hold Across Benchmarks? Nelson F. Liu Tony Lee Robin Jia Percy Liang 12 13 0 01 Feb 2021
A Style-Based Generator Architecture for Generative Adversarial Networks Tero Karras S. Laine Timo Aila 262 10,344 0 12 Dec 2018