Datasheets for Datasets

23 March 2018

Timnit Gebru

Jamie Morgenstern

Briana Vecchione

Jennifer Wortman Vaughan

Papers citing "Datasheets for Datasets"

50 / 1,069 papers shown

Benchmarking Multimodal AutoML for Tabular Data with Text Fields

152

04 Nov 2021

Feature and Label Embedding Spaces Matter in Addressing Image Classifier Bias

William Thong

Cees G. M. Snoek

148

27 Oct 2021

IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning

Xiaodan Liang

388

262

25 Oct 2021

What Would Jiminy Cricket Do? Towards Agents That Behave Morally

242

25 Oct 2021

Human-Centered Explainable AI (XAI): From Algorithms to User Experiences

Q. V. Liao

R. Varshney

549

283

20 Oct 2021

Does Data Repair Lead to Fair Models? Curating Contextually Fair Data To Reduce Model Bias

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021

161

20 Oct 2021

A Framework for Deprecating Datasets: Standardizing Documentation, Identification, and Communication

348

18 Oct 2021

Small Data and Process in Data Visualization: The Radical Translations Case Study

18 Oct 2021

RL4RS: A Real-World Dataset for Reinforcement Learning based Recommender System

Changjie Fan

317

18 Oct 2021

BEAMetrics: A Benchmark for Language Generation Evaluation Evaluation

Thomas Scialom

Felix Hill

157

18 Oct 2021

HumBugDB: A Large-scale Acoustic Mosquito Dataset

Ivan Kiskin

...

173

14 Oct 2021

Masader: Metadata Sourcing for Arabic Text and Speech Data Resources

334

13 Oct 2021

On Releasing Annotator-Level Labels and Information in Datasets

Law (LAW), 2021

Vinodkumar Prabhakaran

Aida Mostafazadeh Davani

Mark Díaz

252

170

12 Oct 2021

We Need to Talk About Data: The Importance of Data Readiness in Natural Language Processing

Fredrik Olsson

Magnus Sahlgren

125

11 Oct 2021

Chaos as an interpretable benchmark for forecasting and data-driven modelling

W. Gilpin

AI4TS

293

106

11 Oct 2021

Exploring constraints on CycleGAN-based CBCT enhancement for adaptive radiotherapy

Suraj Pai

MedIm

09 Oct 2021

Inferring Offensiveness In Images From Natural Language Supervision

P. Schramowski

Kristian Kersting

08 Oct 2021

CLEVA-Compass: A Continual Learning EValuation Assessment Compass to Promote Research Transparency and Comparability

243

07 Oct 2021

Trustworthy AI: From Principles to Practices

473

520

04 Oct 2021

The VVAD-LRS3 Dataset for Visual Voice Activity Detection

Adrian Lubitz

Matias Valdenegro-Toro

Frank Kirchner

161

28 Sep 2021

Auditing AI models for Verified Deployment under Semantic Specifications

Homanga Bharadhwaj

De-An Huang

188

25 Sep 2021

SoK: Machine Learning Governance

271

20 Sep 2021

FUTURE-AI: Guiding Principles and Consensus Recommendations for Trustworthy Artificial Intelligence in Medical Imaging

...

Nickolas Papanikolaou

359

20 Sep 2021

Studying Up Machine Learning Data: Why Talk About Bias When We Mean Power?

Milagros Miceli

Julian Posada

Tianling Yang

116

16 Sep 2021

Data Hunches: Incorporating Personal Knowledge into Visualizations

173

15 Sep 2021

HPOBench: A Collection of Reproducible Multi-Fidelity Benchmark Problems for HPO

Katharina Eggensperger

437

122

14 Sep 2021

Generating Datasets of 3D Garments with Sewing Patterns

Maria Korosteleva

Sung-Hee Lee

185

12 Sep 2021

Making Online Communities 'Better': A Taxonomy of Community Values on Reddit

Galen Cassebeer Weld

Amy X. Zhang

Tim Althoff

299

11 Sep 2021

Toward a Perspectivist Turn in Ground Truthing for Predictive Computing

AAAI Conference on Artificial Intelligence (AAAI), 2021

264

207

09 Sep 2021

Datasets: A Community Library for Natural Language Processing

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

Quentin Lhoest

Albert Villanova del Moral

...

584

705

07 Sep 2021

MultiEURLEX -- A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer

392

133

02 Sep 2021

Making the Invisible Visible: Risks and Benefits of Disclosing Metadata in Visualization

144

30 Aug 2021

SHIFT15M: Fashion-specific dataset for set-to-set matching with several distribution shifts

Masanari Kimura

Takuma Nakamura

Yuki Saito

OOD

217

30 Aug 2021

A comparison of approaches to improve worst-case predictive model performance over patient subpopulations

Scientific Reports (Sci Rep), 2021

289

27 Aug 2021

Sharing Practices for Datasets Related to Accessibility and Aging

International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS), 2021

Rie Kamikubo

Utkarsh Dwivedi

Hernisa Kacorri

150

24 Aug 2021

A Framework for Understanding AI-Induced Field Change: How AI Technologies are Legitimized and Institutionalized

B. Larsen

18 Aug 2021

Reusable Templates and Guides For Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards

Angelina McMillan-Major

Salomey Osei

Juan Diego Rodriguez

Pawan Sasanka Ammanamanchi

Sebastian Gehrmann

Yacine Jernite

164

16 Aug 2021

Presenting an extensive lab- and field-image dataset of crops and weeds for computer vision tasks in agriculture

12 Aug 2021

Retiring Adult: New Datasets for Fair Machine Learning

Neural Information Processing Systems (NeurIPS), 2021

451

544

10 Aug 2021

Do Datasets Have Politics? Disciplinary Values in Computer Vision Dataset Development

M. Scheuerman

Emily L. Denton

A. Hanna

228

242

09 Aug 2021

On Measures of Biases and Harms in NLP

...

244

108

07 Aug 2021

Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers

Kenny Peng

Arunesh Mathur

Arvind Narayanan

349

106

06 Aug 2021

An Ethical Framework for Guiding the Development of Affectively-Aware Artificial Intelligence

Affective Computing and Intelligent Interaction (ACII), 2021

Desmond C. Ong

29 Jul 2021

On the state of reporting in crowdsourcing experiments and a checklist to aid current practices

206

28 Jul 2021

QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension

ACM Computing Surveys (CSUR), 2021

Anna Rogers

Matt Gardner

Isabelle Augenstein

375

191

27 Jul 2021

Responsible and Regulatory Conform Machine Learning for Medicine: A Survey of Challenges and Solutions

IEEE Access (IEEE Access), 2021

...

275

20 Jul 2021

MultiBench: Multiscale Benchmarks for Multimodal Representation Learning

...

Peter Wu

Michelle A. Lee

Yuke Zhu

Ruslan Salakhutdinov

Louis-Philippe Morency

VLM

276

223

15 Jul 2021

Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks

...

483

147

15 Jul 2021

Deduplicating Training Data Makes Language Models Better

717

770

14 Jul 2021

"Garbage In, Garbage Out" Revisited: What Do Machine Learning Application Papers Report About Human-Labeled Training Data?

149

05 Jul 2021