1803.09010
Datasheets for Datasets
23 March 2018
Timnit Gebru
Jamie Morgenstern
Briana Vecchione
Jennifer Wortman Vaughan
Hanna M. Wallach
Hal Daumé
Kate Crawford
Papers citing
"Datasheets for Datasets"
50 / 966 papers shown
Title
Human-Centered Explainable AI (XAI): From Algorithms to User Experiences
Q. V. Liao
R. Varshney
15
221
0
20 Oct 2021
Does Data Repair Lead to Fair Models? Curating Contextually Fair Data To Reduce Model Bias
Sharat Agarwal
Sumanyu Muku
Saket Anand
Chetan Arora
14
12
0
20 Oct 2021
A Framework for Deprecating Datasets: Standardizing Documentation, Identification, and Communication
A. Luccioni
Frances Corry
H. Sridharan
Mike Ananny
J. Schultz
Kate Crawford
46
29
0
18 Oct 2021
Small Data and Process in Data Visualization: The Radical Translations Case Study
Arianna Ciula
Miguel Vieira
Ginestra Ferraro
Tiffany Ong
S. Perovic
Rosa Mucignat
Niccolò Valmori
Brecht Deseure
E. Mannucci
6
1
0
18 Oct 2021
RL4RS: A Real-World Dataset for Reinforcement Learning based Recommender System
Kai Wang
Zhene Zou
Minghao Zhao
Qilin Deng
Yue Shang
Yile Liang
Runze Wu
Xudong Shen
Tangjie Lyu
Changjie Fan
OffRL
23
9
0
18 Oct 2021
BEAMetrics: A Benchmark for Language Generation Evaluation Evaluation
Thomas Scialom
Felix Hill
20
7
0
18 Oct 2021
HumBugDB: A Large-scale Acoustic Mosquito Dataset
Ivan Kiskin
Marianne E. Sinka
Adam D. Cobb
Waqas Rafique
Lawrence Wang
...
E. Kaindoa
G. Killeen
Eva Herreros-Moya
Katherine J. Willis
Stephen J. Roberts
41
28
0
14 Oct 2021
Masader: Metadata Sourcing for Arabic Text and Speech Data Resources
Zaid Alyafeai
Maraim Masoud
Mustafa Ghaleb
Maged S. Al-Shaibani
36
25
0
13 Oct 2021
On Releasing Annotator-Level Labels and Information in Datasets
Vinodkumar Prabhakaran
Aida Mostafazadeh Davani
Mark Díaz
17
144
0
12 Oct 2021
We Need to Talk About Data: The Importance of Data Readiness in Natural Language Processing
Fredrik Olsson
Magnus Sahlgren
14
1
0
11 Oct 2021
Chaos as an interpretable benchmark for forecasting and data-driven modelling
W. Gilpin
AI4TS
19
73
0
11 Oct 2021
Exploring constraints on CycleGAN-based CBCT enhancement for adaptive radiotherapy
Suraj Pai
MedIm
12
0
0
09 Oct 2021
Inferring Offensiveness In Images From Natural Language Supervision
P. Schramowski
Kristian Kersting
19
2
0
08 Oct 2021
CLEVA-Compass: A Continual Learning EValuation Assessment Compass to Promote Research Transparency and Comparability
Martin Mundt
Steven Braun
Quentin Delfosse
Kristian Kersting
19
35
0
07 Oct 2021
Trustworthy AI: From Principles to Practices
Bo-wen Li
Peng Qi
Bo Liu
Shuai Di
Jingen Liu
Jiquan Pei
Jinfeng Yi
Bowen Zhou
117
355
0
04 Oct 2021
The VVAD-LRS3 Dataset for Visual Voice Activity Detection
Adrian Lubitz
Matias Valdenegro-Toro
Frank Kirchner
13
3
0
28 Sep 2021
Auditing AI models for Verified Deployment under Semantic Specifications
Homanga Bharadhwaj
De-An Huang
Chaowei Xiao
Anima Anandkumar
Animesh Garg
MLAU
25
6
0
25 Sep 2021
SoK: Machine Learning Governance
Varun Chandrasekaran
Hengrui Jia
Anvith Thudi
Adelin Travers
Mohammad Yaghini
Nicolas Papernot
30
16
0
20 Sep 2021
FUTURE-AI: Guiding Principles and Consensus Recommendations for Trustworthy Artificial Intelligence in Medical Imaging
Karim Lekadir
Richard Osuala
C. Gallin
Noussair Lazrak
Kaisar Kushibar
...
Nickolas Papanikolaou
Zohaib Salahuddin
Henry C. Woodruff
Philippe Lambin
L. Martí-Bonmatí
AI4TS
63
56
0
20 Sep 2021
Studying Up Machine Learning Data: Why Talk About Bias When We Mean Power?
Milagros Miceli
Julian Posada
Tianling Yang
14
60
0
16 Sep 2021
Data Hunches: Incorporating Personal Knowledge into Visualizations
Haihan Lin
Derya Akbaba
Miriah D. Meyer
A. Lex
30
35
0
15 Sep 2021
HPOBench: A Collection of Reproducible Multi-Fidelity Benchmark Problems for HPO
Katharina Eggensperger
Philip Muller
Neeratyoy Mallik
Matthias Feurer
René Sass
Aaron Klein
Noor H. Awad
Marius Lindauer
Frank Hutter
30
100
0
14 Sep 2021
Generating Datasets of 3D Garments with Sewing Patterns
Maria Korosteleva
Sung-Hee Lee
6
36
0
12 Sep 2021
Making Online Communities 'Better': A Taxonomy of Community Values on Reddit
Galen Cassebeer Weld
Amy X. Zhang
Tim Althoff
53
31
0
11 Sep 2021
Toward a Perspectivist Turn in Ground Truthing for Predictive Computing
Valerio Basile
F. Cabitza
Andrea Campagner
Michael Fell
23
151
0
09 Sep 2021
Datasets: A Community Library for Natural Language Processing
Quentin Lhoest
Albert Villanova del Moral
Yacine Jernite
A. Thakur
Patrick von Platen
...
Thibault Goehringer
Victor Mustar
François Lagunas
Alexander M. Rush
Thomas Wolf
24
579
0
07 Sep 2021
MultiEURLEX -- A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer
Ilias Chalkidis
Manos Fergadiotis
Ion Androutsopoulos
AILaw
11
106
0
02 Sep 2021
Making the Invisible Visible: Risks and Benefits of Disclosing Metadata in Visualization
Alyxander Burns
Thai On
C. Lee
R. Shapiro
Cindy Xiong
Narges Mahyar
6
9
0
30 Aug 2021
SHIFT15M: Fashion-specific dataset for set-to-set matching with several distribution shifts
Masanari Kimura
Takuma Nakamura
Yuki Saito
OOD
31
3
0
30 Aug 2021
A comparison of approaches to improve worst-case predictive model performance over patient subpopulations
Stephen R. Pfohl
Haoran Zhang
Yizhe Xu
Agata Foryciarz
Marzyeh Ghassemi
N. Shah
OOD
21
22
0
27 Aug 2021
Sharing Practices for Datasets Related to Accessibility and Aging
Rie Kamikubo
Utkarsh Dwivedi
Hernisa Kacorri
11
12
0
24 Aug 2021
A Framework for Understanding AI-Induced Field Change: How AI Technologies are Legitimized and Institutionalized
B. Larsen
11
4
0
18 Aug 2021
Reusable Templates and Guides For Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards
Angelina McMillan-Major
Salomey Osei
Juan Diego Rodriguez
Pawan Sasanka Ammanamanchi
Sebastian Gehrmann
Yacine Jernite
34
47
0
16 Aug 2021
Presenting an extensive lab- and field-image dataset of crops and weeds for computer vision tasks in agriculture
Michael A. Beck
Chen-Yi Liu
C. Bidinosti
C. Henry
Cara M. Godee
Manisha Ajmani
3DV
VLM
20
5
0
12 Aug 2021
Retiring Adult: New Datasets for Fair Machine Learning
Frances Ding
Moritz Hardt
John Miller
Ludwig Schmidt
40
427
0
10 Aug 2021
Do Datasets Have Politics? Disciplinary Values in Computer Vision Dataset Development
M. Scheuerman
Emily L. Denton
A. Hanna
14
203
0
09 Aug 2021
On Measures of Biases and Harms in NLP
Sunipa Dev
Emily Sheng
Jieyu Zhao
Aubrie Amstutz
Jiao Sun
...
M. Sanseverino
Jiin Kim
Akihiro Nishi
Nanyun Peng
Kai-Wei Chang
22
80
0
07 Aug 2021
Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers
Kenny Peng
Arunesh Mathur
Arvind Narayanan
97
93
0
06 Aug 2021
An Ethical Framework for Guiding the Development of Affectively-Aware Artificial Intelligence
Desmond C. Ong
11
28
0
29 Jul 2021
On the state of reporting in crowdsourcing experiments and a checklist to aid current practices
Jorge M. Ramírez
Burcu Sayin
Marcos Báez
Fabio Casati
L. Cernuzzi
B. Benatallah
Gianluca Demartini
9
24
0
28 Jul 2021
QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension
Anna Rogers
Matt Gardner
Isabelle Augenstein
27
163
0
27 Jul 2021
Responsible and Regulatory Conform Machine Learning for Medicine: A Survey of Challenges and Solutions
Eike Petersen
Yannik Potdevin
Esfandiar Mohammadi
Stephan Zidowitz
Sabrina Breyer
...
Sandra Henn
Ludwig Pechmann
M. Leucker
P. Rostalski
Christian Herzog
FaML
AILaw
OOD
19
21
0
20 Jul 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Zetian Wu
Yun Cheng
...
Peter Wu
Michelle A. Lee
Yuke Zhu
Ruslan Salakhutdinov
Louis-Philippe Morency
VLM
21
158
0
15 Jul 2021
Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks
A. Malinin
Neil Band
Alexander Ganshin
German Chesnokov
...
Denis Roginskiy
Mariya Shmatova
Panos Tigas
Boris Yangel
UQCV
OOD
17
126
0
15 Jul 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
240
590
0
14 Jul 2021
"Garbage In, Garbage Out" Revisited: What Do Machine Learning Application Papers Report About Human-Labeled Training Data?
R. Geiger
Dominique Cope
Jamie Ip
Marsha Lotosh
Aayush Shah
Jenny Weng
Rebekah Tang
20
59
0
05 Jul 2021
Exploring Data Pipelines through the Process Lens: a Reference Model for Computer Vision
Agathe Balayn
B. Kulynych
S. Guerses
14
4
0
05 Jul 2021
Ethics Sheets for AI Tasks
Saif M. Mohammad
9
31
0
02 Jul 2021
An Information Retrieval Approach to Building Datasets for Hate Speech Detection
Md. Mustafizur Rahman
Dinesh Balakrishnan
Dhiraj Murthy
Mucahid Kutlu
Matthew Lease
10
24
0
17 Jun 2021
Modeling Worlds in Text
Prithviraj Ammanabrolu
Mark O. Riedl
VGen
LM&Ro
11
14
0
17 Jun 2021