Datasheets for Datasets

23 March 2018
Timnit Gebru
Jamie Morgenstern
Briana Vecchione
Jennifer Wortman Vaughan
Hanna M. Wallach
Hal Daumé
Kate Crawford

Papers citing "Datasheets for Datasets"

50 / 966 papers shown
Human-Centered Explainable AI (XAI): From Algorithms to User Experiences
Q. V. Liao
R. Varshney
15
221
0
20 Oct 2021
Does Data Repair Lead to Fair Models? Curating Contextually Fair Data To Reduce Model Bias
Sharat Agarwal
Sumanyu Muku
Saket Anand
Chetan Arora
14
12
0
20 Oct 2021
A Framework for Deprecating Datasets: Standardizing Documentation, Identification, and Communication
A. Luccioni
Frances Corry
H. Sridharan
Mike Ananny
J. Schultz
Kate Crawford
46
29
0
18 Oct 2021
Small Data and Process in Data Visualization: The Radical Translations Case Study
Arianna Ciula
Miguel Vieira
Ginestra Ferraro
Tiffany Ong
S. Perovic
Rosa Mucignat
Niccolò Valmori
Brecht Deseure
E. Mannucci
6
1
0
18 Oct 2021
RL4RS: A Real-World Dataset for Reinforcement Learning based Recommender System
Kai Wang
Zhene Zou
Minghao Zhao
Qilin Deng
Yue Shang
Yile Liang
Runze Wu
Xudong Shen
Tangjie Lyu
Changjie Fan
OffRL
23
9
0
18 Oct 2021
BEAMetrics: A Benchmark for Language Generation Evaluation Evaluation
Thomas Scialom
Felix Hill
20
7
0
18 Oct 2021
HumBugDB: A Large-scale Acoustic Mosquito Dataset
Ivan Kiskin
Marianne E. Sinka
Adam D. Cobb
Waqas Rafique
Lawrence Wang
...
E. Kaindoa
G. Killeen
Eva Herreros-Moya
Katherine J. Willis
Stephen J. Roberts
41
28
0
14 Oct 2021
Masader: Metadata Sourcing for Arabic Text and Speech Data Resources
Zaid Alyafeai
Maraim Masoud
Mustafa Ghaleb
Maged S. Al-Shaibani
36
25
0
13 Oct 2021
On Releasing Annotator-Level Labels and Information in Datasets
Vinodkumar Prabhakaran
Aida Mostafazadeh Davani
Mark Díaz
17
144
0
12 Oct 2021
We Need to Talk About Data: The Importance of Data Readiness in Natural Language Processing
Fredrik Olsson
Magnus Sahlgren
14
1
0
11 Oct 2021
Chaos as an interpretable benchmark for forecasting and data-driven modelling
W. Gilpin
AI4TS
19
73
0
11 Oct 2021
Exploring constraints on CycleGAN-based CBCT enhancement for adaptive radiotherapy
Suraj Pai
MedIm
12
0
0
09 Oct 2021
Inferring Offensiveness In Images From Natural Language Supervision
P. Schramowski
Kristian Kersting
19
2
0
08 Oct 2021
CLEVA-Compass: A Continual Learning EValuation Assessment Compass to Promote Research Transparency and Comparability
Martin Mundt
Steven Braun
Quentin Delfosse
Kristian Kersting
19
35
0
07 Oct 2021
Trustworthy AI: From Principles to Practices
Bo-wen Li
Peng Qi
Bo Liu
Shuai Di
Jingen Liu
Jiquan Pei
Jinfeng Yi
Bowen Zhou
117
355
0
04 Oct 2021
The VVAD-LRS3 Dataset for Visual Voice Activity Detection
Adrian Lubitz
Matias Valdenegro-Toro
Frank Kirchner
13
3
0
28 Sep 2021
Auditing AI models for Verified Deployment under Semantic Specifications
Homanga Bharadhwaj
De-An Huang
Chaowei Xiao
Anima Anandkumar
Animesh Garg
MLAU
25
6
0
25 Sep 2021
SoK: Machine Learning Governance
Varun Chandrasekaran
Hengrui Jia
Anvith Thudi
Adelin Travers
Mohammad Yaghini
Nicolas Papernot
30
16
0
20 Sep 2021
FUTURE-AI: Guiding Principles and Consensus Recommendations for Trustworthy Artificial Intelligence in Medical Imaging
Karim Lekadir
Richard Osuala
C. Gallin
Noussair Lazrak
Kaisar Kushibar
...
Nickolas Papanikolaou
Zohaib Salahuddin
Henry C. Woodruff
Philippe Lambin
L. Martí-Bonmatí
AI4TS
63
56
0
20 Sep 2021
Studying Up Machine Learning Data: Why Talk About Bias When We Mean Power?
Milagros Miceli
Julian Posada
Tianling Yang
14
60
0
16 Sep 2021
Data Hunches: Incorporating Personal Knowledge into Visualizations
Haihan Lin
Derya Akbaba
Miriah D. Meyer
A. Lex
30
35
0
15 Sep 2021
HPOBench: A Collection of Reproducible Multi-Fidelity Benchmark Problems for HPO
Katharina Eggensperger
Philip Muller
Neeratyoy Mallik
Matthias Feurer
René Sass
Aaron Klein
Noor H. Awad
Marius Lindauer
Frank Hutter
30
100
0
14 Sep 2021
Generating Datasets of 3D Garments with Sewing Patterns
Maria Korosteleva
Sung-Hee Lee
6
36
0
12 Sep 2021
Making Online Communities 'Better': A Taxonomy of Community Values on Reddit
Galen Cassebeer Weld
Amy X. Zhang
Tim Althoff
53
31
0
11 Sep 2021
Toward a Perspectivist Turn in Ground Truthing for Predictive Computing
Valerio Basile
F. Cabitza
Andrea Campagner
Michael Fell
23
151
0
09 Sep 2021
Datasets: A Community Library for Natural Language Processing
Quentin Lhoest
Albert Villanova del Moral
Yacine Jernite
A. Thakur
Patrick von Platen
...
Thibault Goehringer
Victor Mustar
François Lagunas
Alexander M. Rush
Thomas Wolf
24
579
0
07 Sep 2021
MultiEURLEX -- A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer
Ilias Chalkidis
Manos Fergadiotis
Ion Androutsopoulos
AILaw
11
106
0
02 Sep 2021
Making the Invisible Visible: Risks and Benefits of Disclosing Metadata in Visualization
Alyxander Burns
Thai On
C. Lee
R. Shapiro
Cindy Xiong
Narges Mahyar
6
9
0
30 Aug 2021
SHIFT15M: Fashion-specific dataset for set-to-set matching with several distribution shifts
Masanari Kimura
Takuma Nakamura
Yuki Saito
OOD
31
3
0
30 Aug 2021
A comparison of approaches to improve worst-case predictive model performance over patient subpopulations
Stephen R. Pfohl
Haoran Zhang
Yizhe Xu
Agata Foryciarz
Marzyeh Ghassemi
N. Shah
OOD
21
22
0
27 Aug 2021
Sharing Practices for Datasets Related to Accessibility and Aging
Rie Kamikubo
Utkarsh Dwivedi
Hernisa Kacorri
11
12
0
24 Aug 2021
A Framework for Understanding AI-Induced Field Change: How AI Technologies are Legitimized and Institutionalized
B. Larsen
11
4
0
18 Aug 2021
Reusable Templates and Guides For Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards
Angelina McMillan-Major
Salomey Osei
Juan Diego Rodriguez
Pawan Sasanka Ammanamanchi
Sebastian Gehrmann
Yacine Jernite
34
47
0
16 Aug 2021
Presenting an extensive lab- and field-image dataset of crops and weeds for computer vision tasks in agriculture
Michael A. Beck
Chen-Yi Liu
C. Bidinosti
C. Henry
Cara M. Godee
Manisha Ajmani
3DV
VLM
20
5
0
12 Aug 2021
Retiring Adult: New Datasets for Fair Machine Learning
Frances Ding
Moritz Hardt
John Miller
Ludwig Schmidt
40
427
0
10 Aug 2021
Do Datasets Have Politics? Disciplinary Values in Computer Vision Dataset Development
M. Scheuerman
Emily L. Denton
A. Hanna
14
203
0
09 Aug 2021
On Measures of Biases and Harms in NLP
Sunipa Dev
Emily Sheng
Jieyu Zhao
Aubrie Amstutz
Jiao Sun
...
M. Sanseverino
Jiin Kim
Akihiro Nishi
Nanyun Peng
Kai-Wei Chang
22
80
0
07 Aug 2021
Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers
Kenny Peng
Arunesh Mathur
Arvind Narayanan
97
93
0
06 Aug 2021
An Ethical Framework for Guiding the Development of Affectively-Aware Artificial Intelligence
Desmond C. Ong
11
28
0
29 Jul 2021
On the state of reporting in crowdsourcing experiments and a checklist to aid current practices
Jorge M. Ramírez
Burcu Sayin
Marcos Báez
Fabio Casati
L. Cernuzzi
B. Benatallah
Gianluca Demartini
9
24
0
28 Jul 2021
QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension
Anna Rogers
Matt Gardner
Isabelle Augenstein
27
163
0
27 Jul 2021
Responsible and Regulatory Conform Machine Learning for Medicine: A Survey of Challenges and Solutions
Eike Petersen
Yannik Potdevin
Esfandiar Mohammadi
Stephan Zidowitz
Sabrina Breyer
...
Sandra Henn
Ludwig Pechmann
M. Leucker
P. Rostalski
Christian Herzog
FaML
AILaw
OOD
19
21
0
20 Jul 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Zetian Wu
Yun Cheng
...
Peter Wu
Michelle A. Lee
Yuke Zhu
Ruslan Salakhutdinov
Louis-Philippe Morency
VLM
21
158
0
15 Jul 2021
Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks
A. Malinin
Neil Band
Alexander Ganshin
German Chesnokov
...
Denis Roginskiy
Mariya Shmatova
Panos Tigas
Boris Yangel
UQCV
OOD
17
126
0
15 Jul 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
240
590
0
14 Jul 2021
"Garbage In, Garbage Out" Revisited: What Do Machine Learning Application Papers Report About Human-Labeled Training Data?
R. Geiger
Dominique Cope
Jamie Ip
Marsha Lotosh
Aayush Shah
Jenny Weng
Rebekah Tang
20
59
0
05 Jul 2021
Exploring Data Pipelines through the Process Lens: a Reference Model for Computer Vision
Agathe Balayn
B. Kulynych
S. Guerses
14
4
0
05 Jul 2021
Ethics Sheets for AI Tasks
Saif M. Mohammad
9
31
0
02 Jul 2021
An Information Retrieval Approach to Building Datasets for Hate Speech Detection
Md. Mustafizur Rahman
Dinesh Balakrishnan
Dhiraj Murthy
Mucahid Kutlu
Matthew Lease
10
24
0
17 Jun 2021
Modeling Worlds in Text
Prithviraj Ammanabrolu
Mark O. Riedl
VGen
LM&Ro
11
14
0
17 Jun 2021