1803.09010
Cited By
v8 (latest)
Datasheets for Datasets
23 March 2018
Timnit Gebru
Jamie Morgenstern
Briana Vecchione
Jennifer Wortman Vaughan
Hanna M. Wallach
Hal Daumé
Kate Crawford
Papers citing "Datasheets for Datasets"
50 / 1,069 papers shown
A Feminist Account of Intersectional Algorithmic Fairness
Marie Mirsch
Laila Wegner
Jonas Strube
Carmen Leicht-Scholten
FaML
180
0
0
25 Aug 2025
EmoTale: An Enacted Speech-emotion Dataset in Danish
Maja J. Hjuler
Harald V. Skat-Rørdam
Line H. Clemmensen
Sneha Das
76
1
0
20 Aug 2025
Assessing Trustworthiness of AI Training Dataset using Subjective Logic -- A Use Case on Bias
Koffi Ismael Ouattara
Ioannis Krontiris
Theo Dimitrakos
Frank Kargl
96
3
0
19 Aug 2025
OPTIC-ER: A Reinforcement Learning Framework for Real-Time Emergency Response and Equitable Resource Allocation in Underserved African Communities
Mary Tonwe
144
0
0
18 Aug 2025
Documenting Deployment with Fabric: A Repository of Real-World AI Governance
Mackenzie Jorgensen
Kendall Brogle
Katherine M. Collins
Lujain Ibrahim
Arina Shah
...
Paul Dongha
Hatim Abdulhussein
Adrian Weller
Jillian Powers
Umang Bhatt
182
0
0
18 Aug 2025
Beyond Internal Data: Bounding and Estimating Fairness from Incomplete Data
Varsha Ramineni
Hossein A. Rahmani
Emine Yilmaz
David Barber
112
0
0
18 Aug 2025
Street Review: A Participatory AI-Based Framework for Assessing Streetscape Inclusivity
Cities (Cities), 2025
Rashid Mushkani
Shin Koseki
189
7
0
14 Aug 2025
TechOps: Technical Documentation Templates for the AI Act
Laura Lucaj
Alex Loosley
Hakan Jonsson
Urs Gasser
Patrick van der Smagt
76
1
0
12 Aug 2025
Towards Experience-Centered AI: A Framework for Integrating Lived Experience in Design and Development
Sanjana Gautam
Mohit Chandra
Ankolika De
Tatiana Chakravorti
Girik Malik
M. D. Choudhury
74
0
0
09 Aug 2025
Dynaword: From One-shot to Continuously Developed Datasets
Kenneth Enevoldsen
Kristian Nørgaard Jensen
Jan Kostkan
Balázs Szabó
Márton Kardos
...
Per Møldrup Dalum
Desmond Elliott
Lukas Galke
Peter Schneider-Kamp
Kristoffer Nielbo
126
0
0
04 Aug 2025
OVFact: Measuring and Improving Open-Vocabulary Factuality for Long Caption Models
Monika Wysoczańska
Shyamal Buch
Anurag Arnab
Cordelia Schmid
HILM
168
0
0
25 Jul 2025
Beyond Internal Data: Constructing Complete Datasets for Fairness Testing
Varsha Ramineni
Hossein A. Rahmani
Emine Yilmaz
David Barber
126
0
0
24 Jul 2025
CNS-Bench: Benchmarking Image Classifier Robustness Under Continuous Nuisance Shifts
Olaf Dünkel
Artur Jesslen
Jiahao Xie
Christian Theobalt
Christian Rupprecht
Adam Kortylewski
DiffM
188
0
0
23 Jul 2025
Characterizing Online Activities Contributing to Suicide Mortality among Youth
Aparna Ananthasubramaniam
Elyse J. Thulin
Viktoryia Kalesnikava
Silas Falde
Jonathan Kertawidjaja
Lily Johns
Alejandro Rodríguez-Putnam
Emma Spring
Kara Zivin
Briana Mezuk
LRM
71
0
0
22 Jul 2025
Predictive Representativity: Uncovering Racial Bias in AI-based Skin Cancer Detection
Andrés Morales-Forero
Lili J. Rueda
Ronald Herrera
Samuel Bassetto
Eric Coatanea
62
0
0
10 Jul 2025
No Language Data Left Behind: A Comparative Study of CJK Language Datasets in the Hugging Face Ecosystem
Dasol Choi
Woomyoung Park
Youngsook Song
146
0
0
06 Jul 2025
Measurement as Bricolage: Examining How Data Scientists Construct Target Variables for Predictive Modeling Tasks
Luke M. Guerdan
Devansh Saxena
Stevie Chancellor
Zhiwei Steven Wu
Kenneth Holstein
187
1
0
03 Jul 2025
A case for data valuation transparency via DValCards
Keziah Naggita
Julienne LaChance
TDI
360
0
0
29 Jun 2025
LAION-C: An Out-of-Distribution Benchmark for Web-Scale Vision Models
Fanfei Li
Thomas Klein
Wieland Brendel
Robert Geirhos
Roland S. Zimmermann
OODD
192
3
0
20 Jun 2025
A Common Pool of Privacy Problems: Legal and Technical Lessons from a Large-Scale Web-Scraped Machine Learning Dataset
Rachel Hong
Jevan Hutson
William Agnew
Imaad Huda
Tadayoshi Kohno
Jamie Morgenstern
AILaw
314
3
0
20 Jun 2025
Identifying and Investigating Global News Coverage of Critical Events Such as Disasters and Terrorist Attacks
International Conference on Web and Social Media (ICWSM), 2025
Erica Cai
Xi Chen
Reagan Grey Keeney
Ethan Zuckerman
Brendan O'Connor
Przemyslaw A. Grabowicz
122
1
0
15 Jun 2025
IntPhys 2: Benchmarking Intuitive Physics Understanding In Complex Synthetic Environments
Florian Bordes
Q. Garrido
Justine T Kao
Adina Williams
Michael G. Rabbat
Emmanuel Dupoux
PINN
196
12
0
11 Jun 2025
Survey on the Evaluation of Generative Models in Music
ACM Computing Surveys (ACM Comput. Surv.), 2025
Alexander Lerch
Claire Arthur
Nick Bryan-Kinns
Corey Ford
Qianyi Sun
Ashvala Vinay
588
4
0
05 Jun 2025
Red Teaming AI Policy: A Taxonomy of Avoision and the EU AI Act
Conference on Fairness, Accountability and Transparency (FAccT), 2025
Rui-Jie Yew
Bill Marino
Suresh Venkatasubramanian
158
3
0
02 Jun 2025
Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability
Genta Indra Winata
David Anugraha
Emmy Liu
Alham Fikri Aji
Shou-Yi Hung
...
Muhammad Farid Adilazuarda
En-Shiun Annie Lee
Ayu Purwarianti
Derry Wijaya
Monojit Choudhury
318
2
0
02 Jun 2025
AI Data Development: A Scorecard for the System Card Framework
Tadesse K. Bahiru
Haileleol Tibebu
Ioannis A. Kakadiaris
161
2
0
02 Jun 2025
Developing a Risk Identification Framework for Foundation Model Uses
David Piorkowski
Michael Hind
John T. Richards
Jacquelyn Martino
111
1
0
01 Jun 2025
Risks of AI-driven product development and strategies for their mitigation
Jan Göpfert
J. Weinand
Patrick Kuckertz
Noah Pflugradt
Jochen Linßen
207
1
0
28 May 2025
Machine Learning Models Have a Supply Chain Problem
Sarah Meiklejohn
Hayden Blauzvern
Mihai Maruseac
Spencer Schrock
Laurent Simon
Ilia Shumailov
197
2
0
28 May 2025
Detecting Cultural Differences in News Video Thumbnails via Computational Aesthetics
Marvin Limpijankit
John Kender
208
0
0
28 May 2025
MObyGaze: a film dataset of multimodal objectification densely annotated by experts
Julie Tores
Elisa Ancarani
L. Sassatelli
Hui-Yin Wu
Clement Bergman
...
F. Precioso
Thierry Devars
Magali Guaresi
Virginie Julliard
Sarah Lecossais
DiffM
VGen
150
1
0
28 May 2025
Can we Debias Social Stereotypes in AI-Generated Images? Examining Text-to-Image Outputs and User Perceptions
Saharsh Barve
Andy Mao
Jiayue Melissa Shi
Prerna Juneja
Koustuv Saha
213
0
0
27 May 2025
We Need to Measure Data Diversity in NLP -- Better and Broader
Dong Nguyen
Esther Ploeger
235
1
0
26 May 2025
MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs
Zaid Alyafeai
Maged S. Al-Shaibani
Bernard Ghanem
281
4
0
26 May 2025
Fairness-in-the-Workflow: How Machine Learning Practitioners at Big Tech Companies Approach Fairness in Recommender Systems
Jing Nathan Yan
Emma Harvey
Junxiong Wang
Jeffrey M. Rzeszotarski
Allison Koenecke
FaML
290
0
0
26 May 2025
TEDI: Trustworthy and Ethical Dataset Indicators to Analyze and Compare Dataset Documentation
Wiebke Hutiri
Mircea Cimpoi
M. Scheuerman
Victoria Matthews
Alice Xiang
305
0
0
23 May 2025
Multi-agent Systems for Misinformation Lifecycle : Detection, Correction And Source Identification
Aditya Gautam
LLMAG
171
1
0
23 May 2025
Optimizing Image Capture for Computer Vision-Powered Taxonomic Identification and Trait Recognition of Biodiversity Specimens
Methods in Ecology and Evolution (MEE), 2025
Alyson East
Elizabeth G. Campolongo
Luke Meyers
S M Rayeed
Samuel Stevens
...
Hilmar Lapp
Paula M. Mabee
Graham W. Taylor
Sydne Record
160
4
0
22 May 2025
Social Bias in Popular Question-Answering Benchmarks
Angelie Kraft
Judith Simon
Sonja Schimmler
366
3
0
21 May 2025
Fast, Not Fancy: Rethinking G2P with Rich Data and Rule-Based Models
Mahta Fetrat Qharabagh
Zahra Dehghanian
Hamid R. Rabiee
135
1
0
19 May 2025
Towards SFW sampling for diffusion models via external conditioning
Camilo Carvajal Reyes
J. Fontbona
Felipe A. Tobar
DiffM
257
1
0
12 May 2025
UKElectionNarratives: A Dataset of Misleading Narratives Surrounding Recent UK General Elections
International Conference on Web and Social Media (ICWSM), 2025
Fatima Haouari
Carolina Scarton
Nicolò Faggiani
Nikolaos Nikolaidis
Bonka Kotseva
Ibrahim Abu Farha
Jens Linge
Kalina Bontcheva
261
0
0
08 May 2025
Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models
Sungbok Shin
Hyeon Jeon
Sanghyun Hong
Niklas Elmqvist
1.2K
0
0
01 May 2025
Clustering Internet Memes Through Template Matching and Multi-Dimensional Similarity
International Conference on Web and Social Media (ICWSM), 2025
Tygo Bloem
Filip Ilievski
268
1
0
30 Apr 2025
TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models
Mihai Nadas
Laura Diosan
Andrei Piscoran
Andreea Tomescu
VGen
320
1
0
29 Apr 2025
Pneuma: Leveraging LLMs for Tabular Data Representation and Retrieval in an End-to-End System
Muhammad Imam Luthfi Balaka
David Alexander
Qian Wang
Yue Gong
Adila Krisnadhi
Raul Castro Fernandez
LMTD
RALM
188
10
0
12 Apr 2025
Perils of Label Indeterminacy: A Case Study on Prediction of Neurological Recovery After Cardiac Arrest
Conference on Fairness, Accountability and Transparency (FAccT), 2025
Jakob Schoeffer
Maria De-Arteaga
Jonathan Elmer
923
2
0
05 Apr 2025
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models
José P. Pombal
Nuno M. Guerreiro
Ricardo Rei
André F. T. Martins
ALM
544
7
0
01 Apr 2025
XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?
Computer Vision and Pattern Recognition (CVPR), 2025
Fengxiang Wang
Hongru Wang
Mingshuo Chen
Haiyan Zhao
Yulin Wang
...
L. Lan
Wenjing Yang
Jing Zhang
Zhiyuan Liu
Maosong Sun
296
23
0
31 Mar 2025
Are clinicians ethically obligated to disclose their use of medical machine learning systems to patients?
Journal of Medical Ethics (JME), 2024
Joshua Hatherley
254
3
0
31 Mar 2025