1803.09010
Cited By
v8 (latest)
Datasheets for Datasets
23 March 2018
Timnit Gebru
Jamie Morgenstern
Briana Vecchione
Jennifer Wortman Vaughan
Hanna M. Wallach
Hal Daumé
Kate Crawford
Papers citing "Datasheets for Datasets"
50 / 1,069 papers shown
Eval Factsheets: A Structured Framework for Documenting AI Evaluations
Florian Bordes
Candace Ross
Justine T Kao
Evangelia Spiliopoulou
Adina Williams
36
0
0
03 Dec 2025
Whose Personae? Synthetic Persona Experiments in LLM Research and Pathways to Transparency
Jan Batzner
Volker Stocker
Bingjun Tang
Anusha Natarajan
Qinhao Chen
Stefan Schmid
Gjergji Kasneci
71
1
0
29 Nov 2025
Defending Large Language Models Against Jailbreak Exploits with Responsible AI Considerations
Ryan Wong
Hosea David Yu Fei Ng
Dhananjai Sharma
Glenn Jun Jie Ng
Kavishvaran Srinivasan
AAML
273
0
0
24 Nov 2025
AI Bill of Materials and Beyond: Systematizing Security Assurance through the AI Risk Scanning (AIRS) Framework
Samuel Nathanson
Alexander Lee
Catherine Chen Kieffer
Jared Junkin
Jessica Ye
Amir Saeed
Melanie Lockhart
Russ Fink
Elisha Peterson
Lanier Watkins
56
0
0
16 Nov 2025
mmJEE-Eval: A Bilingual Multimodal Benchmark for Evaluating Scientific Reasoning in Vision-Language Models
Arka Mukherjee
Shreya Ghosh
LRM
164
0
0
12 Nov 2025
InfoAffect: A Dataset for Affective Analysis of Infographics
Zihang Fu
Yunchao Wang
Chenyu Huang
Guodao Sun
Ronghua Liang
108
0
0
09 Nov 2025
QuAnTS: Question Answering on Time Series
Felix Divo
Maurice Kraus
Anh Q. Nguyen
Hao Xue
Imran Razzak
Flora D. Salim
Kristian Kersting
Devendra Singh Dhami
88
0
0
07 Nov 2025
Who Evaluates AI's Social Impacts? Mapping Coverage and Gaps in First and Third Party Evaluations
Anka Reuel
Avijit Ghosh
Jenny Chim
Andrew Tran
Yanan Long
...
Zeerak Talat
Stella Biderman
Mykel J. Kochenderfer
Sanmi Koyejo
Irene Solaiman
ELM
226
0
0
06 Nov 2025
What's in Common? Multimodal Models Hallucinate When Reasoning Across Scenes
Candace Ross
Florian Bordes
Adina Williams
Polina Kirichenko
Mark Ibrahim
VLM
ReLM
LRM
208
1
0
05 Nov 2025
EvalCards: A Framework for Standardized Evaluation Reporting
Ruchira Dhar
Danae Sanchez Villegas
Antonia Karamolegkou
Alice Schiavone
Yifei Yuan
...
Monorama Swain
Stephanie Brandl
Daniel Hershcovich
Anders Søgaard
Desmond Elliott
56
0
0
05 Nov 2025
miniF2F-Lean Revisited: Reviewing Limitations and Charting a Path Forward
Azim Ospanov
Farzan Farnia
Roozbeh Yousefzadeh
105
1
0
05 Nov 2025
AyurParam: A State-of-the-Art Bilingual Language Model for Ayurveda
Mohd Nauman
Sravan Gvm
Vijay Devane
Shyam Pawar
Viraj Thakur
Kundeshwar Pundalik
Piyush Sawarkar
Rohit Saluja
Maunendra Sankar Desarkar
Ganesh Ramakrishnan
LM&MA
ELM
252
0
0
04 Nov 2025
Measuring what Matters: Construct Validity in Large Language Model Benchmarks
Andrew M. Bean
Ryan Kearns
Angelika Romanou
Franziska Sofia Hafner
Harry Mayne
...
Christopher Summerfield
Philip Torr
Cozmin Ududec
Luc Rocher
Adam Mahdi
ALM
466
5
0
03 Nov 2025
Before the Clinic: Transparent and Operable Design Principles for Healthcare AI
Alexander Bakumenko
Aaron J. Masino
Janine Hoelscher
92
1
0
31 Oct 2025
A Multimodal Benchmark for Framing of Oil & Gas Advertising and Potential Greenwashing Detection
Gaku Morio
Harri Rowlands
Dominik Stammbach
Christopher D. Manning
Peter Henderson
101
0
0
24 Oct 2025
HIKMA: Human-Inspired Knowledge by Machine Agents through a Multi-Agent Framework for Semi-Autonomous Scientific Conferences
Zain Ul Abideen Tariq
Mahmood Al-Zubaidi
Uzair Shah
Marco Agus
Mowafa J Househ
112
0
0
24 Oct 2025
Race and Gender in LLM-Generated Personas: A Large-Scale Audit of 41 Occupations
Ilona van der Linden
Sahana Kumar
Arnav Dixit
Aadi Sudan
Smruthi Danda
David C. Anastasiu
Kai Lukoff
104
0
0
23 Oct 2025
VLSU: Mapping the Limits of Joint Multimodal Understanding for AI Safety
Shruti Palaskar
Leon A Gatys
Mona Abdelrahman
Mar Jacobo
Larry Lindsey
...
Yang Xu
Navid Shiee
Jeffrey Bigham
Charles Maalouf
Joseph Y Cheng
122
0
0
21 Oct 2025
BO4Mob: Bayesian Optimization Benchmarks for High-Dimensional Urban Mobility Problem
Seunghee Ryu
Donghoon Kwon
Seongjin Choi
Aryan Deshwal
Seungmo Kang
Carolina Osorio
96
0
0
21 Oct 2025
Evaluating Medical LLMs by Levels of Autonomy: A Survey Moving from Benchmarks to Applications
Xiao Ye
Jacob Dineen
Zhaonan Li
Zhikun Xu
Weiyu Chen
...
Ji-Eun Irene Yum
Muhammad Ali Khan
Muhammad Umar Afzal
Irbaz B. Riaz
Ben Zhou
LM&MA
ELM
166
1
0
20 Oct 2025
AFRICAPTION: Establishing a New Paradigm for Image Captioning in African Languages
Mardiyyah Oduwole
Prince Mireku
Fatimo Adebanjo
Oluwatosin Olajide
Mahi Aminu Aliyu
Jekaterina Novikova
93
0
0
20 Oct 2025
DroneAudioset: An Audio Dataset for Drone-based Search and Rescue
Chitralekha Gupta
Soundarya Ramesh
P. Sasikumar
Kian Peen Yeo
Suranga Nanayakkara
90
0
0
17 Oct 2025
Iterative Topic Taxonomy Induction with LLMs: A Case Study of Electoral Advertising
Alexander Brady
Tunazzina Islam
72
0
0
16 Oct 2025
JEDA: Query-Free Clinical Order Search from Ambient Dialogues
Praphul Singh
Corey D Barrett
Sumana Srivasta
Amitabh Saikia
Irfan Bulu
Sri Gadde
Krishnaram Kenthapadi
118
0
0
16 Oct 2025
Predicting the Unpredictable: Reproducible BiLSTM Forecasting of Incident Counts in the Global Terrorism Database (GTD)
Oluwasegun Adegoke
AI4TS
104
0
0
16 Oct 2025
Machine Learning and Public Health: Identifying and Mitigating Algorithmic Bias through a Systematic Review
Sara Altamirano
Arjan Vreeken
Sennay Ghebreab
115
0
0
16 Oct 2025
The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models
Lukas Gienapp
Christopher Schröder
Stefan Schweter
Christopher Akiki
Ferdinand Schlatt
Arden Zimmermann
Phillipe Genêt
Martin Potthast
VLM
88
0
0
15 Oct 2025
Measuring What Matters: The AI Pluralism Index
Rashid Mushkani
52
0
0
09 Oct 2025
Lean Finder: Semantic Search for Mathlib That Understands User Intents
Jialin Lu
Kye Emond
Kaiyu Yang
Swarat Chaudhuri
Weiran Sun
Wuyang Chen
138
2
0
08 Oct 2025
COLE: a Comprehensive Benchmark for French Language Understanding Evaluation
David Beauchemin
Yan Tremblay
Mohamed Amine Youssef
Richard Khoury
ELM
248
1
0
06 Oct 2025
Accountability Capture: How Record-Keeping to Support AI Transparency and Accountability (Re)shapes Algorithmic Oversight
Shreya Chappidi
Jennifer Cobbe
Chris Norval
A. Mazumder
Jatinder Singh
96
1
0
06 Oct 2025
What is a protest anyway? Codebook conceptualization is still a first-order concern in LLM-era classification
Andrew Halterman
Katherine A. Keith
126
0
0
03 Oct 2025
Facilitating Cognitive Accessibility with LLMs: A Multi-Task Approach to Easy-to-Read Text Generation
François Ledoyen
Gaël Dias
Jeremie Pantin
Alexis Lechervy
Fabrice Maurel
Youssef Chahir
88
0
0
01 Oct 2025
On Explaining Proxy Discrimination and Unfairness in Individual Decisions Made by AI Systems
Belona Sonna
Alban Grastien
116
0
0
30 Sep 2025
RoBiologyDataChoiceQA: A Romanian Dataset for improving Biology understanding of Large Language Models
Dragos-Dumitru Ghinea
Adela-Nicoleta Corbeanu
Adrian-Marius Dumitran
96
0
0
30 Sep 2025
VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes
Paul Gavrikov
Wei Lin
Muhammad Jehanzeb Mirza
Soumya Jahagirdar
Muhammad Huzaifa
Sivan Doveh
Serena Yeung-Levy
James R. Glass
Hilde Kuehne
CoGe
179
1
0
29 Sep 2025
Fostering Robots: A Governance-First Conceptual Framework for Domestic, Curriculum-Based Trajectory Collection
Federico Pablo-Marti
Carlos Mir Fernandez
48
0
0
28 Sep 2025
Does AI Coaching Prepare us for Workplace Negotiations?
Veda Duddu
Jash Rajesh Parekh
Andy Mao
Hanyi Min
Ziang Xiao
V. D. Swain
Koustuv Saha
109
0
0
26 Sep 2025
WolBanking77: Wolof Banking Speech Intent Classification Dataset
Abdou Karim Kandji
Frédéric Precioso
Cheikh Ba
Samba Ndiaye
Augustin Ndione
201
0
0
23 Sep 2025
QUINTA: Reflexive Sensibility For Responsible AI Research and Data-Driven Processes
Alicia E. Boyd
40
1
0
19 Sep 2025
Assessing Historical Structural Oppression Worldwide via Rule-Guided Prompting of Large Language Models
Sreejato Chatterjee
Linh Tran
Quoc Duy Nguyen
Roni Kirson
Drue Hamlin
Harvest Aquino
Hanjia Lyu
Jiebo Luo
Timothy Dye
72
0
0
18 Sep 2025
Practitioners' Perspectives on a Differential Privacy Deployment Registry
Priyanka Nanayakkara
Elena Ghazi
Salil Vadhan
119
1
0
16 Sep 2025
Op-Fed: Opinion, Stance, and Monetary Policy Annotations on FOMC Transcripts Using Active Learning
Alisa Kanganis
Katherine A. Keith
117
0
0
16 Sep 2025
Standards in the Preparation of Biomedical Research Metadata: A Bridge2AI Perspective
Harry Caufield
Satrajit Ghosh
Sek Wong Kong
Jillian Parker
Nathan Sheffield
Bhavesh Patel
Andrew Williams
Timothy Clark
Monica C. Munoz-Torres
115
0
0
12 Sep 2025
MetaRAG: Metamorphic Testing for Hallucination Detection in RAG Systems
Channdeth Sok
David Luz
Yacine Haddam
HILM
294
0
0
11 Sep 2025
Are LLMs Enough for Hyperpartisan, Fake, Polarized and Harmful Content Detection? Evaluating In-Context Learning vs. Fine-Tuning
Michele Joshua Maggini
Dhia Merzougui
Rabiraj Bandyopadhyay
Gaël Dias
Fabrice Maurel
Pablo Gamallo
108
0
0
09 Sep 2025
MatPROV: A Provenance Graph Dataset of Material Synthesis Extracted from Scientific Literature
Hirofumi Tsuruta
Masaya Kumagai
157
0
0
01 Sep 2025
Who Owns The Robot?: Four Ethical and Socio-technical Questions about Wellbeing Robots in the Real World through Community Engagement
Minja Axelsson
Jiaee Cheong
Rune Nyrup
Hatice Gunes
150
2
0
01 Sep 2025
Deep opacity and AI: A threat to XAI and to privacy protection mechanisms
Vincent C. Müller
52
0
0
30 Aug 2025
Mapping Toxic Comments Across Demographics: A Dataset from German Public Broadcasting
Jan Fillies
Michael Peter Hoffmann
Rebecca Reichel
Roman Salzwedel
Sven Bodemer
Adrian Paschke
96
0
0
26 Aug 2025