Datasheets for Datasets

23 March 2018

Timnit Gebru

Jamie Morgenstern

Briana Vecchione

Jennifer Wortman Vaughan

Papers citing "Datasheets for Datasets"

50 / 1,069 papers shown

Societal Adaptation to Advanced AI

435

16 May 2024

Risks and Opportunities of Open-Source Generative AI

...

425

14 May 2024

BLIP: Facilitating the Exploration of Undesirable Consequences of Digital Technologies

201

10 May 2024

Automatic Generation of Model and Data Cards: A Step Towards Responsible AI

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

299

10 May 2024

The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

210

09 May 2024

Natural Language Processing RELIES on Linguistics

Computational Linguistics (CL), 2024

629

09 May 2024

Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames

Keith Burghardt

Kai Chen

Kristina Lerman

181

06 May 2024

Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows

Jasmine Y. Shih

Vishal Mohanty

Yannis Katsis

Hariharan Subramonyam

191

03 May 2024

Social Life Simulation for Non-Cognitive Skills Learning

Zihan Yan

Yaohong Xiang

Yun Huang

215

01 May 2024

Towards Scenario- and Capability-Driven Dataset Development and Evaluation: An Approach in the Context of Mapless Automated Driving

Felix Grün

Marcus Nolte

Markus Maurer

302

30 Apr 2024

OpenStreetView-5M: The Many Roads to Global Visual Geolocation

...

Loic Landrieu

208

29 Apr 2024

Benchmarking Benchmark Leakage in Large Language Models

253

29 Apr 2024

Mapping the Potential of Explainable AI for Fairness Along the AI Lifecycle

Niklas Kühl

409

29 Apr 2024

Lazy Data Practices Harm Fairness Research

Jan Simson

Alessandro Fabris

Christoph Kern

198

26 Apr 2024

Near to Mid-term Risks and Opportunities of Open-Source Generative AI

Francisco Eiras

Aleksandar Petrov

Bertie Vidgen

Christian Schroeder de Witt

Fabio Pizzati

...

Paul Röttger

291

25 Apr 2024

Inside the echo chamber: Linguistic underpinnings of misinformation on Twitter

Xinyu Wang

Jiayi Li

Sarah Rajtmajer

24 Apr 2024

Modeling the Sacred: Considerations when Using Religious Texts in Natural Language Processing

Ben Hutchinson

294

23 Apr 2024

Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them?

297

19 Apr 2024

AI Competitions and Benchmarks: Dataset Development

Romain Egele

Julio C. S. Jacques Junior

Jan N. van Rijn

173

15 Apr 2024

Laissez-Faire Harms: Algorithmic Biases in Generative Language Models

207

11 Apr 2024

Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in LLMs

AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2024

Ahmed A. Agiza

Mohamed Mostagir

Sherief Reda

186

10 Apr 2024

Racial/Ethnic Categories in AI and Algorithmic Fairness: Why They Matter and What They Represent

Conference on Fairness, Accountability and Transparency (FAccT), 2024

Jennifer Mickel

127

10 Apr 2024

[Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus

319

09 Apr 2024

Data Readiness for AI: A 360-Degree Survey

Kaveen Hiniduma

Suren Byna

J. L. Bez

186

08 Apr 2024

Concept -- An Evaluation Protocol on Conversational Recommender Systems with System-centric and User-centric Factors

410

04 Apr 2024

Responsible Reporting for Frontier AI Development

AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2024

Gillian K. Hadfield

268

03 Apr 2024

Will the Real Linda Please Stand up...to Large Language Models? Examining the Representativeness Heuristic in LLMs

292

01 Apr 2024

Designing a User-centric Framework for Information Quality Ranking of Large-scale Street View Images

Tahiya Chowdhury

Ilan Mandel

Jorge Ortiz

Wendy Ju

130

30 Mar 2024

FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures

Lena Maier-Hein

151

29 Mar 2024

Benchmarking Object Detectors with COCO: A New Path Forward

170

27 Mar 2024

Decoding the Digital Fine Print: Navigating the potholes in Terms of service/ use of GenAI tools against the emerging need for Transparent and Trustworthy Tech Futures

Sundaraparipurnan Narayanan

135

26 Mar 2024

Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships

Rangel Daroya

Aaron Sun

Subhransu Maji

348

25 Mar 2024

Reflecting the Male Gaze: Quantifying Female Objectification in 19th and 20th Century Novels

190

25 Mar 2024

"We Have No Idea How Models will Behave in Production until Production": How Engineers Operationalize Machine Learning

Shreya Shankar

Rolando Garcia

J. M. Hellerstein

Aditya G. Parameswaran

242

25 Mar 2024

InstaSynth: Opportunities and Challenges in Generating Synthetic Instagram Data with ChatGPT for Sponsored Content Detection

International Conference on Web and Social Media (ICWSM), 2024

211

22 Mar 2024

Dated Data: Tracing Knowledge Cutoffs in Large Language Models

Daniel Khashabi

Benjamin Van Durme

283

19 Mar 2024

From Melting Pots to Misrepresentations: Exploring Harms in Generative AI

Sanjana Gautam

Pranav Narayanan Venkit

Sourojit Ghosh

185

16 Mar 2024

Data Ethics Emergency Drill: A Toolbox for Discussing Responsible AI for Industry Teams

226

15 Mar 2024

Couler: Unified Machine Learning Workflow Optimization in Cloud

IEEE International Conference on Data Engineering (ICDE), 2024

170

12 Mar 2024

Elephants Never Forget: Testing Language Models for Memorization of Tabular Data

229

11 Mar 2024

CommitBench: A Benchmark for Commit Message Generation

IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER), 2024

Maximilian Schall

Tamara Czinczoll

Gerard de Melo

191

08 Mar 2024

Position: Insights from Survey Methodology can Improve Training Data

310

02 Mar 2024

Benchmarking zero-shot stance detection with FlanT5-XXL: Insights from training data, prompting, and decoding strategies into its near-SoTA performance

244

01 Mar 2024

Implications of Regulations on the Use of AI and Generative AI for Human-Centered Responsible Artificial Intelligence

Marios Constantinides

...

Ilana Golbin Blumenfeld

Giada Pistilli

164

29 Feb 2024

The Situate AI Guidebook: Co-Designing a Toolkit to Support Multi-Stakeholder Early-stage Deliberations Around Public Sector AI Proposals

Haiyi Zhu

233

29 Feb 2024

DANSK and DaCy 2.6.0: Domain Generalization of Danish Named Entity Recognition

Kenneth Enevoldsen

Fredrik Jørgensen

Morten H Baglini

202

28 Feb 2024

An Integrated Data Processing Framework for Pretraining Foundation Models

275

26 Feb 2024

Foundation Model Transparency Reports

256

26 Feb 2024

Towards Fair Graph Anomaly Detection: Problem, New Datasets, and Evaluation

298

25 Feb 2024

Farsight: Fostering Responsible AI Awareness During AI Application Prototyping

317

23 Feb 2024