Datasheets for Datasets

23 March 2018

Timnit Gebru

Jamie Morgenstern

Briana Vecchione

Jennifer Wortman Vaughan

Papers citing "Datasheets for Datasets"

50 / 1,069 papers shown

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Neural Information Processing Systems (NeurIPS), 2023

376

338

27 Jun 2023

Use case cards: a use case reporting framework inspired by the European AI Act

Ethics and Information Technology (EIT), 2023

Isabelle Hupont

David Fernández Llorca

S. Baldassarri

Emilia Gómez

154

23 Jun 2023

Critical-Reflective Human-AI Collaboration: Exploring Computational Tools for Art Historical Image Retrieval

Katrin Glinka

Claudia Müller-Birn

22 Jun 2023

Realistic Synthetic Financial Transactions for Anti-Money Laundering Models

Neural Information Processing Systems (NeurIPS), 2023

Erik Altman

Jovan Blanuša

Luc von Niederhäusern

Béni Egressy

Andreea Anghel

Kubilay Atasu

346

22 Jun 2023

Towards Regulatable AI Systems: Technical Gaps and Policy Opportunities

Finale Doshi-Velez

334

22 Jun 2023

VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution

Neural Information Processing Systems (NeurIPS), 2023

Aleksandar Shtedritski

Hannah Rose Kirk

CoGe

377

21 Jun 2023

An Overview of Catastrophic AI Risks

600

247

21 Jun 2023

Benchmarking the Influence of Pre-training on Explanation Performance in MR Image Classification

180

21 Jun 2023

Event Stream GPT: A Data Pre-processing and Modeling Library for Generative, Pre-trained Transformers over Continuous-time Sequences of Complex Events

Neural Information Processing Systems (NeurIPS), 2023

Matthew B. A. McDermott

372

20 Jun 2023

Quilt-1M: One Million Image-Text Pairs for Histopathology

Neural Information Processing Systems (NeurIPS), 2023

Wisdom O. Ikezogwo

M. S. Seyfioglu

Fatemeh Ghezloo

Dylan Stefan Chan Geva

Fatwir Sheikh Mohammed

736

196

20 Jun 2023

CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification

IEEE Transactions on Big Data (IEEE Trans. Big Data), 2023

Le-le Cao

Vilhelm von Ehrenheim

Mark Granroth-Wilding

Richard Anselmo Stahl

Andrew McCornack

Armin Catovic

Dhiana Deva Cavalcanti Rocha

303

18 Jun 2023

The Importance of Human-Labeled Data in the Era of LLMs

International Joint Conference on Artificial Intelligence (IJCAI), 2023

Yang Liu

ALM

239

18 Jun 2023

Reproducibility in NLP: What Have We Learned from the Checklist?

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Ian H. Magnusson

Noah A. Smith

Jesse Dodge

170

16 Jun 2023

STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events

Neural Information Processing Systems (NeurIPS), 2023

Kazuki Shimada

Archontis Politis

Parthasaarathy Sudarsanam

...

273

15 Jun 2023

Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion

International Conference on Machine Learning (ICML), 2023

Cheston Tan

276

15 Jun 2023

LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting

Neural Information Processing Systems (NeurIPS), 2023

Xu Liu

Bryan Hooi

Roger Zimmermann

AI4TS

180

146

14 Jun 2023

V-LoL: A Diagnostic Dataset for Visual Logical Learning

302

13 Jun 2023

Unraveling the Interconnected Axes of Heterogeneity in Machine Learning for Democratic and Inclusive Advancements

Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO), 2023

Maryam Molamohammadi

Afaf Taik

Nicolas Le Roux

G. Farnadi

194

11 Jun 2023

Evaluating the Social Impact of Generative AI Systems in Systems and Society

...

486

150

09 Jun 2023

AircraftVerse: A Large-Scale Multimodal Dataset of Aerial Vehicle Designs

Neural Information Processing Systems (NeurIPS), 2023

...

176

08 Jun 2023

Explainable Predictive Maintenance

...

213

08 Jun 2023

MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos

Computer Vision and Pattern Recognition (CVPR), 2023

...

Ding Zhao

231

07 Jun 2023

Art and the science of generative AI: A deeper dive

Science (Science), 2023

...

271

494

07 Jun 2023

Applying Standards to Advance Upstream & Downstream Ethics in Large Language Models

Jose Berengueres

Marybeth Sandell

181

06 Jun 2023

AVIDa-hIL6: A Large-Scale VHH Dataset Produced from an Immunized Alpaca for Predicting Antigen-Antibody Interactions

Neural Information Processing Systems (NeurIPS), 2023

...

115

06 Jun 2023

AHA!: Facilitating AI Impact Assessment by Generating Examples of Harms

205

05 Jun 2023

NLPositionality: Characterizing Design Biases of Datasets and Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

317

106

02 Jun 2023

AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap

Q. V. Liao

J. Vaughan

318

222

02 Jun 2023

Multilingual Conceptual Coverage in Text-to-Image Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Michael Stephen Saxon

William Yang Wang

EGVM

140

02 Jun 2023

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only

Guilherme Penedo

Quentin Malartic

Daniel Hesslow

Ruxandra-Aimée Cojocaru

422

881

01 Jun 2023

Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness?

167

28 May 2023

Optimization's Neglected Normative Commitments

Conference on Fairness, Accountability and Transparency (FAccT), 2023

218

27 May 2023

On Degrees of Freedom in Defining and Testing Natural Language Understanding

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Saku Sugawara

S. Tsugita

ELM

326

24 May 2023

TalkUp: Paving the Way for Understanding Empowering Language

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

254

23 May 2023

PaLM 2 Technical Report

...

678

1,406

17 May 2023

ConvXAI: Delivering Heterogeneous AI Explanations via Conversations to Support Human-AI Scientific Writing

Hua Shen

Huang Chieh-Yang

Tongshuang Wu

Ting-Hao 'Kenneth' Huang

457

16 May 2023

It Takes Two to Tango: Navigating Conceptualizations of NLP Tasks and Measurements of Performance

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

205

15 May 2023

DATED: Guidelines for Creating Synthetic Datasets for Engineering Design Applications

Design Automation Conference (DAC), 2023

Cyril Picard

Jürg Schiffmann

Faez Ahmed

167

15 May 2023

PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Ashok Urlana

185

15 May 2023

Certification Labels for Trustworthy AI: Insights From an Empirical Mixed-Method Study

Conference on Fairness, Accountability and Transparency (FAccT), 2023

199

15 May 2023

What's the Meaning of Superhuman Performance in Today's NLU?

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Daniel Hershcovich

...

ELM LM&MA VLM ReLM LRM

309

15 May 2023

The Ethics of AI in Games

IEEE Transactions on Affective Computing (IEEE Trans. Affective Comput.), 2023

Georgios N. Yannakakis

155

12 May 2023

Vārta: A Large-Scale Headline-Generation Dataset for Indic Languages

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

270

10 May 2023

When Do Neural Nets Outperform Boosted Trees on Tabular Data?

Neural Information Processing Systems (NeurIPS), 2023

Ganesh Ramakrishnan

305

248

04 May 2023

AutoML-GPT: Automatic Machine Learning with GPT

317

04 May 2023

Judgment Sieve: Reducing Uncertainty in Group Judgments through Interventions Targeting Ambiguity versus Disagreement

Quan Ze Chen

Amy X. Zhang

199

02 May 2023

SoK: Log Based Transparency Enhancing Technologies

A. Hicks

200

02 May 2023

Racial Bias within Face Recognition: A Survey

ACM Computing Surveys (ACM Comput. Surv.), 2023

Noura Al Moubayed

235

01 May 2023

Generating Process-Centric Explanations to Enable Contestability in Algorithmic Decision-Making: Challenges and Opportunities

Mireia Yurrita

Agathe Balayn

U. Gadiraju

183

01 May 2023

Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

591

163

28 Apr 2023