Datasheets for Datasets

23 March 2018

Timnit Gebru

Jamie Morgenstern

Briana Vecchione

Jennifer Wortman Vaughan

Papers citing "Datasheets for Datasets"

50 / 1,069 papers shown

Exploring Data Pipelines through the Process Lens: a Reference Model for Computer Vision

Agathe Balayn

B. Kulynych

S. Guerses

168

05 Jul 2021

Ethics Sheets for AI Tasks

Saif M. Mohammad

306

02 Jul 2021

An Information Retrieval Approach to Building Datasets for Hate Speech Detection

Md. Mustafizur Rahman

Dinesh Balakrishnan

Dhiraj Murthy

Mucahid Kutlu

Matthew Lease

305

17 Jun 2021

Modeling Worlds in Text

Prithviraj Ammanabrolu

Mark O. Riedl

VGen LM&Ro

140

17 Jun 2021

Understanding and Evaluating Racial Biases in Image Captioning

Dora Zhao

Angelina Wang

Olga Russakovsky

287

159

16 Jun 2021

Physion: Evaluating Physical Prediction from Vision in Humans and Machines

...

Li Fei-Fei

Nancy Kanwisher

538

116

15 Jun 2021

CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark

Ningyu Zhang

Mosha Chen

Zhen Bi

Xiaozhuan Liang

Lei Li

...

437

220

15 Jun 2021

Simon Says: Evaluating and Mitigating Bias in Pruned Neural Networks with Knowledge Distillation

Cody Blakeney

Nathaniel Huish

Yan Yan

Ziliang Zong

119

15 Jun 2021

A Discussion on Building Practical NLP Leaderboards: The Case of Machine Translation

Sebastin Santy

Prasanta Bhattacharya

LLMAG

281

11 Jun 2021

Hard Choices in Artificial Intelligence

Artificial Intelligence (AI), 2021

Roel Dobbe

T. Gilbert

Yonatan Dov Mintz

151

10 Jun 2021

BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling

Andrea Madotto

255

05 Jun 2021

MERLOT: Multimodal Neural Script Knowledge Models

Neural Information Processing Systems (NeurIPS), 2021

Yejin Choi

348

428

04 Jun 2021

Annotation Curricula to Implicitly Train Non-Expert Annotators

Computational Linguistics (CL), 2021

Ji-Ung Lee

Jan-Christoph Klie

Iryna Gurevych

210

04 Jun 2021

The Contestation of Tech Ethics: A Sociotechnical Approach to Technology Ethics in Practice

Journal of Social Computing (JSC), 2021

Benson K. Green

AILaw

104

03 Jun 2021

Know Your Model (KYM): Increasing Trust in AI and Machine Learning

190

31 May 2021

Changing the World by Changing the Data

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Anna Rogers

224

28 May 2021

Towards Knowledge Organization Ecosystems

Mayukh Bagchi

23 May 2021

Measuring Coding Challenge Competence With APPS

...

1.2K

910

20 May 2021

KLUE: Korean Language Understanding Evaluation

...

469

220

20 May 2021

Conversational AI Systems for Social Good: Opportunities and Challenges

235

13 May 2021

Feature Interactions on Steroids: On the Composition of ML Models

Jane Hsieh

Eunsuk Kang

S. Apel

142

13 May 2021

Providing Assurance and Scrutability on Shared Data and Machine Learning Models with Verifiable Credentials

Concurrency and Computation (CCPE), 2021

157

13 May 2021

Addressing "Documentation Debt" in Machine Learning Research: A Retrospective Datasheet for BookCorpus

Jack Bandy

Nicholas Vincent

147

11 May 2021

e-ViL: A Dataset and Benchmark for Natural Language Explanations in Vision-Language Tasks

IEEE International Conference on Computer Vision (ICCV), 2021

347

108

08 May 2021

What's in the Box? A Preliminary Analysis of Undesirable Content in the Common Crawl Corpus

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

A. Luccioni

J. Viviano

365

136

06 May 2021

Reliability Testing for Natural Language Processing Systems

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

323

06 May 2021

An Examination of Fairness of AI Models for Deepfake Detection

International Joint Conference on Artificial Intelligence (IJCAI), 2021

Loc Trinh

Wenshu Fan

CVBM

214

02 May 2021

SegmentMeIfYouCan: A Benchmark for Anomaly Segmentation

Robin Shing Moon Chan

Roland Siegwart

260

167

30 Apr 2021

Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

Dirk Groeneveld

313

562

18 Apr 2021

Frequency-based Distortions in Contextualized Word Embeddings

Kaitlyn Zhou

Kawin Ethayarajh

Dan Jurafsky

140

17 Apr 2021

Concadia: Towards Image-Based Text Generation with a Purpose

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

227

16 Apr 2021

Semantic maps and metrics for science using deep transformer encoders

Brendan Chambers

James A. Evans

MedIm

175

13 Apr 2021

XFORMAL: A Benchmark for Multilingual Formality Style Transfer

North American Chapter of the Association for Computational Linguistics (NAACL), 2021

Eleftheria Briakou

Di Lu

Ke Zhang

Joel R. Tetreault

185

08 Apr 2021

ORBIT: A Real-World Few-Shot Dataset for Teachable Object Recognition

IEEE International Conference on Computer Vision (ICCV), 2021

Matthew Tobias Harris

457

08 Apr 2021

Question-Driven Design Process for Explainable AI User Experiences

343

08 Apr 2021

The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions

...

441

06 Apr 2021

AI4D -- African Language Program

Vukosi Marivate

...

142

06 Apr 2021

What Will it Take to Fix Benchmarking in Natural Language Understanding?

North American Chapter of the Association for Computational Linguistics (NAACL), 2021

Samuel R. Bowman

George E. Dahl

ELM ALM

271

188

05 Apr 2021

Visual Semantic Role Labeling for Video Understanding

Computer Vision and Pattern Recognition (CVPR), 2021

290

02 Apr 2021

Towards An Ethics-Audit Bot

Siani Pearson

Martin Lloyd

Vivek Nallur

29 Mar 2021

Automation: An Essential Component Of Ethical AI?

Vivek Nallur

Martin Lloyd

Siani Pearson

29 Mar 2021

A Multistakeholder Approach Towards Evaluating AI Transparency Mechanisms

Ankur Taly

Maarten de Rijke

123

27 Mar 2021

Characterizing and Detecting Mismatch in Machine-Learning-Enabled Systems

Workshop on AI Engineering - Software Engineering for AI (ESEA), 2021

Grace A. Lewis

S. Bellomo

Ipek Ozkaya

134

25 Mar 2021

Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets

Transactions of the Association for Computational Linguistics (TACL), 2021

...

426

310

22 Mar 2021

#PraCegoVer: A Large Dataset for Image Captioning in Portuguese

International Conference on Data Technologies and Applications (DATA), 2021

G. O. D. Santos

Esther Luna Colombini

Sandra Avila

208

21 Mar 2021

The Human Evaluation Datasheet 1.0: A Template for Recording Details of Human Evaluation Experiments in NLP

Anastasia Shimorina

Anya Belz

138

17 Mar 2021

Preregistering NLP Research

North American Chapter of the Association for Computational Linguistics (NAACL), 2021

217

11 Mar 2021

Designing Disaggregated Evaluations of AI Systems: Choices, Considerations, and Tradeoffs

AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2021

Meredith Ringel Morris

Jennifer Wortman Vaughan

Duncan Wadsworth

Hanna M. Wallach

163

10 Mar 2021

Rissanen Data Analysis: Examining Dataset Characteristics via Description Length

International Conference on Machine Learning (ICML), 2021

Ethan Perez

Douwe Kiela

Dong Wang

202

05 Mar 2021

A framework for fostering transparency in shared artificial intelligence models by increasing visibility of contributions

Concurrency and Computation (CCPE), 2020

105

05 Mar 2021