1803.09010
Cited By
Datasheets for Datasets
23 March 2018
Timnit Gebru
Jamie Morgenstern
Briana Vecchione
Jennifer Wortman Vaughan
Hanna M. Wallach
Hal Daumé
Kate Crawford
Papers citing
"Datasheets for Datasets"
50 / 966 papers shown
Title
Metadata Representations for Queryable ML Model Zoos
Ziyu Li
Rihan Hai
A. Bozzon
Asterios Katsifodimos
6
2
0
19 Jul 2022
PiC: A Phrase-in-Context Dataset for Phrase Understanding and Semantic Search
Thang M. Pham
Seunghyun Yoon
Trung Bui
Anh Totti Nguyen
19
7
0
19 Jul 2022
Data Representativeness in Accessibility Datasets: A Meta-Analysis
Rie Kamikubo
Lining Wang
Crystal Marte
Amnah Mahmood
Hernisa Kacorri
30
18
0
16 Jul 2022
More Data Can Lead Us Astray: Active Data Acquisition in the Presence of Label Bias
Yunyi Li
Maria De-Arteaga
M. Saar-Tsechansky
FaML
19
3
0
15 Jul 2022
Leakage and the Reproducibility Crisis in ML-based Science
Sayash Kapoor
Arvind Narayanan
25
177
0
14 Jul 2022
Open High-Resolution Satellite Imagery: The WorldStrat Dataset -- With Application to Super-Resolution
Julien Cornebise
Ivan Orsolic
F. Kalaitzis
13
54
0
13 Jul 2022
Human-Centric Research for NLP: Towards a Definition and Guiding Questions
Bhushan Kotnis
Kiril Gashteovski
J. Gastinger
G. Serra
Francesco Alesiani
T. Sztyler
Ammar Shaker
Na Gong
Carolin (Haas) Lawrence
Zhao Xu
23
9
0
10 Jul 2022
The Harvard USPTO Patent Dataset: A Large-Scale, Well-Structured, and Multi-Purpose Corpus of Patent Applications
Mirac Suzgun
Luke Melas-Kyriazi
Suproteem K. Sarkar
S. Kominers
Stuart M. Shieber
38
24
0
08 Jul 2022
VeriDark: A Large-Scale Benchmark for Authorship Verification on the Dark Web
Andrei Manolache
Florin Brad
Antonio Bărbălău
Radu Tudor Ionescu
Marius Popescu
27
19
0
07 Jul 2022
Fairness and Bias in Robot Learning
Laura Londoño
Juana Valeria Hurtado
Nora Hertz
P. Kellmeyer
S. Voeneky
Abhinav Valada
FaML
21
9
0
07 Jul 2022
SC2EGSet: StarCraft II Esport Replay and Game-state Dataset
A. Białecki
N. Jakubowska
P. Dobrowolski
P. Białecki
Leszek Krupiński
Andrzej Szczap
R. Białecki
Jan Gajewski
16
10
0
07 Jul 2022
Towards Transparency in Dermatology Image Datasets with Skin Tone Annotations by Experts, Crowds, and an Algorithm
Matthew Groh
Caleb Harris
Roxana Daneshjou
Omar Badri
A. Koochek
9
40
0
06 Jul 2022
Towards the Use of Saliency Maps for Explaining Low-Quality Electrocardiograms to End Users
Ana Lucic
Sheeraz Ahmad
Amanda Furtado Brinhosa
Q. V. Liao
Himani Agrawal
Umang Bhatt
K. Kenthapadi
Alice Xiang
Maarten de Rijke
N. Drabowski
10
2
0
06 Jul 2022
A domain-specific language for describing machine learning datasets
Joan Giner-Miguelez
Abel Gómez
Jordi Cabot
ALM
11
25
0
05 Jul 2022
Identifying the Context Shift between Test Benchmarks and Production Data
Matthew Groh
OOD
10
8
0
03 Jul 2022
Shifts 2.0: Extending The Dataset of Real Distributional Shifts
A. Malinin
A. Athanasopoulos
M. Barakovic
Meritxell Bach Cuadra
Mark J. F. Gales
...
Francesco La Rosa
Eli Sivena
V. Tsarsitalidis
Efi Tsompopoulou
E. Volf
OOD
22
28
0
30 Jun 2022
A Validity Perspective on Evaluating the Justified Use of Data-driven Decision-making Algorithms
Amanda Coston
Anna Kawakami
Haiyi Zhu
Kenneth Holstein
Hoda Heidari
42
34
0
30 Jun 2022
Distilling Model Failures as Directions in Latent Space
Saachi Jain
Hannah Lawrence
Ankur Moitra
A. Madry
18
89
0
29 Jun 2022
Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models
Yang Trista Cao
Anna Sotnikova
Hal Daumé
Rachel Rudinger
L. Zou
15
46
0
23 Jun 2022
The ArtBench Dataset: Benchmarking Generative Models with Artworks
Peiyuan Liao
Xiuyu Li
Xihui Liu
Kurt Keutzer
17
47
0
22 Jun 2022
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
Sebastian Gehrmann
Abhik Bhattacharjee
Abinaya Mahendiran
Alex Jinpeng Wang
Alexandros Papangelis
...
Yacine Jernite
Yi Xu
Yisi Sang
Yixin Liu
Yufang Hou
47
38
0
22 Jun 2022
Then and Now: Quantifying the Longitudinal Validity of Self-Disclosed Depression Diagnoses
Keith Harrigian
Mark Dredze
17
3
0
22 Jun 2022
Multi-LexSum: Real-World Summaries of Civil Rights Lawsuits at Multiple Granularities
Zejiang Shen
Kyle Lo
L. Yu
N. Dahlberg
Margo Schlanger
Doug Downey
ELM
AILaw
29
43
0
22 Jun 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
107
1,061
0
22 Jun 2022
Bi-Calibration Networks for Weakly-Supervised Video Representation Learning
Fuchen Long
Ting Yao
Zhaofan Qiu
Xinmei Tian
Jiebo Luo
Tao Mei
30
6
0
21 Jun 2022
Interactive Visual Reasoning under Uncertainty
Manjie Xu
Guangyuan Jiang
Wei Liang
Song-Chun Zhu
Yixin Zhu
LRM
47
5
0
18 Jun 2022
Gender Artifacts in Visual Datasets
Nicole Meister
Dora Zhao
Angelina Wang
V. V. Ramaswamy
Ruth C. Fong
Olga Russakovsky
27
28
0
18 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
42
348
0
17 Jun 2022
Understanding Aesthetics with Language: A Photo Critique Dataset for Aesthetic Assessment
Daniel Vera Nieto
Luigi Celona
Clara Fernandez-Labrador
CoGe
36
11
0
17 Jun 2022
All the World's a (Hyper)Graph: A Data Drama
Corinna Coupette
Jilles Vreeken
Bastian Alexander Rieck
9
2
0
16 Jun 2022
Beyond Adult and COMPAS: Fairness in Multi-Class Prediction
Wael Alghamdi
Hsiang Hsu
Haewon Jeong
Hao Wang
P. Michalák
S. Asoodeh
Flavio du Pin Calmon
FaML
27
16
0
15 Jun 2022
Beyond Grounding: Extracting Fine-Grained Event Hierarchies Across Modalities
Hammad A. Ayyubi
Christopher Thomas
Lovish Chum
R. Lokesh
Long Chen
...
Xudong Lin
Xuande Feng
Jaywon Koo
Sounak Ray
Shih-Fu Chang
AI4TS
23
0
0
14 Jun 2022
ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
Matt Deitke
Eli VanderBilt
Alvaro Herrasti
Luca Weihs
Jordi Salvador
...
Winson Han
Eric Kolve
Ali Farhadi
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
33
235
0
14 Jun 2022
Look, Radiate, and Learn: Self-Supervised Localisation via Radio-Visual Correspondence
Mohammed Alloulah
Maximilian Arnold
SSL
21
2
0
13 Jun 2022
Don't "research fast and break things": On the ethics of Computational Social Science
David Leslie
9
4
0
12 Jun 2022
Smallset Timelines: A Visual Representation of Data Preprocessing Decisions
L. R. Lucchesi
Petra Kuhnert
Jenny L. Davis
Lexing Xie
14
10
0
10 Jun 2022
CrowdWorkSheets: Accounting for Individual and Collective Identities Underlying Crowdsourced Dataset Annotation
Mark Díaz
Ian D Kivlichan
Rachel Rosen
Dylan K. Baker
Razvan Amironesei
Vinodkumar Prabhakaran
Emily L. Denton
14
82
0
09 Jun 2022
XAudit: A Theoretical Look at Auditing with Explanations
Chhavi Yadav
Michal Moshkovitz
Kamalika Chaudhuri
XAI
FAtt
MLAU
27
3
0
09 Jun 2022
SCAMPS: Synthetics for Camera Measurement of Physiological Signals
Daniel J. McDuff
Miah Wander
Xin Liu
B. Hill
Javier Hernández
Jonathan Lester
T. Baltrušaitis
17
39
0
08 Jun 2022
Recall Distortion in Neural Network Pruning and the Undecayed Pruning Algorithm
Aidan Good
Jia-Huei Lin
Hannah Sieg
Mikey Ferguson
Xin Yu
Shandian Zhe
J. Wieczorek
Thiago Serra
23
11
0
07 Jun 2022
Saliency Cards: A Framework to Characterize and Compare Saliency Methods
Angie Boggust
Harini Suresh
Hendrik Strobelt
John Guttag
Arvindmani Satyanarayan
FAtt
XAI
30
8
0
07 Jun 2022
Understanding Machine Learning Practitioners' Data Documentation Perceptions, Needs, Challenges, and Desiderata
A. Heger
Elizabeth B. Marquis
Mihaela Vorvoreanu
Hanna M. Wallach
J. W. Vaughan
8
60
0
06 Jun 2022
The Algorithmic Imprint
Upol Ehsan
Ranjit Singh
Jacob Metcalf
Mark O. Riedl
FaML
26
31
0
03 Jun 2022
All That's Happening behind the Scenes: Putting the Spotlight on Volunteer Moderator Labor in Reddit
Hanlin Li
Brent J. Hecht
Stevie Chancellor
19
38
0
28 May 2022
Empathic Conversations: A Multi-level Dataset of Contextualized Conversations
Damilola Omitaomu
Shabnam Tafreshi
Tingting Liu
Sven Buechel
Chris Callison-Burch
J. Eichstaedt
Lyle Ungar
João Sedoc
41
48
0
25 May 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
55
5,774
0
23 May 2022
Heroes, Villains, and Victims, and GPT-3: Automated Extraction of Character Roles Without Training Data
Dominik Stammbach
Maria Antoniak
Elliott Ash
148
32
0
16 May 2022
Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits
Wesley Hanwen Deng
Manish Nagireddy
M. S. Lee
Jatinder Singh
Zhiwei Steven Wu
Kenneth Holstein
Haiyi Zhu
41
88
0
13 May 2022
How Platform-User Power Relations Shape Algorithmic Accountability: A Case Study of Instant Loan Platforms and Financially Stressed Users in India
Divya Ramesh
Vaishnav Kameswaran
Ding-wen Wang
Nithya Sambasivan
22
35
0
11 May 2022
Evaluation Gaps in Machine Learning Practice
Ben Hutchinson
Negar Rostamzadeh
Christina Greer
Katherine A. Heller
Vinodkumar Prabhakaran
ELM
28
56
0
11 May 2022