Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability

2 June 2025
Genta Indra Winata
David Anugraha
Emmy Liu
Alham Fikri Aji
Shou-Yi Hung
Aditya Parashar
Patrick Amadeus Irawan
Ruochen Zhang
Zheng-Xin Yong
Jan Christian Blaise Cruz
Niklas Muennighoff
Seungone Kim
Hanyang Zhao
Sudipta Kar
Kezia Erina Suryoraharjo
Muhammad Farid Adilazuarda
En-Shiun Annie Lee
Ayu Purwarianti
Derry Wijaya
Monojit Choudhury
Main: 9 pages · 20 figures · 1 table · Bibliography: 3 pages · Appendix: 32 pages
Abstract

High-quality datasets are fundamental to training and evaluating machine learning models, yet their creation, especially with accurate human annotations, remains a significant challenge. Many dataset paper submissions lack originality, diversity, or rigorous quality control, and these shortcomings are often overlooked during peer review. Submissions also frequently omit essential details about dataset construction and properties. While existing tools such as datasheets aim to promote transparency, they are largely descriptive and do not provide standardized, measurable methods for evaluating data quality. Similarly, metadata requirements at conferences promote accountability but are inconsistently enforced. To address these limitations, this position paper advocates for the integration of systematic, rubric-based evaluation metrics into the dataset review process, particularly as submission volumes continue to grow. We also explore scalable, cost-effective methods for synthetic data generation, including dedicated tools and LLM-as-a-judge approaches, to support more efficient evaluation. As a call to action, we introduce DataRubrics, a structured framework for assessing the quality of both human- and model-generated datasets. Leveraging recent advances in LLM-based evaluation, DataRubrics offers a reproducible, scalable, and actionable solution for dataset quality assessment, enabling both authors and reviewers to uphold higher standards in data-centric research. We also release code to support reproducibility of LLM-based evaluations at this https URL.
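The abstract does not spell out how a rubric-based LLM-as-a-judge evaluation is wired together, so the following is a minimal, hypothetical Python sketch of the general idea. The rubric criteria, the prompt wording, the 1-5 scale, and the call_llm() placeholder are illustrative assumptions; they are not the criteria, prompts, or API used by DataRubrics or its released code.

# Minimal, hypothetical sketch of rubric-based LLM-as-a-judge dataset scoring.
# The rubric criteria, prompt wording, 1-5 scale, and call_llm() placeholder
# are illustrative assumptions, not the DataRubrics framework itself.
from dataclasses import dataclass

RUBRIC = {
    "originality": "Does the dataset offer a novel task, domain, or language coverage?",
    "diversity": "Are sources, annotators, and examples sufficiently varied?",
    "quality_control": "Is there evidence of annotation guidelines, agreement checks, or audits?",
    "documentation": "Are construction steps, licenses, and limitations clearly reported?",
}

@dataclass
class CriterionScore:
    criterion: str
    score: int        # assumed 1-5 scale
    rationale: str

def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion client; returns a fixed reply here."""
    return "3: placeholder rationale"

def judge_dataset(dataset_card: str) -> list[CriterionScore]:
    """Score a dataset's documentation against each rubric criterion."""
    scores = []
    for name, question in RUBRIC.items():
        prompt = (
            "You are reviewing a dataset submission.\n"
            f"Criterion: {question}\n"
            "Reply as '<score 1-5>: <one-sentence rationale>'.\n\n"
            f"Dataset documentation:\n{dataset_card}"
        )
        reply = call_llm(prompt)
        score_text, _, rationale = reply.partition(":")
        scores.append(CriterionScore(name, int(score_text.strip()), rationale.strip()))
    return scores

if __name__ == "__main__":
    for s in judge_dataset("Example dataset card text..."):
        print(f"{s.criterion}: {s.score} ({s.rationale})")

The point of the sketch is only that each rubric criterion yields a structured score plus rationale, which is what makes such judgments reproducible and auditable across submissions; the actual criteria and prompts should be taken from the paper's released code.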

View on arXiv
@article{winata2025_2506.01789,
  title={Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability},
  author={Genta Indra Winata and David Anugraha and Emmy Liu and Alham Fikri Aji and Shou-Yi Hung and Aditya Parashar and Patrick Amadeus Irawan and Ruochen Zhang and Zheng-Xin Yong and Jan Christian Blaise Cruz and Niklas Muennighoff and Seungone Kim and Hanyang Zhao and Sudipta Kar and Kezia Erina Suryoraharjo and M. Farid Adilazuarda and En-Shiun Annie Lee and Ayu Purwarianti and Derry Tanti Wijaya and Monojit Choudhury},
  journal={arXiv preprint arXiv:2506.01789},
  year={2025}
}