Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia

21 February 2024

Aaron L Halfaker

Kenneth Holstein

Haiyi Zhu

Papers citing "Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia"

DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection
Yingli Shen, Wen Lai, Shuo Wang, Xueren Zhang, Kangyang Luo, Alexander M. Fraser, Maosong Sun
17 Feb 2025

Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review
Rock Yuren Pang, Hope Schroeder, Kynnedy Simone Smith, Solon Barocas, Ziang Xiao, Emily Tseng, Danielle Bragg
22 Jan 2025

A Roadmap to Pluralistic Alignment
Taylor Sorensen, Jared Moore, Jillian R. Fisher, Mitchell L. Gordon, Niloofar Mireshghallah, ..., Liwei Jiang, Ximing Lu, Nouha Dziri, Tim Althoff, Yejin Choi
07 Feb 2024

Discovering and Validating AI Errors With Crowdsourced Failure Reports
Ángel Alexander Cabrera, Abraham J. Druck, Jason I. Hong, Adam Perer
23 Sep 2021

Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers
Kenny Peng, Arunesh Mathur, Arvind Narayanan
06 Aug 2021