ElicitationGPT: Text Elicitation Mechanisms via Language Models

13 June 2024

Yifan Wu

Jason D. Hartline


Main: 15 Pages

1 Figure

Bibliography: 3 Pages

8 Tables

Appendix: 17 Pages

Abstract

Scoring rules evaluate probabilistic forecasts of an unknown state against the realized state and are a fundamental building block in the incentivized elicitation of information and the training of machine learning models. This paper develops mechanisms for scoring elicited text against ground truth text using domain-knowledge-free queries to a large language model (specifically ChatGPT) and empirically evaluates their alignment with human preferences. The empirical evaluation is conducted on peer reviews from a peer-grading dataset, with the resulting scores compared to manual instructor scores for the same peer reviews.
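
To make the notion of a scoring rule concrete, here is a minimal sketch of the standard quadratic (Brier-style) scoring rule for a forecast over a finite set of states. It is an illustration of scoring rules in general and not the text-scoring mechanism developed in the paper; the function names `quadratic_score` and `expected_score` and the example states are assumptions made for this sketch.

```python
def quadratic_score(forecast, realized):
    """Quadratic score S(p, i) = 2*p_i - sum_j p_j^2 (higher is better).

    `forecast` maps each state to its reported probability; `realized`
    is the state that actually occurred. Names are illustrative only.
    """
    total = sum(forecast.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("forecast probabilities must sum to 1")
    return 2 * forecast.get(realized, 0.0) - sum(p * p for p in forecast.values())


def expected_score(report, belief):
    """Expected score of `report` when states are drawn from `belief`."""
    return sum(prob * quadratic_score(report, state) for state, prob in belief.items())


# Hypothetical two-state example: the forecaster believes rain has probability 0.7.
belief = {"rain": 0.7, "dry": 0.3}
exaggerated = {"rain": 0.9, "dry": 0.1}

truthful_value = expected_score(belief, belief)
exaggerated_value = expected_score(exaggerated, belief)
```

Under this rule, reporting the true belief gives a higher expected score than the exaggerated report, which is the incentive property that makes a scoring rule proper.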
