
ElicitationGPT: Text Elicitation Mechanisms via Language Models

Main: 15 Pages
1 Figure
Bibliography: 3 Pages
8 Tables
Appendix: 17 Pages
Abstract

Scoring rules evaluate probabilistic forecasts of an unknown state against the realized state and are a fundamental building block in the incentivized elicitation of information. This paper develops mechanisms for scoring elicited text against ground-truth text by reducing the textual information elicitation problem to a forecast elicitation problem via domain-knowledge-free queries to a large language model (specifically ChatGPT), and empirically evaluates the alignment of these mechanisms with human preferences. Our theoretical analysis shows that the reduction achieves provable properness via black-box language models. The empirical evaluation is conducted on peer reviews from a peer-grading dataset and compares the mechanisms' scores with manual instructor scores for the same peer reviews. Our results suggest a paradigm of algorithmic artificial intelligence that may be useful for developing artificial intelligence technologies with provable guarantees.
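
To make the notion of properness concrete, the sketch below uses the standard quadratic (Brier-style) scoring rule on a two-state forecast. It is only an illustration of what a proper scoring rule is, not the text-scoring mechanism developed in the paper; the function names, the belief, and the misreport are hypothetical values chosen for the example.

```python
# Illustrative only: the classical quadratic scoring rule, not the paper's mechanism.
# All names and numbers below are assumptions made for this sketch.

def quadratic_score(forecast, realized):
    """Score a probability vector against the realized state index (higher is better)."""
    return 2.0 * forecast[realized] - sum(p * p for p in forecast)

def expected_score(report, belief):
    """Expected score of `report` when the state is drawn from the forecaster's `belief`."""
    return sum(belief[s] * quadratic_score(report, s) for s in range(len(belief)))

belief = [0.7, 0.3]      # forecaster's true belief over two states
exaggerated = [0.9, 0.1]  # a hypothetical overconfident misreport

truthful = expected_score(belief, belief)
misreport = expected_score(exaggerated, belief)

# Properness: reporting the true belief maximizes the expected score.
print(truthful > misreport)
```

A scoring rule is proper when no misreport has a higher expected score than the truthful report under the forecaster's own belief, which is the property the comparison above checks for one pair of reports.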
