
AI reasoning effort predicts human decision time in content moderation

Main: 6 pages · Bibliography: 4 pages · Appendix: 7 pages · 6 figures · 5 tables
Abstract

Large language models can now generate intermediate reasoning steps before producing answers, improving performance on difficult problems by iteratively developing solutions. This study uses a content moderation task to examine parallels between human decision times and model reasoning effort, measured as the length of the chain-of-thought (CoT). Across three frontier models, CoT length consistently predicts human decision time. Moreover, even with important covariates held constant, humans took longer to decide and models produced longer CoTs on harder items, suggesting a shared sensitivity to task difficulty. Analyses of CoT content show that models reference various contextual factors more frequently when making these harder decisions. These findings reveal parallels between human and AI reasoning on a practical task and underscore the potential of reasoning traces for enhancing interpretability and decision-making.
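
To make the core analysis concrete, here is a minimal sketch of regressing human decision time on CoT length. This is a hypothetical reconstruction, not the paper's actual pipeline: the file moderation_items.csv and the columns decision_time and cot_tokens are assumptions introduced for illustration.

```python
# Hypothetical sketch: does CoT length predict human decision time?
# Data file and column names are assumptions, not from the paper.
import pandas as pd
import statsmodels.formula.api as smf

# Each row is one moderation item, with the human annotator's decision
# time (seconds) and the model's chain-of-thought length (tokens).
df = pd.read_csv("moderation_items.csv")  # hypothetical file

# Simple linear regression of decision time on CoT length.
model = smf.ols("decision_time ~ cot_tokens", data=df).fit()
print(model.summary())  # slope and fit statistics for the association
```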
