"Correct answers" from the psychology of artificial intelligence
Large Language Models have grown vastly in capabilities. One proposed application of such AI systems is to support data collection in the social and cognitive sciences, where perfect experimental control is currently unfeasible and the collection of large, representative datasets is generally expensive. In this paper, we re-replicate 14 studies from the Many Labs 2 replication project with OpenAI's text-davinci-003 model, colloquially known as GPT3.5. We collected responses from the default setting of GPT3.5 by inputting each study's survey as text. Among the eight studies we could analyse, our GPT sample replicated 37.5% of the original results as well as 37.5% of the Many Labs 2 results. Unexpectedly, we could not analyse the remaining six studies as we had planned in our pre-registration. This was because, in each of these six studies, GPT3.5 answered at least one of the survey questions (either a dependent variable or a condition variable) in an extremely predetermined way, an unexpected phenomenon we call the "correct answer" effect. Different runs of GPT3.5 answered nuanced questions probing political orientation, economic preference, judgement, and moral philosophy with zero or near-zero variation in responses, each time giving the supposedly "correct answer." For example, our survey questions found the default setting of GPT3.5 to almost always self-identify as a maximally strong conservative (99.6%, N=1,030), and to always be morally deontological in opposing the hypothetical pushing of a large man in front of an incoming trolley to save the lives of five people (100%, N=1,030). Since AI models of the future may be trained on much of the same data as GPT3.5, training data from which GPT3.5 may have learned its supposedly "correct answers," our results raise concerns that a hypothetical AI-led future may in certain ways be subject to a diminished diversity of thought.