When AI Gives Advice: Evaluating AI and Human Responses to Online Advice-Seeking for Well-Being

24 October 2025

Harsh Kumar

Jasmine Chahal

Yinuo Zhao

Zeling Zhang

Annika Wei

Louis Tay

Ashton Anderson

Main: 15 pages; Bibliography: 4 pages; Appendix: 1 page; 12 figures; 3 tables

Abstract

Seeking advice is a core human behavior that the internet has reinvented twice: first through forums and Q&A communities that crowdsource public guidance, and now through large language models (LLMs). Yet the quality of LLM advice for everyday well-being scenarios remains unclear. How does it compare, not only against human comments, but also against the wisdom of the online crowd? We ran two studies (N=210) in which experts compared top-voted Reddit advice with LLM-generated advice. LLMs ranked significantly higher overall and on effectiveness, warmth, and willingness to seek advice again. GPT-4o beat GPT-5 on all metrics except sycophancy, suggesting that benchmark gains need not improve advice-giving. In Study 2, we examined how human and algorithmic advice could be combined, and found that human advice can be unobtrusively polished to compete with AI-generated comments. We conclude with design implications for advice-giving agents and ecosystems blending AI, crowd input, and expert oversight.
