Designing for Human-Agent Alignment: Understanding what humans want from their agents
arXiv:2404.04289 · 4 April 2024
Nitesh Goyal, Minsuk Chang, Michael Terry

Papers citing "Designing for Human-Agent Alignment: Understanding what humans want from their agents" (5 of 5 papers shown)

AI Chatbots for Mental Health: Values and Harms from Lived Experiences of Depression
Dong Whi Yoo, Jiayue Melissa Shi, Violeta J. Rodriguez, Koustuv Saha
AI4MH · 48 · 0 · 0 · 26 Apr 2025

VeriLA: A Human-Centered Evaluation Framework for Interpretable Verification of LLM Agent Failures
Yoo Yeon Sung, H. Kim, Dan Zhang
58 · 1 · 0 · 16 Mar 2025

AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua, Yun Yvonna Li, Shiyi Yang, Chen Wang, Lina Yao
LM&MA · 34 · 12 · 0 · 06 Jul 2024

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou
LM&Ro · LRM · AI4CE · ReLM · 315 · 8,402 · 0 · 28 Jan 2022

AI safety via debate
G. Irving, Paul Christiano, Dario Amodei
199 · 199 · 0 · 02 May 2018