Agent Incentives: A Causal Perspective

2 February 2021

Eric D. Langlois

Pedro A. Ortega

Papers citing "Agent Incentives: A Causal Perspective"

Title | Authors | Tags | Counts | Date
Why human-AI relationships need socioaffective alignment | Hannah Rose Kirk, Iason Gabriel, Chris Summerfield, Bertie Vidgen, Scott A. Hale | | 51 6 0 | 04 Feb 2025
MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking | Sebastian Farquhar, Vikrant Varma, David Lindner, David Elson, Caleb Biddulph, Ian Goodfellow, Rohin Shah | | 96 1 0 | 22 Jan 2025
Designing Fiduciary Artificial Intelligence | Sebastian Benthall, David Shekman | | 53 4 0 | 27 Jul 2023
On Imperfect Recall in Multi-Agent Influence Diagrams | James Fox, Matt MacDermott, Lewis Hammond, Paul Harrenstein, Alessandro Abate, Michael Wooldridge | | 34 3 0 | 11 Jul 2023
Unfair Utilities and First Steps Towards Improving Them | Frederik Hytting Jorgensen, S. Weichwald, J. Peters | FaML | 61 0 0 | 01 Jun 2023
Model evaluation for extreme risks | Toby Shevlane, Sebastian Farquhar, Ben Garfinkel, Mary Phuong, Jess Whittlestone, ..., Vijay Bolina, Jack Clark, Yoshua Bengio, Paul Christiano, Allan Dafoe | ELM | 51 152 0 | 24 May 2023
Reasoning about Causality in Games | Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, Michael Wooldridge | LRM, AI4CE | 14 15 0 | 05 Jan 2023
Solutions to preference manipulation in recommender systems require knowledge of meta-preferences | Hal Ashton, Matija Franklin | | 18 5 0 | 14 Sep 2022
Discovering Agents | Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan G. Richens, Matt MacDermott, Tom Everitt | CML | 57 31 0 | 17 Aug 2022
Scoring Rules for Performative Binary Prediction | Alan Chan | | 31 1 0 | 05 Jul 2022
Preference Change in Persuasive Robotics | Matija Franklin, Hal Ashton | | 13 1 0 | 21 Jun 2022
Active learning of causal probability trees | Tue Herlau | CML | 20 0 0 | 17 May 2022
A Complete Criterion for Value of Information in Soluble Influence Diagrams | Chris van Merwijk, Ryan Carey, Tom Everitt | | 26 5 0 | 23 Feb 2022
Truthful AI: Developing and governing AI that does not lie | Owain Evans, Owen Cotton-Barratt, Lukas Finnveden, Adam Bales, Avital Balwit, Peter Wills, Luca Righetti, William Saunders | HILM | 238 111 0 | 13 Oct 2021
Intelligence and Unambitiousness Using Algorithmic Information Theory | Michael K. Cohen, Badri N. Vellambi, Marcus Hutter | | 16 2 0 | 13 May 2021