Path-Specific Objectives for Safer Agent Incentives
Sebastian Farquhar, Ryan Carey, Tom Everitt
21 April 2022 · arXiv:2204.10018

Papers citing "Path-Specific Objectives for Safer Agent Incentives" (6 papers)
  • MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking. Sebastian Farquhar, Vikrant Varma, David Lindner, David Elson, Caleb Biddulph, Ian Goodfellow, Rohin Shah. 22 Jan 2025.
  • On Imperfect Recall in Multi-Agent Influence Diagrams. James Fox, Matt MacDermott, Lewis Hammond, Paul Harrenstein, Alessandro Abate, Michael Wooldridge. 11 Jul 2023.
  • Solutions to preference manipulation in recommender systems require knowledge of meta-preferences. Hal Ashton, Matija Franklin. 14 Sep 2022.
  • The Alignment Problem from a Deep Learning Perspective. Richard Ngo, Lawrence Chan, Sören Mindermann. 30 Aug 2022.
  • Counterfactual harm. Jonathan G. Richens, R. Beard, Daniel H. Thompson. 27 Apr 2022.
  • A Complete Criterion for Value of Information in Soluble Influence Diagrams. Chris van Merwijk, Ryan Carey, Tom Everitt. 23 Feb 2022.