ResearchTrend.AI
Self-Modification of Policy and Utility Function in Rational Agents

10 May 2016
Tom Everitt
Daniel Filan
Mayank Daswani
Marcus Hutter

Papers citing "Self-Modification of Policy and Utility Function in Rational Agents"

11 / 11 papers shown
Reward Shaping to Mitigate Reward Hacking in RLHF
Jiayi Fu
Xuandong Zhao
Chengyuan Yao
Han Wang
Qi Han
Yanghua Xiao
202
14
0
26 Feb 2025
Towards shutdownable agents via stochastic choice
Elliott Thornley
Alexander Roman
Christos Ziakas
Leyton Ho
Louis Thomson
140
0
0
30 Jun 2024
Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective
Tom Everitt
Marcus Hutter
Ramana Kumar
Victoria Krakovna
105
97
0
13 Aug 2019
Corrigibility with Utility Preservation
K. Holtman
KELM
45
9
0
05 Aug 2019
Modeling AGI Safety Frameworks with Causal Influence Diagrams
Tom Everitt
Ramana Kumar
Victoria Krakovna
Shane Legg
AI4CE
67
22
0
20 Jun 2019
AGI Safety Literature Review
Tom Everitt
G. Lea
Marcus Hutter
AI4CE
86
116
0
03 May 2018
'Indifference' methods for managing agent rewards
Stuart Armstrong
Xavier O'Rourke
89
19
0
18 Dec 2017
AI Safety Gridworlds
Jan Leike
Miljan Martic
Victoria Krakovna
Pedro A. Ortega
Tom Everitt
Andrew Lefrancq
Laurent Orseau
Shane Legg
158
255
0
27 Nov 2017
Nonparametric General Reinforcement Learning
Jan Leike
OffRL
105
26
0
28 Nov 2016
Concrete Problems in AI Safety
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané
315
2,407
0
21 Jun 2016
Avoiding Wireheading with Value Reinforcement Learning
Tom Everitt
Marcus Hutter
AI4CE
129
44
0
10 May 2016