ResearchTrend.AI
Avoiding Wireheading with Value Reinforcement Learning


10 May 2016
Tom Everitt
Marcus Hutter
    AI4CE

Papers citing "Avoiding Wireheading with Value Reinforcement Learning"

14 / 14 papers shown
MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking
Sebastian Farquhar
Vikrant Varma
David Lindner
David Elson
Caleb Biddulph
Ian Goodfellow
Rohin Shah
178
2
0
22 Jan 2025
Identifiability and generalizability from multiple experts in Inverse Reinforcement Learning
Paul Rolland
Luca Viano
Norman Schuerhoff
Boris Nikolov
Volkan Cevher
OffRL
81
14
0
22 Sep 2022
Morality, Machines and the Interpretation Problem: A Value-based, Wittgensteinian Approach to Building Moral Agents
C. Badea
Gregory Artus
106
9
0
03 Mar 2021
REALab: An Embedded Perspective on Tampering
Ramana Kumar
J. Uesato
Richard Ngo
Tom Everitt
Victoria Krakovna
Shane Legg
72
10
0
17 Nov 2020
Positive-Unlabeled Reward Learning
Danfei Xu
Misha Denil
82
38
0
01 Nov 2019
Rethinking Formal Models of Partially Observable Multiagent Decision Making
Vojtěch Kovařík
Martin Schmid
Neil Burch
Michael Bowling
Viliam Lisý
OffRL
144
56
0
26 Jun 2019
Categorizing Wireheading in Partially Embedded Agents
Arushi G. K. Majha
Sayan Sarkar
Davide Zagami
36
3
0
21 Jun 2019
Imitation Learning from Imperfect Demonstration
Yueh-hua Wu
Nontawat Charoenphakdee
Han Bao
Voot Tangkaratt
Masashi Sugiyama
73
162
0
27 Jan 2019
Emergence of Addictive Behaviors in Reinforcement Learning Agents
Vahid Behzadan
Roman V. Yampolskiy
Arslan Munir
30
5
0
14 Nov 2018
The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities
Joel Lehman
Jeff Clune
D. Misevic
C. Adami
L. Altenberg
...
Danesh Tarapore
S. Thibault
Westley Weimer
R. Watson
Jason Yosinski
177
282
0
09 Mar 2018
Occam's razor is insufficient to infer the preferences of irrational agents
Stuart Armstrong
Sören Mindermann
102
93
0
15 Dec 2017
Nonparametric General Reinforcement Learning
Jan Leike
OffRL
105
26
0
28 Nov 2016
Concrete Problems in AI Safety
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané
315
2,407
0
21 Jun 2016
Self-Modification of Policy and Utility Function in Rational Agents
Tom Everitt
Daniel Filan
Mayank Daswani
Marcus Hutter
77
29
0
10 May 2016