arXiv:2011.08820

REALab: An Embedded Perspective on Tampering

17 November 2020

Victoria Krakovna

Papers citing "REALab: An Embedded Perspective on Tampering"

7 / 7 papers shown

Solving math word problems with process- and outcome-based feedback

Antonia Creswell

FaML ReLM AIMat LRM

425

640

0

25 Nov 2022

Defining and Characterizing Reward Hacking

Nikolaus H. R. Howe

Dmitrii Krasheninnikov

David M. Krueger

498

113

0

27 Sep 2022

Is Power-Seeking AI an Existential Risk?

Joseph Carlsmith

237

138

0

16 Jun 2022

Estimating and Penalizing Induced Preference Shifts in Recommender Systems

International Conference on Machine Learning (ICML), 2022

Stuart J. Russell

Dylan Hadfield-Menell

399

49

0

25 Apr 2022

Safe Deep RL in 3D Environments using Human Feedback

262

6

0

20 Jan 2022

On the Expressivity of Markov Reward

Neural Information Processing Systems (NeurIPS), 2021

Anna Harutyunyan

Michael L. Littman

293

98

0

01 Nov 2021

Counterfactual Planning in AGI Systems

182

4

0

29 Jan 2021
