AI Safety Gridworlds

27 November 2017
Jan Leike
Miljan Martic
Victoria Krakovna
Pedro A. Ortega
Tom Everitt
Andrew Lefrancq
Laurent Orseau
Shane Legg

Papers citing "AI Safety Gridworlds"

50 / 144 papers shown
Incorporating Deception into CyberBattleSim for Autonomous Defense
Erich Walter
Kimberly J. Ferguson-Walter
Ahmad Ridley
36
23
0
31 Aug 2021
Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning
Lukas Brunke
Melissa Greeff
Adam W. Hall
Zhaocong Yuan
Siqi Zhou
Jacopo Panerati
Angela P. Schoellig
OffRL
34
603
0
13 Aug 2021
Risk Averse Bayesian Reward Learning for Autonomous Navigation from Human Demonstration
Christian Ellis
Maggie B. Wigness
J. Rogers
Craig T. Lennon
L. Fiondella
90
6
0
31 Jul 2021
Taxonomy of Machine Learning Safety: A Survey and Primer
Sina Mohseni
Haotao Wang
Zhiding Yu
Chaowei Xiao
Zhangyang Wang
J. Yadawa
31
31
0
09 Jun 2021
There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning
Nathan Grinsztajn
Johan Ferret
Olivier Pietquin
Philippe Preux
M. Geist
SSL
34
14
0
08 Jun 2021
Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs
Harsh Satija
Philip S. Thomas
Joelle Pineau
Romain Laroche
OffRL
27
21
0
31 May 2021
Axes for Sociotechnical Inquiry in AI Research
Sarah Dean
T. Gilbert
Nathan Lambert
Tom Zick
25
12
0
26 Apr 2021
Understanding and Avoiding AI Failures: A Practical Guide
R. M. Williams
Roman V. Yampolskiy
35
24
0
22 Apr 2021
Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior
Md Sultan al Nahian
Spencer Frazier
Brent Harrison
Mark O. Riedl
27
18
0
19 Apr 2021
Alignment of Language Agents
Zachary Kenton
Tom Everitt
Laura Weidinger
Iason Gabriel
Vladimir Mikulik
G. Irving
30
158
0
26 Mar 2021
Causal Analysis of Agent Behavior for AI Safety
Grégoire Delétang
Jordi Grau-Moya
Miljan Martic
Tim Genewein
Tom McGrath
Vladimir Mikulik
M. Kunesch
Shane Legg
Pedro A. Ortega
CML
32
6
0
05 Mar 2021
How RL Agents Behave When Their Actions Are Modified
Eric D. Langlois
Tom Everitt
11
13
0
15 Feb 2021
Consequences of Misaligned AI
Simon Zhuang
Dylan Hadfield-Menell
14
71
0
07 Feb 2021
Value Alignment Verification
Daniel S. Brown
Jordan Jack Schneider
Anca D. Dragan
S. Niekum
32
31
0
02 Dec 2020
Inverse Constrained Reinforcement Learning
Usman Anwar
Shehryar Malik
Alireza Aghasi
Ali Ahmed
18
58
0
19 Nov 2020
REALab: An Embedded Perspective on Tampering
Ramana Kumar
J. Uesato
Richard Ngo
Tom Everitt
Victoria Krakovna
Shane Legg
30
10
0
17 Nov 2020
Model-based Reinforcement Learning from Signal Temporal Logic Specifications
Parv Kapoor
Anand Balakrishnan
Jyotirmoy V. Deshmukh
23
22
0
10 Nov 2020
Avoiding Side Effects By Considering Future Tasks
Victoria Krakovna
Laurent Orseau
Richard Ngo
Miljan Martic
Shane Legg
8
38
0
15 Oct 2020
Safety Aware Reinforcement Learning (SARL)
Santiago Miret
Somdeb Majumdar
Carroll L. Wainwright
12
1
0
06 Oct 2020
Trust-Region Method with Deep Reinforcement Learning in Analog Design Space Exploration
Kai-En Yang
Chia-Yu Tsai
Hung-Hao Shen
Chen-Feng Chiang
Feng-Ming Tsai
Chunguang Wang
Yiju Ting
Chia-Shun Yeh
C. Lai
19
13
0
29 Sep 2020
Hidden Incentives for Auto-Induced Distributional Shift
David M. Krueger
Tegan Maharaj
Jan Leike
13
49
0
19 Sep 2020
Deep Reinforcement Learning for Closed-Loop Blood Glucose Control
Ian Fox
Joyce M. Lee
R. Pop-Busui
Jenna Wiens
BDL
OffRL
30
50
0
18 Sep 2020
Constrained Markov Decision Processes via Backward Value Functions
Harsh Satija
P. Amortila
Joelle Pineau
34
51
0
26 Aug 2020
The Need for Advanced Intelligence in NFV Management and Orchestration
D. Manias
Abdallah Shami
25
24
0
03 Aug 2020
BabyAI 1.1
D. Y. Hui
Maxime Chevalier-Boisvert
Dzmitry Bahdanau
Yoshua Bengio
LLMAG
36
11
0
24 Jul 2020
AGI Agent Safety by Iteratively Improving the Utility Function
K. Holtman
AI4CE
6
8
0
10 Jul 2020
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?
Angelos Filos
P. Tigas
R. McAllister
Nicholas Rhinehart
Sergey Levine
Y. Gal
22
185
0
26 Jun 2020
Avoiding Side Effects in Complex Environments
Alexander Matt Turner
Neale Ratzlaff
Prasad Tadepalli
30
34
0
11 Jun 2020
Constrained episodic reinforcement learning in concave-convex and knapsack settings
Kianté Brantley
Miroslav Dudík
Thodoris Lykouris
Sobhan Miryoosefi
Max Simchowitz
Aleksandrs Slivkins
Wen Sun
OffRL
28
51
0
09 Jun 2020
AI Research Considerations for Human Existential Safety (ARCHES)
Andrew Critch
David M. Krueger
30
50
0
30 May 2020
Dynamic Cognition Applied to Value Learning in Artificial Intelligence
N. D. Oliveira
N. Corrêa
9
0
0
12 May 2020
RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments
Roberta Raileanu
Tim Rocktäschel
8
170
0
27 Feb 2020
TanksWorld: A Multi-Agent Environment for AI Safety Research
Corban G. Rivera
Olivia Lyons
Arielle Summitt
Ayman Fatima
J. Pak
...
R. Chalmers
Aryeh Englander
Edward W. Staley
I. Wang
Ashley J. Llorens
13
2
0
25 Feb 2020
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences
Daniel S. Brown
Russell Coleman
R. Srinivasan
S. Niekum
BDL
32
101
0
21 Feb 2020
AI safety: state of the field through quantitative lens
Mislav Juric
A. Sandic
Mario Brčič
25
24
0
12 Feb 2020
Adversarial Machine Learning -- Industry Perspectives
Ramnath Kumar
Magnus Nyström
J. Lambert
Andrew Marshall
Mario Goertzel
Andi Comissoneru
Matt Swann
Sharon Xia
AAML
SILM
29
232
0
04 Feb 2020
Constrained Upper Confidence Reinforcement Learning
Liyuan Zheng
Lillian J. Ratliff
33
67
0
26 Jan 2020
Practical Solutions for Machine Learning Safety in Autonomous Vehicles
Sina Mohseni
Mandar Pitale
Vasu Singh
Zhangyang Wang
33
67
0
20 Dec 2019
Relational Mimic for Visual Adversarial Imitation Learning
Lionel Blondé
Yichuan Tang
Jian Zhang
Russ Webb
36
0
0
18 Dec 2019
Safe Policies for Reinforcement Learning via Primal-Dual Methods
Santiago Paternain
Miguel Calvo-Fullana
Luiz F. O. Chamon
Alejandro Ribeiro
12
99
0
20 Nov 2019
Constrained Reinforcement Learning Has Zero Duality Gap
Santiago Paternain
Luiz F. O. Chamon
Miguel Calvo-Fullana
Alejandro Ribeiro
6
188
0
29 Oct 2019
Never Worse, Mostly Better: Stable Policy Improvement in Deep Reinforcement Learning
P. Khanna
Guy Tennenholtz
Nadav Merlis
Shie Mannor
Chen Tessler
OffRL
24
1
0
02 Oct 2019
Wield: Systematic Reinforcement Learning With Progressive Randomization
Michael Schaarschmidt
Kai Fricke
Eiko Yoneki
19
2
0
15 Sep 2019
Learning Transferable Domain Priors for Safe Exploration in Reinforcement Learning
Thommen George Karimpanal
Santu Rana
Sunil R. Gupta
T. Tran
Svetha Venkatesh
OffRL
OnRL
9
10
0
10 Sep 2019
Opponent Aware Reinforcement Learning
Víctor Gallego
Roi Naveiro
D. Insua
D. Gómez‐Ullate
19
7
0
22 Aug 2019
Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective
Tom Everitt
Marcus Hutter
Ramana Kumar
Victoria Krakovna
17
92
0
13 Aug 2019
Corrigibility with Utility Preservation
K. Holtman
KELM
6
8
0
05 Aug 2019
Google Research Football: A Novel Reinforcement Learning Environment
Karol Kurach
Anton Raichuk
Piotr Stańczyk
Michal Zajac
Olivier Bachem
...
C. Riquelme
Damien Vincent
Marcin Michalski
Olivier Bousquet
Sylvain Gelly
54
398
0
25 Jul 2019
The Role of Cooperation in Responsible AI Development
Amanda Askell
Miles Brundage
Gillian Hadfield
33
60
0
10 Jul 2019
Generalizing from a few environments in safety-critical reinforcement learning
Zachary Kenton
Angelos Filos
Owain Evans
Y. Gal
12
16
0
02 Jul 2019