Concrete Problems in AI Safety

21 June 2016
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané
arXiv:1606.06565

Papers citing "Concrete Problems in AI Safety"

50 / 1,371 papers shown
Stability-certified reinforcement learning: A control-theoretic perspective
Ming Jin
Javad Lavaei
107
97
0
26 Oct 2018
The Faults in Our Pi Stars: Security Issues and Open Challenges in Deep Reinforcement Learning
Vahid Behzadan
Arslan Munir
141
27
0
23 Oct 2018
Safe Reinforcement Learning with Model Uncertainty Estimates
Björn Lütjens
Michael Everett
Jonathan P. How
212
184
0
19 Oct 2018
Supervising strong learners by amplifying weak experts
Paul Christiano
Buck Shlegeris
Dario Amodei
117
147
0
19 Oct 2018
Multiparty Dynamics and Failure Modes for Machine Learning and Artificial Intelligence
David Manheim
92
26
0
16 Oct 2018
Deep Reinforcement Learning
Yuxi Li
VLM OffRL
295
144
0
15 Oct 2018
Semi-supervised Deep Reinforcement Learning in Support of IoT and Smart City Services
M. Mohammadi
Ala I. Al-Fuqaha
Mohsen Guizani
Jun-Seok Oh
OffRL HAI
182
364
0
09 Oct 2018
Scenic: A Language for Scenario Specification and Scene Generation
ACM-SIGPLAN Symposium on Programming Language Design and Implementation (PLDI), 2018
Daniel J. Fremont
T. Dreossi
Shromona Ghosh
Xiangyu Yue
Alberto L. Sangiovanni-Vincentelli
Sanjit A. Seshia
214
286
0
25 Sep 2018
Interpretable Multi-Objective Reinforcement Learning through Policy Orchestration
Ritesh Noothigattu
Djallel Bouneffouf
Nicholas Mattei
Rachita Chandra
Piyush Madan
Kush R. Varshney
Murray Campbell
Moninder Singh
F. Rossi
AI4CE
160
24
0
21 Sep 2018
Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure
Besmira Nushi
Ece Kamar
Eric Horvitz
257
146
0
19 Sep 2018
Automata Guided Reinforcement Learning With Demonstrations
Xiao Li
Yao Ma
C. Belta
OffRL
155
12
0
17 Sep 2018
Deep Network Uncertainty Maps for Indoor Navigation
Francesco Verdoja
Jens Lundell
Ville Kyrki
UQCV
131
23
0
13 Sep 2018
Active Inverse Reward Design
Sören Mindermann
Rohin Shah
Adam Gleave
Dylan Hadfield-Menell
177
20
0
09 Sep 2018
Emergence of Human-comparable Balancing Behaviors by Deep Reinforcement Learning
Chuanyu Yang
Taku Komura
Zhibin Li
107
20
0
06 Sep 2018
A Roadmap for Robust End-to-End Alignment
L. Hoang
186
1
0
04 Sep 2018
Out-of-Distribution Detection using Multiple Semantic Label Representations
Gabi Shalev
Yossi Adi
Joseph Keshet
OODD
187
90
0
20 Aug 2018
Using Machine Learning Safely in Automotive Software: An Assessment and Adaption of Software Process Requirements in ISO 26262
Rick Salay
Krzysztof Czarnecki
178
72
0
05 Aug 2018
The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamical Systems
Spencer M. Richards
Felix Berkenkamp
Andreas Krause
140
250
0
02 Aug 2018
Multi-Agent Generative Adversarial Imitation Learning
Jiaming Song
Hongyu Ren
Dorsa Sadigh
Stefano Ermon
GAN
175
243
0
26 Jul 2018
EnsembleDAgger: A Bayesian Approach to Safe Imitation Learning
Kunal Menda
Katherine Driggs-Campbell
Mykel J. Kochenderfer
309
136
0
22 Jul 2018
Safe Option-Critic: Learning Safety in the Option-Critic Architecture
Arushi Jain
Khimya Khetarpal
Doina Precup
192
28
0
21 Jul 2018
Foundations for Restraining Bolts: Reinforcement Learning with LTLf/LDLf restraining specifications
International Conference on Automated Planning and Scheduling (ICAPS), 2018
Giuseppe De Giacomo
Luca Iocchi
Marco Favorito
F. Patrizi
OffRL
229
131
0
17 Jul 2018
Preference-Based Monte Carlo Tree Search
Deutsche Jahrestagung für Künstliche Intelligenz (KI), 2018
Tobias Joppen
J. Dietrich
Johannes Fürnkranz
LRM
78
4
0
17 Jul 2018
Safe Reinforcement Learning via Probabilistic Shields
N. Jansen
Bettina Könighofer
Sebastian Junges
A. Serban
Roderick Bloem
119
12
0
16 Jul 2018
A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks
Kimin Lee
Kibok Lee
Honglak Lee
Jinwoo Shin
OODD
432
2,327
0
10 Jul 2018
A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics
Roel Dobbe
Sarah Dean
T. Gilbert
Nitin Kohli
143
44
0
02 Jul 2018
Leveraging Uncertainty Estimates for Predicting Segmentation Quality
Terrance DeVries
Graham W. Taylor
UQCV
185
119
0
02 Jul 2018
Learning to Drive in a Day
IEEE International Conference on Robotics and Automation (ICRA), 2018
Alex Kendall
Jeffrey Hawke
David Janz
Przemyslaw Mazur
Daniele Reda
John M. Allen
Vinh-Dieu Lam
Alex Bewley
Amar Shah
248
728
0
01 Jul 2018
Modeling Friends and Foes
Pedro A. Ortega
Shane Legg
AAML
143
3
0
30 Jun 2018
The Virtuous Machine - Old Ethics for New Technology?
Nicolas Berberich
Klaus Diepold
40
23
0
27 Jun 2018
Interpretable to Whom? A Role-based Model for Analyzing Interpretable Machine Learning Systems
Richard J. Tomsett
Dave Braines
Daniel Harborne
Alun D. Preece
Supriyo Chakraborty
FaML
206
182
0
20 Jun 2018
Combining Model-Free Q-Ensembles and Model-Based Approaches for Informed Exploration
Sreecharan Sankaranarayanan
Raghuram Mandyam Annasamy
Katia Sycara
Carolyn Rose
96
0
0
12 Jun 2018
An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning
Dhruv Malik
Malayandi Palaniappan
J. F. Fisac
Dylan Hadfield-Menell
Stuart J. Russell
Anca Dragan
123
34
0
11 Jun 2018
POTs: Protective Optimization Technologies
B. Kulynych
R. Overdorf
Carmela Troncoso
Seda F. Gürses
303
98
0
07 Jun 2018
Simplifying Reward Design through Divide-and-Conquer
Ellis Ratner
Dylan Hadfield-Menell
Anca Dragan
145
30
0
07 Jun 2018
Penalizing side effects using stepwise relative reachability
Victoria Krakovna
Laurent Orseau
Ramana Kumar
Miljan Martic
Shane Legg
223
58
0
04 Jun 2018
Learning convex bounds for linear quadratic control policy synthesis
Jack Umenberger
Thomas B. Schön
167
12
0
01 Jun 2018
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning
Kelvin Xu
Ellis Ratner
Anca Dragan
Sergey Levine
Chelsea Finn
234
66
0
31 May 2018
To Trust Or Not To Trust A Classifier
Heinrich Jiang
Been Kim
Melody Y. Guan
Maya R. Gupta
UQCV
413
492
0
30 May 2018
Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition
Justin Fu
Avi Singh
Dibya Ghosh
Larry Yang
Sergey Levine
BDL
265
129
0
29 May 2018
Learning Safe Policies with Expert Guidance
Je-chun Huang
Fa Wu
Doina Precup
Yang Cai
160
26
0
21 May 2018
A Lyapunov-based Approach to Safe Reinforcement Learning
Yinlam Chow
Ofir Nachum
Edgar A. Duéñez-Guzmán
Mohammad Ghavamzadeh
275
560
0
20 May 2018
Reinforced Imitation: Sample Efficient Deep Reinforcement Learning for Map-less Navigation by Leveraging Prior Demonstrations
Mark Pfeiffer
Samarth Shukla
M. Turchetta
Cesar Cadena
Andreas Krause
Roland Siegwart
Juan I. Nieto
139
170
0
18 May 2018
Reachability Analysis of Deep Neural Networks with Provable Guarantees
Wenjie Ruan
Xiaowei Huang
Marta Kwiatkowska
AAML
177
278
0
06 May 2018
AGI Safety Literature Review
Tom Everitt
G. Lea
Marcus Hutter
AI4CE
154
124
0
03 May 2018
AI safety via debate
G. Irving
Paul Christiano
Dario Amodei
407
294
0
02 May 2018
On Learning Intrinsic Rewards for Policy Gradient Methods
Zeyu Zheng
Junhyuk Oh
Satinder Singh
214
219
0
17 Apr 2018
FPR -- Fast Path Risk Algorithm to Evaluate Collision Probability
A. Blake
Alejandro Bordallo
Kamen Brestnichki
Majd Hawasly
Svetlin Penkov
S. Ramamoorthy
Alexandre Silva
85
6
0
15 Apr 2018
Incomplete Contracting and AI Alignment
Dylan Hadfield-Menell
Gillian Hadfield
158
97
0
12 Apr 2018
Toward Intelligent Vehicular Networks: A Machine Learning Framework
Le Liang
Hao Ye
Geoffrey Ye Li
130
219
0
01 Apr 2018