
Concrete Problems in AI Safety

21 June 2016

Papers citing "Concrete Problems in AI Safety"

Showing 29 of 1,379 papers

Deep reinforcement learning from human preferences

Neural Information Processing Systems (NeurIPS), 2017

1.6K

4,461

12 Jun 2017

Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks

International Conference on Learning Representations (ICLR), 2017

1.1K

2,317

08 Jun 2017

Constrained Policy Optimization

International Conference on Machine Learning (ICML), 2017

Joshua Achiam

David Held

Aviv Tamar

Pieter Abbeel

1.4K

1,588

30 May 2017

Safe Model-based Reinforcement Learning with Stability Guarantees

554

930

23 May 2017

Reinforcement Learning with a Corrupted Reward Channel

329

118

23 May 2017

Alex Kendall

639

645

22 May 2017

Ensemble Adversarial Training: Attacks and Defenses

Dan Boneh

510

2,944

19 May 2017

Repeated Inverse Reinforcement Learning

Kareem Amin

Nan Jiang

Satinder Singh

339

15 May 2017

Probabilistically Safe Policy Transfer

Pieter Abbeel

153

15 May 2017

Maximum Resilience of Artificial Neural Networks

395

298

28 Apr 2017

Google's Cloud Vision API Is Not Robust To Noise

179

129

16 Apr 2017

Enter the Matrix: Safely Interruptible Autonomous Systems via Virtualization

Mark O. Riedl

Brent Harrison

30 Mar 2017

Deceiving Google's Cloud Video Intelligence API Built for Summarizing Videos

110

26 Mar 2017

Blocking Transferability of Adversarial Examples in Black-Box Learning Systems

156

111

13 Mar 2017

Dropout Inference in Bayesian Neural Networks with Alpha-divergences

Yingzhen Li

Y. Gal

UQCV BDL

230

206

08 Mar 2017

Towards A Rigorous Science of Interpretable Machine Learning

Finale Doshi-Velez

Been Kim

XAI FaML

701

4,555

28 Feb 2017

Strongly-Typed Agents are Guaranteed to Interact Safely

International Conference on Machine Learning (ICML), 2017

David Balduzzi

225

24 Feb 2017

Deep Reinforcement Learning: An Overview

Yuxi Li

OffRL VLM

959

1,742

25 Jan 2017

Stoic Ethics for Artificial Agents

Gabriel Murray

112

09 Jan 2017

Reinforcement Learning With Temporal Logic Rewards

Xiao Li

C. Vasile

C. Belta

248

246

11 Dec 2016

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

Balaji Lakshminarayanan

Alexander Pritzel

Charles Blundell

UQCV BDL

1.5K

6,801

05 Dec 2016

Generalizing Skills with Semi-Supervised Reinforcement Learning

Pieter Abbeel

198

01 Dec 2016

On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems

173

24 Nov 2016

Towards the Science of Security and Privacy in Machine Learning

235

493

11 Nov 2016

Artificial Intelligence Safety and Cybersecurity: a Timeline of AI Failures

Roman V. Yampolskiy

M. S. Spellchecker

127

25 Oct 2016

Safety Verification of Deep Neural Networks

Min Wu

685

985

21 Oct 2016

A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks

International Conference on Learning Representations (ICLR), 2016

Dan Hendrycks

Kevin Gimpel

UQCV

1.4K

3,921

07 Oct 2016

Learning Optimized Risk Scores

Berk Ustun

Cynthia Rudin

710

01 Oct 2016

Towards Verified Artificial Intelligence

Sanjit A. Seshia

Dorsa Sadigh

S. Shankar Sastry

234

204

27 Jun 2016