Concrete Problems in AI Safety

21 June 2016
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané

Papers citing "Concrete Problems in AI Safety"

25 / 1,375 papers shown
Reinforcement Learning with a Corrupted Reward Channel
Tom Everitt
Victoria Krakovna
Laurent Orseau
Marcus Hutter
Shane Legg
294
116
0
23 May 2017
Concrete Dropout
Y. Gal
Jiri Hron
Alex Kendall
BDL, UQCV
561
638
0
22 May 2017
Ensemble Adversarial Training: Attacks and Defenses
Florian Tramèr
Alexey Kurakin
Nicolas Papernot
Ian Goodfellow
Dan Boneh
Patrick McDaniel
AAML
430
2,924
0
19 May 2017
Repeated Inverse Reinforcement Learning
Kareem Amin
Nan Jiang
Satinder Singh
299
78
0
15 May 2017
Probabilistically Safe Policy Transfer
David Held
Zoe McCarthy
Michael Zhang
Fred Shentu
Pieter Abbeel
145
20
0
15 May 2017
Maximum Resilience of Artificial Neural Networks
Chih-Hong Cheng
Georg Nührenberg
Harald Ruess
AAML
377
296
0
28 Apr 2017
Google's Cloud Vision API Is Not Robust To Noise
Hossein Hosseini
Baicen Xiao
Radha Poovendran
AAML
162
128
0
16 Apr 2017
Enter the Matrix: Safely Interruptible Autonomous Systems via Virtualization
Mark O. Riedl
Brent Harrison
91
7
0
30 Mar 2017
Deceiving Google's Cloud Video Intelligence API Built for Summarizing Videos
Hossein Hosseini
Baicen Xiao
Radha Poovendran
AAML
104
18
0
26 Mar 2017
Blocking Transferability of Adversarial Examples in Black-Box Learning Systems
Hossein Hosseini
Yize Chen
Sreeram Kannan
Baosen Zhang
Radha Poovendran
AAML
147
110
0
13 Mar 2017
Dropout Inference in Bayesian Neural Networks with Alpha-divergences
Yingzhen Li
Y. Gal
UQCV, BDL
213
204
0
08 Mar 2017
Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez
Been Kim
XAI, FaML
657
4,450
0
28 Feb 2017
Strongly-Typed Agents are Guaranteed to Interact Safely
International Conference on Machine Learning (ICML), 2017
David Balduzzi
180
2
0
24 Feb 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL, VLM
835
1,727
0
25 Jan 2017
Stoic Ethics for Artificial Agents
Gabriel Murray
104
8
0
09 Jan 2017
Reinforcement Learning With Temporal Logic Rewards
Xiao Li
C. Vasile
C. Belta
221
243
0
11 Dec 2016
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV, BDL
1.4K
6,694
0
05 Dec 2016
Generalizing Skills with Semi-Supervised Reinforcement Learning
Chelsea Finn
Tianhe Yu
Justin Fu
Pieter Abbeel
Sergey Levine
OffRL, SSL
182
70
0
01 Dec 2016
On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems
Besmira Nushi
Ece Kamar
Eric Horvitz
Donald Kossmann
165
82
0
24 Nov 2016
Towards the Science of Security and Privacy in Machine Learning
Nicolas Papernot
Patrick McDaniel
Arunesh Sinha
Michael P. Wellman
AAML
204
489
0
11 Nov 2016
Artificial Intelligence Safety and Cybersecurity: a Timeline of AI Failures
Roman V. Yampolskiy
M. S. Spellchecker
119
93
0
25 Oct 2016
Safety Verification of Deep Neural Networks
Xiaowei Huang
Marta Kwiatkowska
Sen Wang
Min Wu
AAML
629
978
0
21 Oct 2016
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks
International Conference on Learning Representations (ICLR), 2016
Dan Hendrycks
Kevin Gimpel
UQCV
1.4K
3,888
0
07 Oct 2016
Learning Optimized Risk Scores
Berk Ustun
Cynthia Rudin
651
95
0
01 Oct 2016
Towards Verified Artificial Intelligence
Sanjit A. Seshia
Dorsa Sadigh
S. Shankar Sastry
214
204
0
27 Jun 2016