Concrete Problems in AI Safety

21 June 2016
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané

Papers citing "Concrete Problems in AI Safety"

29 / 1,379 papers shown
Deep reinforcement learning from human preferences
Neural Information Processing Systems (NeurIPS), 2017
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
1.6K
4,461
0
12 Jun 2017
Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks
International Conference on Learning Representations (ICLR), 2017
Shiyu Liang
Shouqing Yang
R. Srikant
UQCV, OODD
1.1K
2,317
0
08 Jun 2017
Constrained Policy Optimization
International Conference on Machine Learning (ICML), 2017
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
1.4K
1,588
0
30 May 2017
Safe Model-based Reinforcement Learning with Stability Guarantees
Felix Berkenkamp
M. Turchetta
Angela P. Schoellig
Andreas Krause
554
930
0
23 May 2017
Reinforcement Learning with a Corrupted Reward Channel
Tom Everitt
Victoria Krakovna
Laurent Orseau
Marcus Hutter
Shane Legg
329
118
0
23 May 2017
Concrete Dropout
Y. Gal
Jiri Hron
Alex Kendall
BDL, UQCV
639
645
0
22 May 2017
Ensemble Adversarial Training: Attacks and Defenses
Florian Tramèr
Alexey Kurakin
Nicolas Papernot
Ian Goodfellow
Dan Boneh
Patrick McDaniel
AAML
510
2,944
0
19 May 2017
Repeated Inverse Reinforcement Learning
Kareem Amin
Nan Jiang
Satinder Singh
339
78
0
15 May 2017
Probabilistically Safe Policy Transfer
David Held
Zoe McCarthy
Michael Zhang
Fred Shentu
Pieter Abbeel
153
20
0
15 May 2017
Maximum Resilience of Artificial Neural Networks
Chih-Hong Cheng
Georg Nührenberg
Harald Ruess
AAML
395
298
0
28 Apr 2017
Google's Cloud Vision API Is Not Robust To Noise
Hossein Hosseini
Baicen Xiao
Radha Poovendran
AAML
179
129
0
16 Apr 2017
Enter the Matrix: Safely Interruptible Autonomous Systems via Virtualization
Mark O. Riedl
Brent Harrison
91
7
0
30 Mar 2017
Deceiving Google's Cloud Video Intelligence API Built for Summarizing Videos
Hossein Hosseini
Baicen Xiao
Radha Poovendran
AAML
110
18
0
26 Mar 2017
Blocking Transferability of Adversarial Examples in Black-Box Learning Systems
Hossein Hosseini
Yize Chen
Sreeram Kannan
Baosen Zhang
Radha Poovendran
AAML
156
111
0
13 Mar 2017
Dropout Inference in Bayesian Neural Networks with Alpha-divergences
Yingzhen Li
Y. Gal
UQCV, BDL
230
206
0
08 Mar 2017
Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez
Been Kim
XAI, FaML
701
4,555
0
28 Feb 2017
Strongly-Typed Agents are Guaranteed to Interact Safely
International Conference on Machine Learning (ICML), 2017
David Balduzzi
225
2
0
24 Feb 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL, VLM
959
1,742
0
25 Jan 2017
Stoic Ethics for Artificial Agents
Gabriel Murray
112
8
0
09 Jan 2017
Reinforcement Learning With Temporal Logic Rewards
Xiao Li
C. Vasile
C. Belta
248
246
0
11 Dec 2016
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV, BDL
1.5K
6,801
0
05 Dec 2016
Generalizing Skills with Semi-Supervised Reinforcement Learning
Chelsea Finn
Tianhe Yu
Justin Fu
Pieter Abbeel
Sergey Levine
OffRL, SSL
198
70
0
01 Dec 2016
On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems
Besmira Nushi
Ece Kamar
Eric Horvitz
Donald Kossmann
173
82
0
24 Nov 2016
Towards the Science of Security and Privacy in Machine Learning
Nicolas Papernot
Patrick McDaniel
Arunesh Sinha
Michael P. Wellman
AAML
235
493
0
11 Nov 2016
Artificial Intelligence Safety and Cybersecurity: a Timeline of AI Failures
Roman V. Yampolskiy
M. S. Spellchecker
127
96
0
25 Oct 2016
Safety Verification of Deep Neural Networks
Xiaowei Huang
Marta Kwiatkowska
Sen Wang
Min Wu
AAML
685
985
0
21 Oct 2016
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks
International Conference on Learning Representations (ICLR), 2016
Dan Hendrycks
Kevin Gimpel
UQCV
1.4K
3,921
0
07 Oct 2016
Learning Optimized Risk Scores
Berk Ustun
Cynthia Rudin
710
96
0
01 Oct 2016
Towards Verified Artificial Intelligence
Sanjit A. Seshia
Dorsa Sadigh
S. Shankar Sastry
234
204
0
27 Jun 2016