2305.19861
Cited By
Human Control: Definitions and Algorithms
31 May 2023
Ryan Carey
Tom Everitt
Papers citing
"Human Control: Definitions and Algorithms"
6 / 6 papers shown
Title
Towards shutdownable agents via stochastic choice
Elliott Thornley
Alexander Roman
Christos Ziakas
Leyton Ho
Louis Thomson
35
0
0
30 Jun 2024
Open-Endedness is Essential for Artificial Superhuman Intelligence
Edward Hughes
Michael Dennis
Jack Parker-Holder
Feryal M. P. Behbahani
Aditi Mavalankar
Yuge Shi
Tom Schaul
Tim Rocktäschel
LRM
32
18
0
06 Jun 2024
Risks from Language Models for Automated Mental Healthcare: Ethics and Structure for Implementation
D. Grabb
Max Lamparth
N. Vasan
40
14
0
02 Apr 2024
Visibility into AI Agents
Alan Chan
Carson Ezell
Max Kaufmann
K. Wei
Lewis Hammond
...
Nitarshan Rajkumar
David M. Krueger
Noam Kolt
Lennart Heim
Markus Anderljung
13
31
0
23 Jan 2024
Quantifying stability of non-power-seeking in artificial agents
Evan Ryan Gunter
Yevgeny Liokumovich
Victoria Krakovna
23
1
0
07 Jan 2024
AI safety via debate
G. Irving
Paul Christiano
Dario Amodei
199
199
0
02 May 2018