
Beyond Preferences in AI Alignment. Philosophical Studies (Philos. Stud.), 2024
Human Control: Definitions and Algorithms. Conference on Uncertainty in Artificial Intelligence (UAI), 2023
AGI Agent Safety by Iteratively Improving the Utility Function. Artificial General Intelligence (AGI), 2020