Safety Cases: How to Justify the Safety of Advanced AI Systems
arXiv:2403.10462 · 15 March 2024
Joshua Clymer, Nick Gabrieli, David Krueger, Thomas Larsen
Papers citing "Safety Cases: How to Justify the Safety of Advanced AI Systems" (5 / 5 papers shown)
Reasoning Models Don't Always Say What They Think
Yanda Chen, Joe Benton, Ansh Radhakrishnan, Jonathan Uesato, Carson E. Denison, ..., Vlad Mikulik, Samuel R. Bowman, Jan Leike, Jared Kaplan, E. Perez
ReLM, LRM
08 May 2025
An alignment safety case sketch based on debate
Marie Davidsen Buhl, Jacob Pfau, Benjamin Hilton, Geoffrey Irving
06 May 2025
A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management
Simeon Campos, Henry Papadatos, Fabien Roger, Chloé Touzet, Malcolm Murray, Otter Quarks
20 Feb 2025
Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities
Zora Che, Stephen Casper, Robert Kirk, Anirudh Satheesh, Stewart Slocum, ..., Zikui Cai, Bilal Chughtai, Y. Gal, Furong Huang, Dylan Hadfield-Menell
MU, AAML, ELM
03 Feb 2025
Societal Adaptation to Advanced AI
Jamie Bernardi, Gabriel Mukobi, Hilary Greaves, Lennart Heim, Markus Anderljung
16 May 2024