2408.05284
Can a Bayesian Oracle Prevent Harm from an Agent?
9 August 2024
Yoshua Bengio
Michael K. Cohen
Nikolay Malkin
Matt MacDermott
Damiano Fornasiere
Pietro Greiner
Younesse Kaddar
Papers citing "Can a Bayesian Oracle Prevent Harm from an Agent?"
5 / 5 papers shown
Title
Shh, don't say that! Domain Certification in LLMs
Cornelius Emde
Alasdair Paren
Preetham Arvind
Maxime Kayser
Tom Rainforth
Thomas Lukasiewicz
Bernard Ghanem
Philip H. S. Torr
Adel Bibi
45
1
0
26 Feb 2025
Can Safety Fine-Tuning Be More Principled? Lessons Learned from Cybersecurity
David Williams-King
Linh Le
Adam Oberman
Yoshua Bengio
AAML
46
0
0
19 Jan 2025
Amortizing intractable inference in diffusion models for vision, language, and control
S. Venkatraman
Moksh Jain
Luca Scimeca
Minsu Kim
Marcin Sendera
...
Alexandre Adam
Jarrid Rector-Brooks
Yoshua Bengio
Glen Berseth
Nikolay Malkin
57
24
0
31 May 2024
Improved off-policy training of diffusion samplers
Marcin Sendera
Minsu Kim
Sarthak Mittal
Pablo Lemos
Luca Scimeca
Jarrid Rector-Brooks
Alexandre Adam
Yoshua Bengio
Nikolay Malkin
OffRL
59
16
0
07 Feb 2024
A detailed treatment of Doob's theorem
Jeffrey W. Miller
24
28
0
09 Jan 2018