2209.02167
Cited By
Red Teaming with Mind Reading: White-Box Adversarial Policies Against RL Agents
5 September 2022
Stephen Casper
Taylor Killian
Gabriel Kreiman
Dylan Hadfield-Menell
AAML
Github (1★)
Papers citing "Red Teaming with Mind Reading: White-Box Adversarial Policies Against RL Agents"
1 / 1 papers shown
Black-Box Access is Insufficient for Rigorous AI Audits
Conference on Fairness, Accountability and Transparency (FAccT), 2024
Stephen Casper
Carson Ezell
Charlotte Siegmann
Noam Kolt
Taylor Lynn Curtis
...
Michael Gerovitch
David Bau
Max Tegmark
David M. Krueger
Dylan Hadfield-Menell
AAML
25 Jan 2024