ResearchTrend.AI

Red Teaming with Mind Reading: White-Box Adversarial Policies Against RL Agents

5 September 2022
Stephen Casper, Taylor Killian, Gabriel Kreiman, Dylan Hadfield-Menell
AAML
arXiv: 2209.02167

Papers citing "Red Teaming with Mind Reading: White-Box Adversarial Policies Against RL Agents"

2 / 2 papers shown
Black-Box Access is Insufficient for Rigorous AI Audits
Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, ..., Michael Gerovitch, David Bau, Max Tegmark, David M. Krueger, Dylan Hadfield-Menell
AAML
13
76
0
25 Jan 2024
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu, Yu Cheng, Zhe Gan, S. Sun, Tom Goldstein, Jingjing Liu
AAML
221
436
0
25 Sep 2019