2407.09447
Cited By
v5 (latest)
ASTPrompter: Preference-Aligned Automated Language Model Red-Teaming to Generate Low-Perplexity Unsafe Prompts
12 July 2024
Amelia F. Hardy
Houjun Liu
Bernard Lange
Mykel J. Kochenderfer
AAML
Papers citing "ASTPrompter: Preference-Aligned Automated Language Model Red-Teaming to Generate Low-Perplexity Unsafe Prompts"
1 / 1 papers shown
Characterizing the Robustness of Black-Box LLM Planners Under Perturbed Observations with Adaptive Stress Testing
Neeloy Chakraborty
John Pohovey
Melkior Ornik
Katherine Driggs-Campbell
08 May 2025