Endless Jailbreaks with Bijection Learning
Brian R. Y. Huang, Maximilian Li, Leonard Tang
2 October 2024 · arXiv:2410.01294 · AAML
Papers citing "Endless Jailbreaks with Bijection Learning"
A generative approach to LLM harmfulness detection with special red flag tokens
Sophie Xhonneux, David Dobre, Mehrnaz Mofakhami, Leo Schwinn, Gauthier Gidel
22 Feb 2025
Jailbreaking to Jailbreak
Jeremy Kritz, Vaughn Robinson, Robert Vacareanu, Bijan Varjavand, Michael Choi, Bobby Gogov, Scale Red Team, Summer Yue, Willow Primack, Zifan Wang
09 Feb 2025
Plentiful Jailbreaks with String Compositions
Brian R. Y. Huang
AAML
01 Nov 2024
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion
AAML
02 Apr 2024