2410.03857
You Know What I'm Saying: Jailbreak Attack via Implicit Reference
4 October 2024
Tianyu Wu, Lingrui Mei, Ruibin Yuan, Lujun Li, Wei Xue, Yike Guo
Papers citing "You Know What I'm Saying: Jailbreak Attack via Implicit Reference"
2 / 2 papers shown
Title
RED QUEEN: Safeguarding Large Language Models against Concealed Multi-Turn Jailbreaking
Yifan Jiang, Kriti Aggarwal, Tanmay Laud, Kashif Munir, Jay Pujara, Subhabrata Mukherjee
AAML
38
10
0
26 Sep 2024
PathSeeker: Exploring LLM Security Vulnerabilities with a Reinforcement Learning-Based Jailbreak Approach
Zhihao Lin, Wei Ma, Mingyi Zhou, Yanjie Zhao, Haoyu Wang, Yang Liu, Jun Wang, Li Li
AAML
30
5
0
21 Sep 2024