RED QUEEN: Safeguarding Large Language Models against Concealed Multi-Turn Jailbreaking
arXiv:2409.17458 · 26 September 2024
Yifan Jiang, Kriti Aggarwal, Tanmay Laud, Kashif Munir, Jay Pujara, Subhabrata Mukherjee
Tags: AAML
Papers citing "RED QUEEN: Safeguarding Large Language Models against Concealed Multi-Turn Jailbreaking" (9 of 9 papers shown)
Strategize Globally, Adapt Locally: A Multi-Turn Red Teaming Agent with Dual-Level Learning
S. Chen, Xiao Yu, Ninareh Mehrabi, Rahul Gupta, Zhou Yu, Ruoxi Jia
02 Apr 2025 · Tags: AAML, LLMAG
Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search
Andy Zhou
13 Mar 2025 · Tags: MU
Building Safe GenAI Applications: An End-to-End Overview of Red Teaming for Large Language Models
Alberto Purpura, Sahil Wadhwa, Jesse Zymet, Akshay Gupta, Andy Luo, Melissa Kazemi Rad, Swapnil Shinde, Mohammad Sorower
03 Mar 2025 · Tags: AAML
Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks
Hanjiang Hu, Alexander Robey, Changliu Liu
28 Feb 2025 · Tags: AAML, LLMSV
Foot-In-The-Door: A Multi-turn Jailbreak for LLMs
Zixuan Weng, Xiaolong Jin, Jinyuan Jia, X. Zhang
27 Feb 2025 · Tags: AAML
SafeDialBench: A Fine-Grained Safety Benchmark for Large Language Models in Multi-Turn Dialogues with Diverse Jailbreak Attacks
Hongye Cao, Yanming Wang, Sijia Jing, Ziyue Peng, Zhixin Bai, ..., Yang Gao, Fanyu Meng, Xi Yang, Chao Deng, Junlan Feng
16 Feb 2025 · Tags: AAML
Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models
Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari
15 Oct 2024 · Tags: AAML
Hallucinating AI Hijacking Attack: Large Language Models and Malicious Code Recommenders
David A. Noever, Forrest McKee
09 Oct 2024 · Tags: AAML
You Know What I'm Saying: Jailbreak Attack via Implicit Reference
Tianyu Wu, Lingrui Mei, Ruibin Yuan, Lujun Li, Wei Xue, Yike Guo
04 Oct 2024