2410.21083
Cited By
Stealthy Jailbreak Attacks on Large Language Models via Benign Data Mirroring
28 October 2024
Honglin Mu
Han He
Yuxin Zhou
Yunlong Feng
Yang Xu
L. Qin
Xiaoming Shi
Zeming Liu
Xudong Han
Qi Shi
Qingfu Zhu
Wanxiang Che
AAML
Papers citing "Stealthy Jailbreak Attacks on Large Language Models via Benign Data Mirroring"
1 / 1 papers shown
Title
AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender
Weixiang Zhao
Jiahe Guo
Yulin Hu
Yang Deng
An Zhang
...
Xinyang Han
Yanyan Zhao
Bing Qin
Tat-Seng Chua
Ting Liu
AAML
LLMSV
13 Apr 2025