ResearchTrend.AI


arXiv: 2410.21083
Stealthy Jailbreak Attacks on Large Language Models via Benign Data Mirroring


28 October 2024
Honglin Mu
Han He
Yuxin Zhou
Yunlong Feng
Yang Xu
L. Qin
Xiaoming Shi
Zeming Liu
Xudong Han
Qi Shi
Qingfu Zhu
Wanxiang Che
    AAML

Papers citing "Stealthy Jailbreak Attacks on Large Language Models via Benign Data Mirroring"

1 / 1 papers shown
Title
AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender
Weixiang Zhao
Jiahe Guo
Yulin Hu
Yang Deng
An Zhang
...
Xinyang Han
Yanyan Zhao
Bing Qin
Tat-Seng Chua
Ting Liu
AAML
LLMSV
13 Apr 2025