Hide Your Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Carrier Articles

20 August 2024
Zhilong Wang
Haizhou Wang
Nanqing Luo
Lan Zhang
Xiaoyan Sun
Yebo Cao
Peng Liu
arXiv: 2408.11182

Papers citing "Hide Your Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Carrier Articles"

Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Yang Ouyang
Hengrui Gu
Shuhang Lin
Qingfeng Lan
Jie Peng
B. Kailkhura
Tianlong Chen
Kaixiong Zhou
AAML
05 Jan 2025