2406.08705
When LLM Meets DRL: Advancing Jailbreaking Efficiency via DRL-guided Search
28 January 2025
Xuan Chen
Yuzhou Nie
Wenbo Guo
Xiangyu Zhang
Papers citing "When LLM Meets DRL: Advancing Jailbreaking Efficiency via DRL-guided Search"
10 / 10 papers shown
Title
AgentXploit: End-to-End Redteaming of Black-Box AI Agents
Zhun Wang
Vincent Siu
Zhe Ye
Tianneng Shi
Yuzhou Nie
Xuandong Zhao
Chenguang Wang
Wenbo Guo
Dawn Song
LLMAG
AAML
22
0
0
09 May 2025
Frontier AI's Impact on the Cybersecurity Landscape
Wenbo Guo
Yujin Potter
Tianneng Shi
Zhun Wang
Andy Zhang
Dawn Song
31
1
0
07 Apr 2025
TurboFuzzLLM: Turbocharging Mutation-based Fuzzing for Effectively Jailbreaking Large Language Models in Practice
Aman Goel
Xian Carrie Wu
Zhe Wang
Dmitriy Bespalov
Yanjun Qi
41
0
0
21 Feb 2025
PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs
Jiahao Yu
Yangguang Shao
Hanwen Miao
Junzheng Shi
SILM
AAML
45
3
0
23 Sep 2024
PathSeeker: Exploring LLM Security Vulnerabilities with a Reinforcement Learning-Based Jailbreak Approach
Zhihao Lin
Wei Ma
Mingyi Zhou
Yanjie Zhao
Haoyu Wang
Yang Liu
Jun Wang
Li Li
AAML
22
1
0
21 Sep 2024
Bileve: Securing Text Provenance in Large Language Models Against Spoofing with Bi-level Signature
Tong Zhou
Xuandong Zhao
Xiaolin Xu
Shaolei Ren
16
6
0
04 Jun 2024
Play Guessing Game with LLM: Indirect Jailbreak Attack with Implicit Clues
Zhiyuan Chang
Mingyang Li
Yi Liu
Junjie Wang
Qing Wang
Yang Liu
78
13
0
14 Feb 2024
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
Zhangchen Xu
Fengqing Jiang
Luyao Niu
Jinyuan Jia
Bill Yuchen Lin
Radha Poovendran
AAML
129
82
0
14 Feb 2024
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
Xing-ming Guo
Fangxu Yu
Huan Zhang
Lianhui Qin
Bin Hu
AAML
109
69
0
13 Feb 2024
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu
Xingwei Lin
Zheng Yu
Xinyu Xing
SILM
110
292
0
19 Sep 2023