arXiv:2402.15727
LLMs Can Defend Themselves Against Jailbreaking in a Practical Manner: A Vision Paper
24 February 2024
Daoyuan Wu, Shuaibao Wang, Yang Liu, Ning Liu
AAML
Papers citing "LLMs Can Defend Themselves Against Jailbreaking in a Practical Manner: A Vision Paper"
6 / 6 papers shown
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
09 Oct 2024

Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
Anton Xue, Avishree Khare, Rajeev Alur, Surbhi Goel, Eric Wong
21 Jun 2024

SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner
Xunguang Wang, Daoyuan Wu, Zhenlan Ji, Zongjie Li, Pingchuan Ma, Shuai Wang, Yingjiu Li, Yang Liu, Ning Liu, Juergen Rahmel
AAML
08 Jun 2024

SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran
AAML
14 Feb 2024

Fight Back Against Jailbreaking via Prompt Adversarial Tuning
Yichuan Mo, Yuji Wang, Zeming Wei, Yisen Wang
AAML, SILM
09 Feb 2024

LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning
Yuqiang Sun, Daoyuan Wu, Yue Xue, Han Liu, Wei Ma, Lyuye Zhang, Miaolei Shi, Yingjiu Li
ELM
29 Jan 2024