arXiv:2405.14023
WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response
22 May 2024
Tianrong Zhang
Bochuan Cao
Yuanpu Cao
Lu Lin
Prasenjit Mitra
Jinghui Chen
AAML
Papers citing "WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response"
7 / 7 papers shown
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
X. Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Yu Jiang
ALM
ELM
84
0
0
26 Apr 2025
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
Jiahao Qiu
Yinghui He
Xinzhe Juan
Y. Wang
Y. Liu
Zixin Yao
Yue Wu
Xun Jiang
L. Yang
Mengdi Wang
AI4MH
60
0
0
13 Apr 2025
Robust LLM safeguarding via refusal feature adversarial training
L. Yu
Virginie Do
Karen Hambardzumyan
Nicola Cancedda
AAML
42
9
0
30 Sep 2024
Knowledge Return Oriented Prompting (KROP)
Jason Martin
Kenneth Yeung
20
0
0
11 Jun 2024
When "Competency" in Reasoning Opens the Door to Vulnerability: Jailbreaking LLMs via Novel Complex Ciphers
Divij Handa
Advait Chirmule
Bimal Gajera
Chitta Baral
39
18
0
16 Feb 2024
Play Guessing Game with LLM: Indirect Jailbreak Attack with Implicit Clues
Zhiyuan Chang
Mingyang Li
Yi Liu
Junjie Wang
Qing Wang
Yang Liu
84
37
0
14 Feb 2024
On the Safety of Open-Sourced Large Language Models: Does Alignment Really Prevent Them From Being Misused?
Hangfan Zhang
Zhimeng Guo
Huaisheng Zhu
Bochuan Cao
Lu Lin
Jinyuan Jia
Jinghui Chen
Di Wu
60
23
0
02 Oct 2023