Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2404.02532
Cited By
Learn to Disguise: Avoid Refusal Responses in LLM's Defense via a Multi-agent Attacker-Disguiser Game
3 April 2024
Qianqiao Xu
Zhiliang Tian
Hongyan Wu
Zhen Huang
Yiping Song
Feng Liu
Dongsheng Li
LLMAG
AAML
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Learn to Disguise: Avoid Refusal Responses in LLM's Defense via a Multi-agent Attacker-Disguiser Game"
3 / 3 papers shown
Title
Jatmo: Prompt Injection Defense by Task-Specific Finetuning
Julien Piet
Maha Alrashed
Chawin Sitawarin
Sizhe Chen
Zeming Wei
Elizabeth Sun
Basel Alomair
David A. Wagner
AAML
SyDa
75
52
0
29 Dec 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
286
2,232
0
22 Mar 2023
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
259
374
0
28 Feb 2021
1