ResearchTrend.AI

Learn to Disguise: Avoid Refusal Responses in LLM's Defense via a Multi-agent Attacker-Disguiser Game

3 April 2024
Qianqiao Xu
Zhiliang Tian
Hongyan Wu
Zhen Huang
Yiping Song
Feng Liu
Dongsheng Li
    LLMAG
    AAML

Papers citing "Learn to Disguise: Avoid Refusal Responses in LLM's Defense via a Multi-agent Attacker-Disguiser Game"

3 / 3 papers shown
Jatmo: Prompt Injection Defense by Task-Specific Finetuning
Julien Piet
Maha Alrashed
Chawin Sitawarin
Sizhe Chen
Zeming Wei
Elizabeth Sun
Basel Alomair
David A. Wagner
AAML
SyDa
75
52
0
29 Dec 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
286
2,232
0
22 Mar 2023
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
259
374
0
28 Feb 2021