Defensive Prompt Patch: A Robust and Interpretable Defense of LLMs against Jailbreak Attacks

30 May 2024
Chen Xiong
Xiangyu Qi
Pin-Yu Chen
Tsung-Yi Ho
    AAML
arXiv: 2405.20099

Papers citing "Defensive Prompt Patch: A Robust and Interpretable Defense of LLMs against Jailbreak Attacks"

6 / 6 papers shown
T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models
Siyuan Liang
Jiayang Liu
Jiecheng Zhai
Tianmeng Fang
Rongcheng Tu
A. Liu
Xiaochun Cao
Dacheng Tao
VGen
49
0
0
22 Apr 2025
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
Jiahao Qiu
Yinghui He
Xinzhe Juan
Y. Wang
Y. Liu
Zixin Yao
Yue Wu
Xun Jiang
L. Yang
Mengdi Wang
AI4MH
65
0
0
13 Apr 2025
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
66
1
0
09 Oct 2024
SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner
Xunguang Wang
Daoyuan Wu
Zhenlan Ji
Zongjie Li
Pingchuan Ma
Shuai Wang
Yingjiu Li
Yang Liu
Ning Liu
Juergen Rahmel
AAML
66
6
0
08 Jun 2024
Summarization is (Almost) Dead
Xiao Pu
Mingqi Gao
Xiaojun Wan
HILM
65
38
0
18 Sep 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022