2408.09093
BaThe: Defense against the Jailbreak Attack in Multimodal Large Language Models by Treating Harmful Instruction as Backdoor Trigger
17 August 2024
Yulin Chen
Haoran Li
Zihao Zheng
Yangqiu Song
Bryan Hooi
Papers citing
"BaThe: Defense against the Jailbreak Attack in Multimodal Large Language Models by Treating Harmful Instruction as Backdoor Trigger"
7 / 7 papers shown
Title
Distraction is All You Need for Multimodal Large Language Model Jailbreaking
Zuopeng Yang
Jiluan Fan
Anli Yan
Erdun Gao
Xin Lin
Tao Li
Kanghua Mo
Changyu Dong
AAML
70
0
0
15 Feb 2025
Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey
Xuannan Liu
Xing Cui
Peipei Li
Zekun Li
Huaibo Huang
Shuhan Xia
Miaoxuan Zhang
Yueying Zou
Ran He
AAML
51
4
0
14 Nov 2024
Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios
Yunkai Dang
Mengxi Gao
Yibo Yan
Xin Zou
Yanggan Gu
Aiwei Liu
Xuming Hu
34
4
0
05 Nov 2024
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
Yichen Gong
Delong Ran
Jinyuan Liu
Conglei Wang
Tianshuo Cong
Anyu Wang
Sisi Duan
Xiaoyun Wang
MLLM
127
116
0
09 Nov 2023
Privacy in Large Language Models: Attacks, Defenses and Future Directions
Haoran Li
Yulin Chen
Jinglong Luo
Yan Kang
Xiaojin Zhang
Qi Hu
Chunkit Chan
Yangqiu Song
PILM
19
39
0
16 Oct 2023
On the Adversarial Robustness of Multi-Modal Foundation Models
Christian Schlarmann
Matthias Hein
AAML
90
45
0
21 Aug 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022