Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2405.20773
Cited By
Visual-RolePlay: Universal Jailbreak Attack on MultiModal Large Language Models via Role-playing Image Character
25 May 2024
Siyuan Ma
Weidi Luo
Yu Wang
Xiaogeng Liu
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Visual-RolePlay: Universal Jailbreak Attack on MultiModal Large Language Models via Role-playing Image Character"
31 / 31 papers shown
Title
REVEAL: Multi-turn Evaluation of Image-Input Harms for Vision LLM
Madhur Jindal
Saurabh Deshpande
AAML
40
0
0
07 May 2025
Playing the Fool: Jailbreaking LLMs and Multimodal LLMs with Out-of-Distribution Strategy
Joonhyun Jeong
Seyun Bae
Yeonsung Jung
Jaeryong Hwang
Eunho Yang
AAML
43
0
0
26 Mar 2025
MIRAGE: Multimodal Immersive Reasoning and Guided Exploration for Red-Team Jailbreak Attacks
Wenhao You
Bryan Hooi
Yiwei Wang
Y. Wang
Zong Ke
Ming Yang
Zi Huang
Yujun Cai
AAML
54
0
0
24 Mar 2025
Making Every Step Effective: Jailbreaking Large Vision-Language Models Through Hierarchical KV Equalization
Shuyang Hao
Yiwei Wang
Bryan Hooi
J. Liu
Muhao Chen
Zi Huang
Yujun Cai
AAML
VLM
63
0
0
14 Mar 2025
Distraction is All You Need for Multimodal Large Language Model Jailbreaking
Zuopeng Yang
Jiluan Fan
Anli Yan
Erdun Gao
Xin Lin
Tao Li
Kanghua mo
Changyu Dong
AAML
70
0
0
15 Feb 2025
SoK: Unifying Cybersecurity and Cybersafety of Multimodal Foundation Models with an Information Theory Approach
Ruoxi Sun
Jiamin Chang
Hammond Pearce
Chaowei Xiao
B. Li
Qi Wu
Surya Nepal
Minhui Xue
27
0
0
17 Nov 2024
Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey
Xuannan Liu
Xing Cui
Peipei Li
Zekun Li
Huaibo Huang
Shuhan Xia
Miaoxuan Zhang
Yueying Zou
Ran He
AAML
51
4
0
14 Nov 2024
Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
Yingzi Ma
Jiongxiao Wang
Fei-Yue Wang
Siyuan Ma
Jiazhao Li
...
B. Li
Yejin Choi
M. Chen
Chaowei Xiao
Chaowei Xiao
MU
40
6
0
05 Nov 2024
You Know What I'm Saying: Jailbreak Attack via Implicit Reference
Tianyu Wu
Lingrui Mei
Ruibin Yuan
Lujun Li
Wei Xue
Yike Guo
25
1
0
04 Oct 2024
A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends
Daizong Liu
Mingyu Yang
Xiaoye Qu
Pan Zhou
Yu Cheng
Wei Hu
ELM
AAML
27
21
0
10 Jul 2024
JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models
Haibo Jin
Leyang Hu
Xinuo Li
Peiyan Zhang
Chonghan Chen
Jun Zhuang
Haohan Wang
PILM
25
22
0
26 Jun 2024
JailBreakV-28K: A Benchmark for Assessing the Robustness of MultiModal Large Language Models against Jailbreak Attacks
Weidi Luo
Siyuan Ma
Xiaogeng Liu
Xiaoyu Guo
Chaowei Xiao
AAML
58
17
0
03 Apr 2024
Exploring the Limits of Zero Shot Vision Language Models for Hate Meme Detection: The Vulnerabilities and their Interpretations
Naquee Rizwan
Paramananda Bhaskar
Mithun Das
Swadhin Satyaprakash Majhi
Punyajoy Saha
Animesh Mukherjee
VLM
16
2
0
19 Feb 2024
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Xiangming Gu
Xiaosen Zheng
Tianyu Pang
Chao Du
Qian Liu
Ye Wang
Jing Jiang
Min-Bin Lin
LLMAG
LM&Ro
35
47
0
13 Feb 2024
Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
Yongshuo Zong
Ondrej Bohdal
Tingyang Yu
Yongxin Yang
Timothy M. Hospedales
VLM
MLLM
49
56
0
03 Feb 2024
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
...
Conghui He
Xingcheng Zhang
Yu Qiao
Dahua Lin
Jiaqi Wang
VLM
MLLM
70
89
0
29 Jan 2024
Democratizing Fine-grained Visual Recognition with Large Language Models
Mingxuan Liu
Subhankar Roy
Wenjing Li
Zhun Zhong
N. Sebe
Elisa Ricci
VLM
14
10
0
24 Jan 2024
Red Teaming Visual Language Models
Mukai Li
Lei Li
Yuwei Yin
Masood Ahmed
Zhenguang Liu
Qi Liu
VLM
25
11
0
23 Jan 2024
PuriDefense: Randomized Local Implicit Adversarial Purification for Defending Black-box Query-based Attacks
Ping Guo
Zhiyuan Yang
Xi Lin
Qingchuan Zhao
Qingfu Zhang
AAML
17
4
0
19 Jan 2024
Silkie: Preference Distillation for Large Visual Language Models
Lei Li
Zhihui Xie
Mukai Li
Shunian Chen
Peiyi Wang
Liang Chen
Yazheng Yang
Benyou Wang
Lingpeng Kong
MLLM
96
67
0
17 Dec 2023
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
M. Steyvers
Yuan Yao
Haoye Zhang
Taiwen He
Yifeng Han
...
Xinyue Hu
Zhiyuan Liu
Hai-Tao Zheng
Maosong Sun
Tat-Seng Chua
MLLM
VLM
125
176
0
01 Dec 2023
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Bin Lin
Yang Ye
Bin Zhu
Jiaxi Cui
Munan Ning
Peng Jin
Li-ming Yuan
VLM
MLLM
182
576
0
16 Nov 2023
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
Yichen Gong
Delong Ran
Jinyuan Liu
Conglei Wang
Tianshuo Cong
Anyu Wang
Sisi Duan
Xiaoyun Wang
MLLM
127
116
0
09 Nov 2023
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks
Erfan Shayegani
Md Abdullah Al Mamun
Yu Fu
Pedram Zaree
Yue Dong
Nael B. Abu-Ghazaleh
AAML
135
139
0
16 Oct 2023
Privacy in Large Language Models: Attacks, Defenses and Future Directions
Haoran Li
Yulin Chen
Jinglong Luo
Yan Kang
Xiaojin Zhang
Qi Hu
Chunkit Chan
Yangqiu Song
PILM
27
39
0
16 Oct 2023
On the Adversarial Robustness of Multi-Modal Foundation Models
Christian Schlarmann
Matthias Hein
AAML
90
84
0
21 Aug 2023
Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models
Jimmy Wei
Kurt Shuster
Arthur Szlam
Jason Weston
Jack Urbanek
M. Komeili
LLMAG
39
38
0
26 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
2,712
0
24 May 2022
Diffusion Models for Adversarial Purification
Weili Nie
Brandon Guo
Yujia Huang
Chaowei Xiao
Arash Vahdat
Anima Anandkumar
WIGM
184
410
0
16 May 2022
Adversarial Machine Learning at Scale
Alexey Kurakin
Ian Goodfellow
Samy Bengio
AAML
253
3,102
0
04 Nov 2016
1