
Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models

International Conference on Learning Representations (ICLR), 2024
26 July 2023
Erfan Shayegani
Yue Dong
Nael B. Abu-Ghazaleh
arXiv (abs) · PDF · HTML · HuggingFace (2 upvotes)

Papers citing "Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models"

Showing 50 of 161 citing papers.
Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey
Xuannan Liu
Xing Cui
Peipei Li
Zekun Li
Huaibo Huang
Shuhan Xia
Miaoxuan Zhang
Yueying Zou
Ran He
AAML
538
24
0
14 Nov 2024
New Emerged Security and Privacy of Pre-trained Model: a Survey and Outlook
Meng Yang
Tianqing Zhu
Chi Liu
Wanlei Zhou
Shui Yu
Philip S. Yu
AAML, ELM, PILM
309
2
0
12 Nov 2024
Layer-wise Alignment: Examining Safety Alignment Across Image Encoder Layers in Vision Language Models
Saketh Bachu
Erfan Shayegani
Trishna Chakraborty
Rohit Lal
Arindam Dutta
Chengyu Song
Yue Dong
Nael B. Abu-Ghazaleh
Amit K. Roy-Chowdhury
278
0
0
06 Nov 2024
UniGuard: Towards Universal Safety Guardrails for Jailbreak Attacks on Multimodal Large Language Models
Sejoon Oh
Yiqiao Jin
Megha Sharma
Donghyun Kim
Eric Ma
Gaurav Verma
Srijan Kumar
342
12
0
03 Nov 2024
Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models
Hao Yang
Zhuang Li
Ehsan Shareghi
Gholamreza Haffari
AAML
236
16
0
31 Oct 2024
Effective and Efficient Adversarial Detection for Vision-Language Models via A Single Vector
Youcheng Huang
Fengbin Zhu
Jingkun Tang
Pan Zhou
Wenqiang Lei
Jiancheng Lv
Tat-Seng Chua
AAML
164
5
0
30 Oct 2024
CLEAR: Character Unlearning in Textual and Visual Modalities
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Alexey Dontsov
Dmitrii Korzh
Alexey Zhavoronkin
Boris Mikheev
Denis Bobkov
Aibek Alanov
Oleg Y. Rogov
Ivan Oseledets
Elena Tutubalina
MU, AILaw, VLM
527
13
0
23 Oct 2024
Bayesian scaling laws for in-context learning
Aryaman Arora
Dan Jurafsky
Christopher Potts
Noah D. Goodman
513
11
0
21 Oct 2024
Hiding-in-Plain-Sight (HiPS) Attack on CLIP for Targetted Object Removal from Images
Arka Daw
Megan Hong-Thanh Chung
Maria Mahbub
Amir Sadovnik
AAML
261
0
0
16 Oct 2024
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
International Conference on Learning Representations (ICLR), 2024
Jaehong Yoon
Shoubin Yu
Vaidehi Patil
Huaxiu Yao
Joey Tianyi Zhou
674
56
0
16 Oct 2024
Break the Visual Perception: Adversarial Attacks Targeting Encoded Visual Tokens of Large Vision-Language Models
ACM Multimedia (MM), 2024
Yubo Wang
Chaohu Liu
Yanqiu Qu
Haoyu Cao
Deqiang Jiang
Linli Xu
MLLM, AAML
154
15
0
09 Oct 2024
You Know What I'm Saying: Jailbreak Attack via Implicit Reference
Tianyu Wu
Lingrui Mei
Ruibin Yuan
Lujun Li
Wei Xue
Yike Guo
223
11
0
04 Oct 2024
Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step
Wenxuan Wang
Kuiyi Gao
Zihan Jia
Youliang Yuan
Shu Yang
S. Wang
Wenxiang Jiao
Zhaopeng Tu
843
7
0
04 Oct 2024
FlipAttack: Jailbreak LLMs via Flipping
Yue Liu
Xiaoxin He
Miao Xiong
Jinlan Fu
Shumin Deng
Bryan Hooi
AAML
244
41
0
02 Oct 2024
VLMGuard: Defending VLMs against Malicious Prompts via Unlabeled Data
Xuefeng Du
Reshmi Ghosh
Robert Sim
Ahmed Salem
Vitor Carvalho
Emily Lawton
Yixuan Li
Jack W. Stokes
VLM, AAML
222
16
0
01 Oct 2024
Multimodal Pragmatic Jailbreak on Text-to-image Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Tong Liu
Zhixin Lai
Jiawen Wang
Gengyuan Zhang
Shuo Chen
Juil Sock
Vera Demberg
Volker Tresp
Jindong Gu
313
10
0
27 Sep 2024
CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration
Lei Li
Renjie Pi
Tianyang Han
Han Wu
Lanqing Hong
Lingpeng Kong
Xin Jiang
Zhenguo Li
308
19
0
17 Sep 2024
Building and better understanding vision-language models: insights and future directions
Hugo Laurençon
Andrés Marafioti
Victor Sanh
Léo Tronchon
VLM
317
132
0
22 Aug 2024
MMJ-Bench: A Comprehensive Study on Jailbreak Attacks and Defenses for Vision Language Models
Fenghua Weng
Yue Xu
Chengyan Fu
Wenjie Wang
AAML
233
1
0
16 Aug 2024
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
Neural Information Processing Systems (NeurIPS), 2024
Jingtong Su
Mingyu Lee
SangKeun Lee
210
22
0
02 Aug 2024
Defending Jailbreak Attack in VLMs via Cross-modality Information Detector
Yue Xu
Xiuyuan Qi
Zhan Qin
Wenjie Wang
AAML
246
6
0
31 Jul 2024
The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies
ACM Computing Surveys (ACM CSUR), 2024
Feng He
Tianqing Zhu
Dayong Ye
Bo Liu
Wanlei Zhou
Philip S. Yu
PILM, LLMAG, ELM
462
77
0
28 Jul 2024
A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends
Daizong Liu
Mingyu Yang
Xiaoye Qu
Pan Zhou
Yu Cheng
Wei Hu
ELM, AAML
344
73
0
10 Jul 2024
JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models
Haibo Jin
Leyang Hu
Xinuo Li
Peiyan Zhang
Chonghan Chen
Jun Zhuang
Haohan Wang
PILM
421
60
0
26 Jun 2024
From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking
Siyuan Wang
Zhuohan Long
Zhihao Fan
Zhongyu Wei
220
21
0
21 Jun 2024
"Not Aligned" is Not "Malicious": Being Careful about Hallucinations of Large Language Models' Jailbreak
"Not Aligned" is Not "Malicious": Being Careful about Hallucinations of Large Language Models' Jailbreak
Lingrui Mei
Shenghua Liu
Yiwei Wang
Baolong Bi
Jiayi Mao
Xueqi Cheng
AAML
208
18
0
17 Jun 2024
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
Yongting Zhang
Lu Chen
Guodong Zheng
Yifeng Gao
Rui Zheng
...
Yu Qiao
Xuanjing Huang
Feng Zhao
Tao Gui
Jing Shao
VLM
518
60
0
17 Jun 2024
Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs
Zhao Xu
Fan Liu
Hao Liu
AAML
274
31
0
13 Jun 2024
JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models
Delong Ran
Jinyuan Liu
Yichen Gong
Jingyi Zheng
Xinlei He
Tianshuo Cong
Anyu Wang
ELM
482
23
0
13 Jun 2024
Adversarial Tuning: Defending Against Jailbreak Attacks for LLMs
Fan Liu
Zhao Xu
Hao Liu
AAML
258
25
0
07 Jun 2024
Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt
Zonghao Ying
Aishan Liu
Tianyuan Zhang
Zhengmin Yu
Yaning Tan
Xianglong Liu
Dacheng Tao
AAML
386
77
0
06 Jun 2024
Adversarial Attacks on Both Face Recognition and Face Anti-spoofing Models
Fengfan Zhou
Qianyu Zhou
Hefei Ling
Xuequan Lu
AAML
477
3
0
27 May 2024
Cross-Modal Safety Alignment: Is textual unlearning all you need?
Trishna Chakraborty
Erfan Shayegani
Zikui Cai
Nael B. Abu-Ghazaleh
M. Salman Asif
Yue Dong
Amit K. Roy-Chowdhury
Chengyu Song
242
23
0
27 May 2024
Visual-RolePlay: Universal Jailbreak Attack on MultiModal Large Language Models via Role-playing Image Character
Siyuan Ma
Weidi Luo
Yu Wang
Xiaogeng Liu
364
56
0
25 May 2024
Safeguarding Vision-Language Models Against Patched Visual Prompt Injectors
Jiachen Sun
Changsheng Wang
Zhenghao Hu
Yiwei Zhang
Chaowei Xiao
AAML, VLM
228
13
0
17 May 2024
What matters when building vision-language models?
Neural Information Processing Systems (NeurIPS), 2024
Hugo Laurençon
Léo Tronchon
Matthieu Cord
Victor Sanh
VLM
302
276
0
03 May 2024
JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models
Yingchaojie Feng
Zhizhang Chen
Zhining Kang
Sijia Wang
Haoyu Tian
Wei Zhang
Minfeng Zhu
Wei Chen
339
8
0
12 Apr 2024
Unbridled Icarus: A Survey of the Potential Perils of Image Inputs in Multimodal Large Language Model Security
Yihe Fan
Yuxin Cao
Ziyu Zhao
Ziyao Liu
Shaofeng Li
226
22
0
08 Apr 2024
As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks?
Anjun Hu
Jindong Gu
Francesco Pinto
Konstantinos Kamnitsas
Juil Sock
AAML, SILM
257
9
0
19 Mar 2024
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models
Weikang Zhou
Xiao Wang
Limao Xiong
Han Xia
Yingshuang Gu
...
Lijun Li
Jing Shao
Tao Gui
Xuanjing Huang
231
55
0
18 Mar 2024
Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models
European Conference on Computer Vision (ECCV), 2024
Yifan Li
Hangyu Guo
Kun Zhou
Wayne Xin Zhao
Ji-Rong Wen
497
93
0
14 Mar 2024
ImgTrojan: Jailbreaking Vision-Language Models with ONE Image
Xijia Tao
Shuai Zhong
Lei Li
Qi Liu
Lingpeng Kong
393
46
0
05 Mar 2024
Accelerating Greedy Coordinate Gradient via Probe Sampling
Yiran Zhao
Wenyue Zheng
Tianle Cai
Xuan Long Do
Kenji Kawaguchi
Anirudh Goyal
Michael Shieh
317
2
0
02 Mar 2024
Coercing LLMs to do and reveal (almost) anything
Jonas Geiping
Alex Stein
Manli Shu
Khalid Saifullah
Yuxin Wen
Tom Goldstein
AAML
238
82
0
21 Feb 2024
The Wolf Within: Covert Injection of Malice into MLLM Societies via an MLLM Operative
Zhen Tan
Chengshuai Zhao
Raha Moraffah
Jiayi Zhang
Yu Kong
Tianlong Chen
Huan Liu
194
21
0
20 Feb 2024
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
Christian Schlarmann
Naman D. Singh
Francesco Croce
Matthias Hein
VLM, AAML
389
86
0
19 Feb 2024
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents
Lingbo Mo
Zeyi Liao
Boyuan Zheng
Yu-Chuan Su
Chaowei Xiao
Huan Sun
AAML, LLMAG
291
23
0
15 Feb 2024
Test-Time Backdoor Attacks on Multimodal Large Language Models
Dong Lu
Tianyu Pang
Chao Du
Qian Liu
Xianjun Yang
Min Lin
AAML
383
37
0
13 Feb 2024
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Xiangming Gu
Xiaosen Zheng
Tianyu Pang
Chao Du
Qian Liu
Ye Wang
Jing Jiang
Min Lin
LLMAGLM&Ro
232
97
0
13 Feb 2024
Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy
Seyedarmin Azizi
M. Nazemi
Massoud Pedram
ViT, MQ
255
5
0
08 Feb 2024