Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.04957
Cited By
Automatic and Universal Prompt Injection Attacks against Large Language Models
7 March 2024
Xiaogeng Liu
Zhiyuan Yu
Yizhe Zhang
Ning Zhang
Chaowei Xiao
SILM
AAML
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Automatic and Universal Prompt Injection Attacks against Large Language Models"
25 / 25 papers shown
Title
Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs
Chetan Pathade
AAML
SILM
40
0
0
07 May 2025
LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures
Francisco Aguilera-Martínez
Fernando Berzal
PILM
45
0
0
02 May 2025
OET: Optimization-based prompt injection Evaluation Toolkit
Jinsheng Pan
Xiaogeng Liu
Chaowei Xiao
AAML
67
0
0
01 May 2025
Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction
Y. Chen
Haoran Li
Yuan Sui
Y. Liu
Yufei He
Y. Song
Bryan Hooi
AAML
SILM
61
0
0
29 Apr 2025
PICO: Secure Transformers via Robust Prompt Isolation and Cybersecurity Oversight
Ben Goertzel
Paulos Yibelo
SILM
54
0
0
26 Apr 2025
Adversarial Attacks on LLM-as-a-Judge Systems: Insights from Prompt Injections
Narek Maloyan
Dmitry Namiot
SILM
AAML
ELM
70
0
0
25 Apr 2025
Manipulating Multimodal Agents via Cross-Modal Prompt Injection
Le Wang
Zonghao Ying
Tianyuan Zhang
Siyuan Liang
Shengshan Hu
Mingchuan Zhang
A. Liu
Xianglong Liu
AAML
31
1
0
19 Apr 2025
DataSentinel: A Game-Theoretic Detection of Prompt Injection Attacks
Yupei Liu
Yuqi Jia
Jinyuan Jia
Dawn Song
Neil Zhenqiang Gong
AAML
31
0
0
15 Apr 2025
StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models
Yang Feng
Xudong Pan
AAML
26
0
0
14 Apr 2025
Separator Injection Attack: Uncovering Dialogue Biases in Large Language Models Caused by Role Separators
Xitao Li
H. Wang
Jiang Wu
Ting Liu
AAML
23
0
0
08 Apr 2025
Iterative Prompting with Persuasion Skills in Jailbreaking Large Language Models
Shih-Wen Ke
Guan-Yu Lai
Guo-Lin Fang
Hsi-Yuan Kao
SILM
79
0
0
26 Mar 2025
RAG-Thief: Scalable Extraction of Private Data from Retrieval-Augmented Generation Applications with Agent-based Attacks
Changyue Jiang
Xudong Pan
Geng Hong
Chenfu Bao
Min Yang
SILM
69
7
0
21 Nov 2024
Rethinking the Intermediate Features in Adversarial Attacks: Misleading Robotic Models via Adversarial Distillation
Ke Zhao
Huayang Huang
Miao Li
Yu Wu
AAML
66
0
0
21 Nov 2024
Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems
Donghyun Lee
Mo Tiwari
LLMAG
18
8
0
09 Oct 2024
Functional Homotopy: Smoothing Discrete Optimization via Continuous Parameters for LLM Jailbreak Attacks
Zi Wang
Divyam Anshumaan
Ashish Hooda
Yudong Chen
Somesh Jha
AAML
35
0
0
05 Oct 2024
PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs
Jiahao Yu
Yangguang Shao
Hanwen Miao
Junzheng Shi
SILM
AAML
53
3
0
23 Sep 2024
Recent Advances in Attack and Defense Approaches of Large Language Models
Jing Cui
Yishi Xu
Zhewei Huang
Shuchang Zhou
Jianbin Jiao
Junge Zhang
PILM
AAML
45
1
0
05 Sep 2024
SafeEmbodAI: a Safety Framework for Mobile Robots in Embodied AI Systems
Wenxiao Zhang
Xiangrui Kong
Thomas Braunl
Jin B. Hong
18
2
0
03 Sep 2024
On Large Language Models in National Security Applications
William N. Caballero
Phillip R. Jenkins
ELM
21
6
0
03 Jul 2024
Semantic-guided Prompt Organization for Universal Goal Hijacking against LLMs
Yihao Huang
Chong Wang
Xiaojun Jia
Qing-Wu Guo
Felix Juefei Xu
Jian Zhang
G. Pu
Yang Liu
17
8
0
23 May 2024
Lockpicking LLMs: A Logit-Based Jailbreak Using Token-level Manipulation
Yuxi Li
Yi Liu
Yuekang Li
Ling Shi
Gelei Deng
Shengquan Chen
Kailong Wang
24
12
0
20 May 2024
Sociotechnical Implications of Generative Artificial Intelligence for Information Access
Bhaskar Mitra
Henriette Cramer
Olya Gurevich
18
2
0
19 May 2024
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
ShapeShifter: Robust Physical Adversarial Attack on Faster R-CNN Object Detector
Shang-Tse Chen
Cory Cornelius
Jason Martin
Duen Horng Chau
ObjD
136
422
0
16 Apr 2018
1