Formalizing and Benchmarking Prompt Injection Attacks and Defenses
arXiv 2310.12815, v3 (latest) · 19 October 2023
Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, Neil Zhenqiang Gong
SILM, LLMAG
ArXiv (abs) · PDF · HTML · HuggingFace (1 upvote) · GitHub (205★)

Papers citing "Formalizing and Benchmarking Prompt Injection Attacks and Defenses"

50 / 58 papers shown
Who Taught the Lie? Responsibility Attribution for Poisoned Knowledge in Retrieval-Augmented Generation
Baolei Zhang, Haoran Xin, Yuxi Chen, Zhuqing Liu, Biao Yi, Tong Li, Lihai Nie, Zheli Liu, Minghong Fang
SILM · 17 Sep 2025

A Multi-Agent LLM Defense Pipeline Against Prompt Injection Attacks
S M Asif Hossain, Ruksat Khan Shayoni, Mohd Ruhul Ameen, Akif Islam, M. F. Mridha, Jungpil Shin
LLMAG, SILM, AAML · 16 Sep 2025

Free-MAD: Consensus-Free Multi-Agent Debate
Yu Cui, Hang Fu, Haibin Zhang, Licheng Wang, Cong Zuo
LRM · 14 Sep 2025

LLM in the Middle: A Systematic Review of Threats and Mitigations to Real-World LLM-based Systems
Vitor Hugo Galhardo Moia, Igor Jochem Sanz, Gabriel Antonio Fontes Rebello, Rodrigo Duarte de Meneses, Briland Hitaj, Ulf Lindqvist
12 Sep 2025

On the Security of Tool-Invocation Prompts for LLM-Based Agentic Systems: An Empirical Risk Assessment
Yuchong Xie, Mingyu Luo, Zesen Liu, Z. Zhang, Kaikai Zhang, Yu Liu, Zongjie Li, Ping Chen, Shuai Wang, Dongdong She
LLMAG · 06 Sep 2025

A Comprehensive Survey on Trustworthiness in Reasoning with Large Language Models
Yanbo Wang, Yongcan Yu, Jian Liang, Ran He
HILM, LRM · 04 Sep 2025

PromptSleuth: Detecting Prompt Injection via Semantic Intent Invariance
Mengxiao Wang, Yuxuan Zhang, Guofei Gu
AAML, SILM · 28 Aug 2025

Disabling Self-Correction in Retrieval-Augmented Generation via Stealthy Retriever Poisoning
Yanbo Dai, Zhenlan Ji, Zongjie Li, Kuan Li, Shuai Wang
SILM, AAML, KELM · 27 Aug 2025

SoK: Large Language Model Copyright Auditing via Fingerprinting
Shuo Shao, Yiming Li, Yexiao He, Hongwei Yao, Wenyuan Yang, D. Tao, Zhan Qin
27 Aug 2025

UniC-RAG: Universal Knowledge Corruption Attacks to Retrieval-Augmented Generation
Runpeng Geng, Yanting Wang, Ying Chen, Jinyuan Jia
AAML · 26 Aug 2025

Prompt-in-Content Attacks: Exploiting Uploaded Inputs to Hijack LLM Behavior
Zhuotao Lian, Weiyu Wang, Qingkui Zeng, Toru Nakanishi, Teruaki Kitasuka, Chunhua Su
SILM · 25 Aug 2025

When AIOps Become "AI Oops": Subverting LLM-driven IT Operations via Telemetry Manipulation
Dario Pasquini, Evgenios M. Kornaropoulos, G. Ateniese, Omer Akgul, Athanasios Theocharis, Petros Efstathopoulos
AAML · 08 Aug 2025

AttnTrace: Attention-based Context Traceback for Long-Context LLMs
Yanting Wang, Runpeng Geng, Ying Chen, Jinyuan Jia
LLMAG · 05 Aug 2025

Provably Secure Retrieval-Augmented Generation
Pengcheng Zhou, Yinglun Feng, Zhongliang Yang
SILM · 01 Aug 2025

Understanding the Supply Chain and Risks of Large Language Model Applications
Yujie Ma, Lili Quan, Xiaofei Xie, Qiang Hu, Jiongchi Yu, Y. Zhang, S. Chen
ELM · 24 Jul 2025

Defending Against Prompt Injection With a Few DefensiveTokens
Sizhe Chen, Yizhu Wang, Nicholas Carlini, Chawin Sitawarin, David Wagner
LLMAG, AAML, SILM · 10 Jul 2025

The Dark Side of LLMs: Agent-based Attacks for Complete Computer Takeover
Matteo Lupinacci, Francesco Aurelio Pironti, Francesco Blefari, Francesco Romeo, Luigi Arena, Angelo Furfaro
LLMAG, AAML · 09 Jul 2025

Design Patterns for Securing LLM Agents against Prompt Injections
Luca Beurer-Kellner, Beat Buesser, Ana-Maria Creţu, Edoardo Debenedetti, Daniel Dobos, Daniel Fabian, ..., Daniel Naeff, Ezinwanne Ozoani, Andrew Paverd, F. Tramèr, Václav Volhejn
LLMAG, SILM, AAML · 10 Jun 2025

JavelinGuard: Low-Cost Transformer Architectures for LLM Security
Yash Datta, Sharath Rajasekar
09 Jun 2025

Detection Method for Prompt Injection by Integrating Pre-trained Model and Heuristic Feature Engineering
Yi Ji, Runzhi Li, Baolei Mao
AAML · 05 Jun 2025

TracLLM: A Generic Framework for Attributing Long Context LLMs
Yanting Wang, Wei Zou, Runpeng Geng, Jinyuan Jia
LLMAG · 04 Jun 2025

ETDI: Mitigating Tool Squatting and Rug Pull Attacks in Model Context Protocol (MCP) by using OAuth-Enhanced Tool Definitions and Policy-Based Access Control
Manish Bhatt, Vineeth Sai Narajala, Idan Habler
02 Jun 2025

Simple Prompt Injection Attacks Can Leak Personal Data Observed by LLM Agents During Task Execution
Meysam Alizadeh, Zeynab Samei, Daria Stetsenko, Fabrizio Gilardi
SILM · 01 Jun 2025

When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs
Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang
30 May 2025

LLM Agents Should Employ Security Principles
Kaiyuan Zhang, Zian Su, Pin-Yu Chen, E. Bertino, Xiangyu Zhang, Ninghui Li
LLMAG · 29 May 2025

Securing AI Agents with Information-Flow Control
Manuel Costa, Boris Köpf, Aashish Kolluri, Andrew Paverd, M. Russinovich, Ahmed Salem, Shruti Tople, Lukas Wutschitz, Santiago Zanella Béguelin
29 May 2025

Spa-VLM: Stealthy Poisoning Attacks on RAG-based VLM
Lei Yu, Yechao Zhang, Ziqi Zhou, Yang Wu, Wei Wan, Minghui Li, Shengshan Hu, Pei Xiaobing, Jing Wang
AAML · 28 May 2025

Security Concerns for Large Language Models: A Survey
Miles Q. Li, Benjamin C. M. Fung
PILM, ELM · 24 May 2025

From Assistants to Adversaries: Exploring the Security Risks of Mobile LLM Agents
Liangxuan Wu, Chao Wang, Tianming Liu, Yanjie Zhao, Haoyu Wang
AAML · 19 May 2025

WebInject: Prompt Injection Attack to Web Agents
Xilong Wang, John Bloch, Zedian Shao, Yuepeng Hu, Shuyan Zhou, Neil Zhenqiang Gong
AAML, LLMAG · 16 May 2025

Practical Reasoning Interruption Attacks on Reasoning Large Language Models
Yu Cui, Cong Zuo
SILM, AAML, LRM · 10 May 2025

Defending against Indirect Prompt Injection by Instruction Detection
Tongyu Wen, Chenglong Wang, Xiyuan Yang, Haoyu Tang, Yueqi Xie, Lingjuan Lyu, Zhicheng Dou, Fangzhao Wu
AAML · 08 May 2025

Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs
Chetan Pathade
AAML, SILM · 07 May 2025

OET: Optimization-based prompt injection Evaluation Toolkit
Jinsheng Pan, Xiaogeng Liu, Chaowei Xiao
AAML · 01 May 2025

Prompt Injection Attack to Tool Selection in LLM Agents
Jiawen Shi, Zenghui Yuan, Guiyao Tie, Pan Zhou, Neil Zhenqiang Gong, Lichao Sun
LLMAG · 28 Apr 2025

Exploring the Role of Large Language Models in Cybersecurity: A Systematic Survey
Shuang Tian, Tao Zhang, Qingbin Liu, Jiacheng Wang, Xuangou Wu, ..., Ruichen Zhang, Wentao Zhang, Zhenhui Yuan, Shiwen Mao, Dong In Kim
22 Apr 2025

WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks
Ivan Evtimov, Arman Zharmagambetov, Aaron Grattafiori, Chuan Guo, Kamalika Chaudhuri
AAML · 22 Apr 2025

Manipulating Multimodal Agents via Cross-Modal Prompt Injection
Le Wang, Zonghao Ying, Tianyuan Zhang, Yaning Tan, Shengshan Hu, Mingchuan Zhang, A. Liu, Xianglong Liu
AAML · 19 Apr 2025

DataSentinel: A Game-Theoretic Detection of Prompt Injection Attacks
Yupei Liu, Yuqi Jia, Jinyuan Jia, Dawn Song, Neil Zhenqiang Gong
AAML · 15 Apr 2025

Understanding Users' Security and Privacy Concerns and Attitudes Towards Conversational AI Platforms
Mutahar Ali, Arjun Arunasalam, Habiba Farrukh
SILM · 09 Apr 2025

Frontier AI's Impact on the Cybersecurity Landscape
Wenbo Guo, Yujin Potter, Tianneng Shi, Yu Yang, Andy Zhang, Dawn Song
07 Apr 2025

Practical Poisoning Attacks against Retrieval-Augmented Generation
Baolei Zhang, Yuxiao Chen, Minghong Fang, Zhuqing Liu, Lihai Nie, Tong Li, Zheli Liu
SILM, AAML · 04 Apr 2025

RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage
Peter Yong Zhong, Siyuan Chen, Ruiqi Wang, McKenna McCall, Ben L. Titzer, Heather Miller, Phillip B. Gibbons
LLMAG · 17 Feb 2025

OverThink: Slowdown Attacks on Reasoning LLMs
A. Kumar, Jaechul Roh, A. Naseh, Marzena Karpinska, Mohit Iyyer, Amir Houmansadr, Eugene Bagdasarian
LRM · 04 Feb 2025

Peering Behind the Shield: Guardrail Identification in Large Language Models
Ziqing Yang, Yixin Wu, Rui Wen, Michael Backes, Yang Zhang
03 Feb 2025

Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
Jingwei Yi, Yueqi Xie, Bin Zhu, Emre Kiciman, Guangzhong Sun, Xing Xie, Fangzhao Wu
AAML · 28 Jan 2025

An Empirically-grounded tool for Automatic Prompt Linting and Repair: A Case Study on Bias, Vulnerability, and Optimization in Developer Prompts
Dhia Elhaq Rzig, Dhruba Jyoti Paul, Kaiser Pister, Jordan Henkel, Foyzul Hassan
21 Jan 2025

Non-Halting Queries: Exploiting Fixed Points in LLMs
Ghaith Hammouri, Kemal Derya, B. Sunar
08 Oct 2024

Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
H. Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, Yongfeng Zhang
AAML, LLMAG, ELM · 03 Oct 2024

Recent Advances in Attack and Defense Approaches of Large Language Models
Jing Cui, Yishi Xu, Zhewei Huang, Shuchang Zhou, Jianbin Jiao, Junge Zhang
PILM, AAML · 05 Sep 2024