ResearchTrend.AI

arXiv:2312.14197 (Cited By)
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models


28 January 2025
Jingwei Yi, Yueqi Xie, Bin Zhu, Emre Kiciman, Guangzhong Sun, Xing Xie, Fangzhao Wu
AAML

Papers citing "Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models"

50 / 57 papers shown

1. Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs · Chetan Pathade · AAML, SILM · 07 May 2025
2. OET: Optimization-based prompt injection Evaluation Toolkit · Jinsheng Pan, Xiaogeng Liu, Chaowei Xiao · AAML · 01 May 2025
3. Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction · Y. Chen, Haoran Li, Yuan Sui, Y. Liu, Yufei He, Y. Song, Bryan Hooi · AAML, SILM · 29 Apr 2025
4. AI Awareness · X. Li, Haoyuan Shi, Rongwu Xu, Wei Xu · 25 Apr 2025
5. WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks · Ivan Evtimov, Arman Zharmagambetov, Aaron Grattafiori, Chuan Guo, Kamalika Chaudhuri · AAML · 22 Apr 2025
6. Manipulating Multimodal Agents via Cross-Modal Prompt Injection · Le Wang, Zonghao Ying, Tianyuan Zhang, Siyuan Liang, Shengshan Hu, Mingchuan Zhang, A. Liu, Xianglong Liu · AAML · 19 Apr 2025
7. DataSentinel: A Game-Theoretic Detection of Prompt Injection Attacks · Yupei Liu, Yuqi Jia, Jinyuan Jia, Dawn Song, Neil Zhenqiang Gong · AAML · 15 Apr 2025
8. Separator Injection Attack: Uncovering Dialogue Biases in Large Language Models Caused by Role Separators · Xitao Li, H. Wang, Jiang Wu, Ting Liu · AAML · 08 Apr 2025
9. Detecting LLM-Written Peer Reviews · Vishisht Rao, Aounon Kumar, Himabindu Lakkaraju, Nihar B. Shah · DeLMO, AAML · 20 Mar 2025
10. Prompt Flow Integrity to Prevent Privilege Escalation in LLM Agents · Juhee Kim, Woohyuk Choi, Byoungyoung Lee · LLMAG · 17 Mar 2025
11. ASIDE: Architectural Separation of Instructions and Data in Language Models · Egor Zverev, Evgenii Kortukov, Alexander Panfilov, Soroush Tabesh, Alexandra Volkova, Sebastian Lapuschkin, Wojciech Samek, Christoph H. Lampert · AAML · 13 Mar 2025
12. EigenShield: Causal Subspace Filtering via Random Matrix Theory for Adversarially Robust Vision-Language Models · Nastaran Darabi, Devashri Naik, Sina Tayebati, Dinithi Jayasuriya, Ranganath Krishnan, A. R. Trivedi · AAML · 24 Feb 2025
13. Can Indirect Prompt Injection Attacks Be Detected and Removed? · Yulin Chen, Haoran Li, Yuan Sui, Yufei He, Yue Liu, Y. Song, Bryan Hooi · AAML · 23 Feb 2025
14. An Empirically-grounded tool for Automatic Prompt Linting and Repair: A Case Study on Bias, Vulnerability, and Optimization in Developer Prompts · Dhia Elhaq Rzig, Dhruba Jyoti Paul, Kaiser Pister, Jordan Henkel, Foyzul Hassan · 21 Jan 2025
15. Dynamics of Adversarial Attacks on Large Language Model-Based Search Engines · Xiyang Hu · AAML · 03 Jan 2025
16. Towards Action Hijacking of Large Language Model-based Agent · Yuyang Zhang, Kangjie Chen, Xudong Jiang, Yuxiang Sun, Run Wang, Lina Wang · LLMAG, AAML · 14 Dec 2024
17. RAG-Thief: Scalable Extraction of Private Data from Retrieval-Augmented Generation Applications with Agent-based Attacks · Changyue Jiang, Xudong Pan, Geng Hong, Chenfu Bao, Min Yang · SILM · 21 Nov 2024
18. Defense Against Prompt Injection Attack by Leveraging Attack Techniques · Yulin Chen, Haoran Li, Zihao Zheng, Y. Song, Dekai Wu, Bryan Hooi · SILM, AAML · 01 Nov 2024
19. FATH: Authentication-based Test-time Defense against Indirect Prompt Injection Attacks · Jiongxiao Wang, Fangzhou Wu, Wendi Li, Jinsheng Pan, Edward Suh, Zhuoqing Mao, Muhao Chen, Chaowei Xiao · AAML · 28 Oct 2024
20. Palisade -- Prompt Injection Detection Framework · Sahasra Kokkula, Somanathan R, Nandavardhan R, Aashishkumar, G Divya · AAML · 28 Oct 2024
21. Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In · Itay Nakash, George Kour, Guy Uziel, Ateret Anaby-Tavor · AAML, LLMAG · 22 Oct 2024
22. Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements · Jingyu Zhang, Ahmed Elgohary, Ahmed Magooda, Daniel Khashabi, Benjamin Van Durme · 11 Oct 2024
23. Recent advancements in LLM Red-Teaming: Techniques, Defenses, and Ethical Considerations · Tarun Raheja, Nilay Pochhi · AAML · 09 Oct 2024
24. Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents · Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, Yongfeng Zhang · AAML, LLMAG, ELM · 03 Oct 2024
25. VLMGuard: Defending VLMs against Malicious Prompts via Unlabeled Data · Xuefeng Du, Reshmi Ghosh, Robert Sim, Ahmed Salem, Vitor Carvalho, Emily Lawton, Yixuan Li, Jack W. Stokes · VLM, AAML · 01 Oct 2024
26. System-Level Defense against Indirect Prompt Injection Attacks: An Information Flow Control Perspective · Fangzhou Wu, Ethan Cecchetti, Chaowei Xiao · 27 Sep 2024
27. Applying Pre-trained Multilingual BERT in Embeddings for Improved Malicious Prompt Injection Attacks Detection · M. Rahman, Hossain Shahriar, Fan Wu, A. Cuzzocrea · AAML · 20 Sep 2024
28. CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration · Jiahui Gao, Renjie Pi, Tianyang Han, Han Wu, Lanqing Hong, Lingpeng Kong, Xin Jiang, Zhenguo Li · 17 Sep 2024
29. AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents · Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, Florian Tramèr · LLMAG, AAML · 19 Jun 2024
30. Threat Modelling and Risk Analysis for Large Language Model (LLM)-Powered Applications · Stephen Burabari Tete · 16 Jun 2024
31. Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications · Junlin Wang, Tianyi Yang, Roy Xie, Bhuwan Dhingra · SILM, AAML · 10 Jun 2024
32. Ranking Manipulation for Conversational Search Engines · Samuel Pfrommer, Yatong Bai, Tanmay Gautam, Somayeh Sojoudi · SILM · 05 Jun 2024
33. Teams of LLM Agents can Exploit Zero-Day Vulnerabilities · Richard Fang, Antony Kellermann, Akul Gupta, Qiusi Zhan, R. Bindu, Daniel Kang · LLMAG · 02 Jun 2024
34. AI Risk Management Should Incorporate Both Safety and Security · Xiangyu Qi, Yangsibo Huang, Yi Zeng, Edoardo Debenedetti, Jonas Geiping, ..., Chaowei Xiao, Bo-wen Li, Dawn Song, Peter Henderson, Prateek Mittal · AAML · 29 May 2024
35. The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions · Eric Wallace, Kai Y. Xiao, R. Leike, Lilian Weng, Johannes Heidecke, Alex Beutel · SILM · 19 Apr 2024
36. LLM Agents can Autonomously Exploit One-day Vulnerabilities · Richard Fang, R. Bindu, Akul Gupta, Daniel Kang · SILM, LLMAG · 11 Apr 2024
37. GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications · Shishir G. Patil, Tianjun Zhang, Vivian Fang, Noppapon C., Roy Huang, Aaron Hao, Martin Casado, Joseph E. Gonzalez, Raluca Ada Popa, Ion Stoica · ALM · 10 Apr 2024
38. Defending Against Indirect Prompt Injection Attacks With Spotlighting · Keegan Hines, Gary Lopez, Matthew Hall, Federico Zarfati, Yonatan Zunger, Emre Kiciman · AAML, SILM · 20 Mar 2024
39. Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? · Egor Zverev, Sahar Abdelnabi, Soroush Tabesh, Mario Fritz, Christoph H. Lampert · 11 Mar 2024
40. Automatic and Universal Prompt Injection Attacks against Large Language Models · Xiaogeng Liu, Zhiyuan Yu, Yizhe Zhang, Ning Zhang, Chaowei Xiao · SILM, AAML · 07 Mar 2024
41. InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents · Qiusi Zhan, Zhixiang Liang, Zifan Ying, Daniel Kang · LLMAG · 05 Mar 2024
42. A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems · Fangzhou Wu, Ning Zhang, Somesh Jha, P. McDaniel, Chaowei Xiao · 28 Feb 2024
43. Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems · Zhenting Qi, Hanlin Zhang, Eric Xing, Sham Kakade, Hima Lakkaraju · SILM · 27 Feb 2024
44. WIPI: A New Web Threat for LLM-Driven Web Agents · Fangzhou Wu, Shutong Wu, Yulong Cao, Chaowei Xiao · LLMAG · 26 Feb 2024
45. GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis · Yueqi Xie, Minghong Fang, Renjie Pi, Neil Zhenqiang Gong · 21 Feb 2024
46. StruQ: Defending Against Prompt Injection with Structured Queries · Sizhe Chen, Julien Piet, Chawin Sitawarin, David A. Wagner · SILM, AAML · 09 Feb 2024
47. R-Judge: Benchmarking Safety Risk Awareness for LLM Agents · Tongxin Yuan, Zhiwei He, Lingzhong Dong, Yiming Wang, Ruijie Zhao, ..., Binglin Zhou, Fangqi Li, Zhuosheng Zhang, Rui Wang, Gongshen Liu · ELM · 18 Jan 2024
48. MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance · Renjie Pi, Tianyang Han, Jianshu Zhang, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang · AAML · 05 Jan 2024
49. Formalizing and Benchmarking Prompt Injection Attacks and Defenses · Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, Neil Zhenqiang Gong · SILM, LLMAG · 19 Oct 2023
50. LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins · Umar Iqbal, Tadayoshi Kohno, Franziska Roesner · ELM, SILM · 19 Sep 2023