
InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents
arXiv 2403.02691 (v2, latest) · 5 March 2024
Qiusi Zhan, Zhixiang Liang, Zifan Ying, Daniel Kang
LLMAG · GitHub (63★)

Papers citing "InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents"

35 / 135 papers shown
Can Indirect Prompt Injection Attacks Be Detected and Removed?
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yulin Chen, Haoran Li, Yuan Sui, Yufei He, Yue Liu, Yangqiu Song, Bryan Hooi
AAML | 509 · 28 · 0 | 23 Feb 2025
RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage
Peter Yong Zhong, Siyuan Chen, Ruiqi Wang, McKenna McCall, Ben L. Titzer, Heather Miller, Phillip B. Gibbons
LLMAG | 381 · 19 · 0 | 17 Feb 2025
G-Safeguard: A Topology-Guided Security Lens and Treatment on LLM-based Multi-agent Systems
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Shilong Wang, Guibin Zhang, Miao Yu, Guancheng Wan, Fanci Meng, Chongye Guo, Kun Wang, Yang Wang
LLMAG | 182 · 24 · 0 | 16 Feb 2025
Peering Behind the Shield: Guardrail Identification in Large Language Models
Ziqing Yang, Yixin Wu, Rui Wen, Michael Backes, Yang Zhang
253 · 3 · 0 | 03 Feb 2025
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
Knowledge Discovery and Data Mining (KDD), 2023
Jingwei Yi, Yueqi Xie, Bin Zhu, Emre Kiciman, Guangzhong Sun, Xing Xie, Fangzhao Wu
AAML | 452 · 149 · 0 | 28 Jan 2025
The Task Shield: Enforcing Task Alignment to Defend Against Indirect Prompt Injection in LLM Agents
Feiran Jia, Tong Wu, Xin Qin, Anna Squicciarini
LLMAG, AAML | 334 · 21 · 0 | 21 Dec 2024
Towards Action Hijacking of Large Language Model-based Agent
Yuyang Zhang, Kangjie Chen, Xudong Jiang, Yuxiang Sun, Run Wang, Lina Wang, Minlie Huang
LLMAG, AAML | 511 · 3 · 0 | 14 Dec 2024
Attacking Vision-Language Computer Agents via Pop-ups
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yanzhe Zhang, Tao Yu, Diyi Yang
AAML, VLM | 409 · 75 · 0 | 04 Nov 2024
Defense Against Prompt Injection Attack by Leveraging Attack Techniques
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yulin Chen, Haoran Li, Zihao Zheng, Yangqiu Song, Dekai Wu, Bryan Hooi
AAML, SILM | 624 · 23 · 0 | 01 Nov 2024
InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models
Haoyang Li, Xiaogeng Liu
SILM | 459 · 20 · 0 | 30 Oct 2024
FATH: Authentication-based Test-time Defense against Indirect Prompt Injection Attacks
Jiongxiao Wang, Fangzhou Wu, Wendi Li, Jinsheng Pan, Edward Suh, Zhuoqing Mao, Muhao Chen, Chaowei Xiao
AAML | 201 · 16 · 0 | 28 Oct 2024
MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control
Juyong Lee, Dongyoon Hahm, June Suk Choi, W. Bradley Knox, Kimin Lee
LLMAG, ELM, AAML, LM&Ro | 208 · 21 · 0 | 23 Oct 2024
Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Itay Nakash, George Kour, Guy Uziel, Ateret Anaby-Tavor
AAML, LLMAG | 200 · 18 · 0 | 22 Oct 2024
Large Language Models, and LLM-Based Agents, Should Be Used to Enhance the Digital Public Sphere
Seth Lazar, Luke Thorburn, Tian Jin, Luca Belli
261 · 4 · 0 | 15 Oct 2024
DAWN: Designing Distributed Agents in a Worldwide Network
IEEE Access, 2024
Zahra Aminiranjbar, Jianan Tang, Qiudan Wang, Shubha Pant, Mahesh Viswanathan
AI4CE, LLMAG | 415 · 7 · 0 | 11 Oct 2024
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy
International Conference on Learning Representations (ICLR), 2024
Tong Wu, Shujian Zhang, Kaiqiang Song, Silei Xu, Sanqiang Zhao, Ravi Agrawal, Sathish Indurthi, Chong Xiang, Prateek Mittal, Wenxuan Zhou
406 · 32 · 0 | 09 Oct 2024
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
International Conference on Learning Representations (ICLR), 2024
H. Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, Yongfeng Zhang
AAML, LLMAG, ELM | 567 · 101 · 0 | 03 Oct 2024
System-Level Defense against Indirect Prompt Injection Attacks: An Information Flow Control Perspective
Fangzhou Wu, Ethan Cecchetti, Chaowei Xiao
392 · 41 · 0 | 27 Sep 2024
Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification
Boyang Zhang, Yicong Tan, Yun Shen, Ahmed Salem, Michael Backes, Savvas Zannettou, Yang Zhang
LLMAG, AAML | 280 · 54 · 0 | 30 Jul 2024
The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies
ACM Computing Surveys (ACM CSUR), 2024
Feng He, Tianqing Zhu, Dayong Ye, Bo Liu, Wanlei Zhou, Philip S. Yu
PILM, LLMAG, ELM | 462 · 77 · 0 | 28 Jul 2024
Systematic Categorization, Construction and Evaluation of New Attacks against Multi-modal Mobile GUI Agents
Yulong Yang, Xinshan Yang, Shuaidong Li, Chenhao Lin, Subrat Kishore Dutta, Chao Shen, Tianwei Zhang
281 · 1 · 0 | 12 Jul 2024
Soft Begging: Modular and Efficient Shielding of LLMs against Prompt Injection and Jailbreaking based on Prompt Tuning
Simon Ostermann, Kevin Baum, Christoph Endres, Julia Masloh, P. Schramowski
AAML | 243 · 2 · 0 | 03 Jul 2024
AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents
Neural Information Processing Systems (NeurIPS), 2024
Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, Florian Tramèr
LLMAG, AAML | 407 · 81 · 1 | 19 Jun 2024
IDs for AI Systems
Alan Chan, Noam Kolt, Peter Wills, Usman Anwar, Christian Schroeder de Witt, Nitarshan Rajkumar, Lewis Hammond, David M. Krueger, Lennart Heim, Markus Anderljung
319 · 13 · 0 | 17 Jun 2024
Ranking Manipulation for Conversational Search Engines
Samuel Pfrommer, Yatong Bai, Tanmay Gautam, Somayeh Sojoudi
SILM | 324 · 12 · 0 | 05 Jun 2024
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways
Zehang Deng, Yongjian Guo, Changzhou Han, Wanlun Ma, Junwu Xiong, Sheng Wen, Yang Xiang
406 · 130 · 0 | 04 Jun 2024
A Survey of Useful LLM Evaluation
Ji-Lun Peng, Sijia Cheng, Egil Diau, Yung-Yu Shih, Po-Heng Chen, Yen-Ting Lin, Yun-Nung Chen
LLMAG, ELM | 289 · 32 · 0 | 03 Jun 2024
Teams of LLM Agents can Exploit Zero-Day Vulnerabilities
Richard Fang, Antony Kellermann, Akul Gupta, Qiusi Zhan, R. Bindu, Daniel Kang
LLMAG | 322 · 71 · 0 | 02 Jun 2024
Tool Learning with Large Language Models: A Survey
Changle Qu, Sunhao Dai, Xiaochi Wei, Hengyi Cai, Shuaiqiang Wang, D. Yin, Jun Xu, Jirong Wen
LLMAG | 335 · 211 · 0 | 28 May 2024
Lockpicking LLMs: A Logit-Based Jailbreak Using Token-level Manipulation
Yuxi Li, Yi Liu, Yuekang Li, Ling Shi, Gelei Deng, Shengquan Chen, Kailong Wang
409 · 20 · 0 | 20 May 2024
When LLMs Meet Cybersecurity: A Systematic Literature Review
Jie Zhang, Haoyu Bu, Hui Wen, Yu Chen, Lun Li, Hongsong Zhu
403 · 144 · 0 | 06 May 2024
Testing and Understanding Erroneous Planning in LLM Agents through Synthesized User Inputs
Zhenlan Ji, Daoyuan Wu, Pingchuan Ma, Zongjie Li, Shuai Wang
LLMAG | 240 · 17 · 0 | 27 Apr 2024
LLM Agents can Autonomously Exploit One-day Vulnerabilities
Richard Fang, R. Bindu, Akul Gupta, Daniel Kang
SILM, LLMAG | 401 · 110 · 0 | 11 Apr 2024
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents
Tongxin Yuan, Zhiwei He, Lingzhong Dong, Yiming Wang, Ruijie Zhao, ..., Binglin Zhou, Fangqi Li, Zhuosheng Zhang, Rui Wang, Gongshen Liu
ELM | 399 · 139 · 0 | 18 Jan 2024
Privacy in Large Language Models: Attacks, Defenses and Future Directions
Haoran Li, Yulin Chen, Jinglong Luo, Weijing Chen, Xiaojin Zhang, Qi Hu, Chunkit Chan, Yangqiu Song
PILM | 441 · 68 · 0 | 16 Oct 2023