BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents

5 June 2024

Papers citing "BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents"

4 / 4 papers shown

Title
Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks Ang Li Yin Zhou Vethavikashini Chithrra Raghuram Tom Goldstein Micah Goldblum AAML 71 7 0 12 Feb 2025
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents Hanrong Zhang Jingyuan Huang Kai Mei Yifei Yao Zhenting Wang Chenlu Zhan Hongwei Wang Yongfeng Zhang AAML LLMAG ELM 48 18 0 03 Oct 2024
Poisoning Language Models During Instruction Tuning Alexander Wan Eric Wallace Sheng Shen Dan Klein SILM 90 124 0 01 May 2023
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 303 11,730 0 04 Mar 2022