RedTeamLLM: an Agentic AI framework for offensive security

11 May 2025

Brian Challita

Pierre Parrend

LLMAG

Papers citing "RedTeamLLM: an Agentic AI framework for offensive security"

Hiding in the AI Traffic: Abusing MCP for LLM-Powered Agentic Red Teaming

Strahinja Janjusevic

Anna Baron Garcia

Sohrob Kazerounian

280

20 Nov 2025

Neuro-Symbolic AI for Cybersecurity: State of the Art, Challenges, and Opportunities

182

08 Sep 2025

Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements

User Modeling, Adaptation, and Personalization (UMAP), 2024

531

24 Feb 2025

HackSynth: LLM Agent and Evaluation Framework for Autonomous Penetration Testing

318

02 Dec 2024

EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations

Neural Information Processing Systems (NeurIPS), 2024

Jia Li

Yongbin Li

303

30 Oct 2024

Countering Autonomous Cyber Threats

Kade M. Heckel

Adrian Weller

AAML

198

23 Oct 2024

Security Threats in Agentic AI System

374

16 Oct 2024

CYBERSECEVAL 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language Models

...

382

02 Aug 2024

LLM Agents can Autonomously Exploit One-day Vulnerabilities

510

137

11 Apr 2024

Breaking Down the Defenses: A Comparative Survey of Attacks on Large Language Models

Arijit Ghosh Chowdhury

Vinija Jain

337

03 Mar 2024

AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks

376

130

02 Mar 2024

TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation

558

15 Feb 2024

LLM Agents can Autonomously Hack Websites

329

108

06 Feb 2024

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

Dorsa Sadigh

380

154

07 Dec 2023

A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly

High-Confidence Computing (HC), 2023

675

1,127

04 Dec 2023

ADaPT: As-Needed Decomposition and Planning with Language Models

Archiki Prasad

Mohit Bansal

401

166

08 Nov 2023

Decoding the Threat Landscape: ChatGPT, FraudGPT, and WormGPT in Social Engineering Attacks

International Journal of Scientific Research in Computer Science Engineering and Information Technology (JCSEIT), 2023

Polra Victor Falade

AAML

198

09 Oct 2023

PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023

Shuohang Wang

248

23 May 2023

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Neural Information Processing Systems (NeurIPS), 2023

Dian Yu

766

3,713

17 May 2023

DarkBERT: A Language Model for the Dark Side of the Internet

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Yongjae Lee

221

15 May 2023

Structured Chain-of-Thought Prompting for Code Generation

ACM Transactions on Software Engineering and Methodology (TOSEM), 2023

518

307

11 May 2023

Automatic Chain of Thought Prompting in Large Language Models

International Conference on Learning Representations (ICLR), 2022

660

932

07 Oct 2022

ReAct: Synergizing Reasoning and Acting in Language Models

International Conference on Learning Representations (ICLR), 2022

Dian Yu

3.4K

7,139

06 Oct 2022

Self-Consistency Improves Chain of Thought Reasoning in Language Models

International Conference on Learning Representations (ICLR), 2022

3.7K

6,409

21 Mar 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Neural Information Processing Systems (NeurIPS), 2022

2.8K

17,183

28 Jan 2022