LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations

IEEE Working Conference on Mining Software Repositories (MSR), 2023

16 March 2023

Catherine Tony

Markus Mutas

Nicolás E. Díaz Ferreyra

Riccardo Scandariato

ELM

Papers citing "LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations"

Secure Code Generation at Scale with Reflexion

121

05 Nov 2025

QCoder Benchmark: Bridging Language Generation and Quantum Hardware through Simulator-Based Feedback

...

254

30 Oct 2025

Is Your Prompt Poisoning Code? Defect Induction Rates and Security Mitigation Strategies

201

27 Oct 2025

SecureAgentBench: Benchmarking Secure Code Generation under Realistic Vulnerability Scenarios

...

144

26 Sep 2025

Detecting Stealthy Data Poisoning Attacks in AI Code Generators

Cristina Improta

AAML SILM

29 Aug 2025

From Language to Action: A Review of Large Language Models as Autonomous Agents and Tool Users

324

24 Aug 2025

Amazon Nova AI Challenge -- Trusted AI: Advancing secure, AI-assisted software development

...

Shankar Ananthakrishna

109

13 Aug 2025

Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective

200

15 May 2025

Frontier AI's Impact on the Cybersecurity Landscape

527

07 Apr 2025

SandboxEval: Towards Securing Test Environment for Untrusted Code

326

27 Mar 2025

Large Language Models (LLMs) for Source Code Analysis: applications, models and datasets

Hamed Jelodar

Mohammad Meymani

Roozbeh Razavi-Far

255

21 Mar 2025

Rethinking the Evaluation of Secure Code Generation

460

18 Mar 2025

Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Unified Approach for Elevating Benchmark Quality

577

07 Mar 2025

Benchmarking Prompt Engineering Techniques for Secure Code Generation with GPT Models

217

09 Feb 2025

PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback

Yun Peng

Akhilesh Deepak Gotmare

265

18 Nov 2024

From Solitary Directives to Interactive Encouragement! LLM Secure Code Generation by Natural Language Prompting

260

18 Oct 2024

Evaluating Software Development Agents: Patch Patterns, Code Quality, and Issue Complexity in Real-World GitHub Scenarios

IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER), 2024

Zhi Chen

Lingxiao Jiang

LLMAG

213

16 Oct 2024

APILOT: Navigating Large Language Models to Generate Secure Code by Sidestepping Outdated API Pitfalls

Kangjie Lu

175

25 Sep 2024

Dynamic Code Orchestration: Harnessing the Power of Large Language Models for Adaptive Script Execution

J. D. Vecchio

Andrew Perreault

Eliana Furmanek

07 Aug 2024

Prompting Techniques for Secure Code Generation: A Systematic Investigation

Catherine Tony

Nicolás E. Díaz Ferreyra

Markus Mutas

Salem Dhiff

Riccardo Scandariato

SILM

432

09 Jul 2024

INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness

Hung Le

Yingbo Zhou

Caiming Xiong

Silvio Savarese

Doyen Sahoo

274

23 Jun 2024

When LLMs Meet Cybersecurity: A Systematic Literature Review

395

143

06 May 2024

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

International Conference on Learning Representations (ICLR), 2024

Tianjun Zhang

463

967

12 Mar 2024

Exploring Advanced Methodologies in Security Evaluation for LLMs

332

28 Feb 2024

Ocassionally Secure: A Comparative Analysis of Code Generation Assistants

Ran Elgedawy

John Sadik

Senjuti Dutta

Konstantinos Georgiou

...

273

01 Feb 2024

NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness

282

29 Jan 2024

Towards Trustworthy AI Software Development Assistance

Daniel Maninger

Krishna Narasimhan

Mira Mezini

212

14 Dec 2023

Can LLMs Patch Security Issues?

502

13 Nov 2023

Automating the Correctness Assessment of AI-generated Code for Security Contexts

Journal of Systems and Software (JSS), 2023

249

28 Oct 2023

LLM for SoC Security: A Paradigm Shift

IEEE Access, 2023

360

09 Oct 2023

CompVPD: Iteratively Identifying Vulnerability Patches Based on Human Validation Results with a Precise Context

216

04 Oct 2023

Security Weaknesses of Copilot-Generated Code in GitHub Projects: An Empirical Study

ACM Transactions on Software Engineering and Methodology (TOSEM), 2023

397

03 Oct 2023

Using ChatGPT as a Static Application Security Testing Tool

Atieh Bakhshandeh

Abdalsamad Keramatfar

Amir Norouzi

Mohammad Mahdi Chekidehkhoun

173

28 Aug 2023

Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning Attacks

IEEE International Conference on Program Comprehension (ICPC), 2023

423

04 Aug 2023