2312.04724
Cited By
Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models
7 December 2023
Manish P Bhatt
Sahana Chennabasappa
Cyrus Nikolaidis
Shengye Wan
Ivan Evtimov
Dominik Gabi
Daniel Song
Faizan Ahmad
Cornelius Aschermann
Lorenzo Fontana
Sasha Frolov
Ravi Prakash Giri
Dhaval Kapil
Yiannis Kozyrakis
David LeBlanc
James Milazzo
Aleksandar Straumann
Gabriel Synnaeve
Varun Vontimitta
Spencer Whitman
Joshua Saxe
ELM
Papers citing
"Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models"
13 / 13 papers shown
SecReEvalBench: A Multi-turned Security Resilience Evaluation Benchmark for Large Language Models
Huining Cui
Wei Liu
AAML
ELM
28
0
0
12 May 2025
SecRepoBench: Benchmarking LLMs for Secure Code Generation in Real-World Repositories
Connor Dilgren
Purva Chiniya
Luke Griffith
Yu Ding
Yizheng Chen
40
0
0
29 Apr 2025
The Digital Cybersecurity Expert: How Far Have We Come?
Dawei Wang
Geng Zhou
Xianglong Li
Yu Bai
Li Chen
Ting Qin
Jian Sun
D. Li
ELM
57
0
0
16 Apr 2025
SandboxEval: Towards Securing Test Environment for Untrusted Code
Rafiqul Rabin
Jesse Hostetler
Sean McGregor
Brett Weir
Nick Judd
ELM
39
0
0
27 Mar 2025
A Framework for Evaluating Emerging Cyberattack Capabilities of AI
Mikel Rodriguez
Raluca Ada Popa
Four Flynn
Lihao Liang
Allan Dafoe
Anna Wang
ELM
53
2
0
14 Mar 2025
Transferable Foundation Models for Geometric Tasks on Point Cloud Representations: Geometric Neural Operators
Blaine Quackenbush
P. Atzberger
3DPC
AI4CE
65
2
0
06 Mar 2025
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
78
1
0
09 Oct 2024
Root Defence Strategies: Ensuring Safety of LLM at the Decoding Level
Xinyi Zeng
Yuying Shang
Yutao Zhu
Jingyuan Zhang
Yu Tian
AAML
121
2
0
09 Oct 2024
CYBERSECEVAL 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language Models
Shengye Wan
Cyrus Nikolaidis
Daniel Song
David Molnar
James Crnkovich
...
Spencer Whitman
Stephanie Ding
Vlad Ionescu
Yue Li
Joshua Saxe
ELM
36
20
0
02 Aug 2024
When LLMs Meet Cybersecurity: A Systematic Literature Review
Jie Zhang
Haoyu Bu
Hui Wen
Yu Chen
Lun Li
Hongsong Zhu
28
36
0
06 May 2024
CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models
Manish P Bhatt
Sahana Chennabasappa
Yue Li
Cyrus Nikolaidis
Daniel Song
...
Yaohui Chen
Dhaval Kapil
David Molnar
Spencer Whitman
Joshua Saxe
ELM
30
32
0
19 Apr 2024
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward
Xuan Xie
Jiayang Song
Zhehua Zhou
Yuheng Huang
Da Song
Lei Ma
OffRL
42
6
0
12 Apr 2024
SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety
Paul Röttger
Fabio Pernisi
Bertie Vidgen
Dirk Hovy
ELM
KELM
58
30
0
08 Apr 2024