2312.12575
LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks
19 December 2023
Saad Ullah
Mingji Han
Saurabh Pujar
Hammond Pearce
Ayse K. Coskun
Gianluca Stringhini
ELM
LRM
Papers citing
"LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks"
14 / 14 papers shown
Title
AutoPatch: Multi-Agent Framework for Patching Real-World CVE Vulnerabilities
Minjae Seo
Wonwoo Choi
Myoungsung You
Seungwon Shin
KELM
52
0
0
07 May 2025
Automated Static Vulnerability Detection via a Holistic Neuro-symbolic Approach
Penghui Li
Songchen Yao
Josef Sarfati Korich
Changhua Luo
Jianjia Yu
Yinzhi Cao
Junfeng Yang
35
0
0
22 Apr 2025
Frontier AI's Impact on the Cybersecurity Landscape
Wenbo Guo
Yujin Potter
Tianneng Shi
Zhun Wang
Andy Zhang
Dawn Song
36
1
0
07 Apr 2025
Do LLMs Consider Security? An Empirical Study on Responses to Programming Questions
Amirali Sajadi
Binh Le
A. Nguyen
Kostadin Damevski
Preetha Chatterjee
53
2
0
20 Feb 2025
LAMD: Context-driven Android Malware Detection and Classification with LLMs
Xingzhi Qian
Xinran Zheng
Yiling He
Shuo Yang
Lorenzo Cavallaro
78
2
0
18 Feb 2025
Can LLM Generate Regression Tests for Software Commits?
Jing Liu
Seongmin Lee
Eleonora Losiouk
Marcel Böhme
41
0
0
19 Jan 2025
ProveRAG: Provenance-Driven Vulnerability Analysis with Automated Retrieval-Augmented LLMs
Reza Fayyazi
Stella Hoyos Trueba
Michael Zuzak
S. Yang
28
0
0
22 Oct 2024
Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection
Niklas Risse
Marcel Böhme
25
4
0
23 Aug 2024
What Did I Do Wrong? Quantifying LLMs' Sensitivity and Consistency to Prompt Engineering
Federico Errica
G. Siracusano
D. Sanvito
Roberto Bifulco
67
19
0
18 Jun 2024
LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning
Yuqiang Sun
Daoyuan Wu
Yue Xue
Han Liu
Wei Ma
Lyuye Zhang
Miaolei Shi
Yingjiu Li
ELM
76
46
0
29 Jan 2024
Large Language Models Understand and Can be Enhanced by Emotional Stimuli
Cheng-rong Li
Jindong Wang
Yixuan Zhang
Kaijie Zhu
Wenxin Hou
Jianxun Lian
Fang Luo
Qiang Yang
Xingxu Xie
LRM
67
116
0
14 Jul 2023
VulBERTa: Simplified Source Code Pre-Training for Vulnerability Detection
Hazim Hanif
S. Maffeis
53
92
0
25 May 2022
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
2,712
0
24 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022