ResearchTrend.AI
CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge

12 February 2024
Norbert Tihanyi
M. Ferrag
Ridhi Jain
Tamás Bisztray
Merouane Debbah
    ELM
arXiv: 2402.07688

Papers citing "CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge"

13 / 13 papers shown
Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
Paul Kassianik
Baturay Saglam
Alexander Chen
Blaine Nelson
Anu Vellore
...
Hyrum Anderson
Kojin Oshiba
Omar Santos
Yaron Singer
Amin Karbasi
PILM
56
0
0
28 Apr 2025
Exploring the Role of Large Language Models in Cybersecurity: A Systematic Survey
Shuang Tian
Tao Zhang
J. Liu
Jiacheng Wang
Xuangou Wu
...
Ruichen Zhang
W. Zhang
Zhenhui Yuan
Shiwen Mao
Dong In Kim
48
0
0
22 Apr 2025
MEQA: A Meta-Evaluation Framework for Question & Answer LLM Benchmarks
Jaime Raldua Veuthey
Zainab Ali Majid
Suhas Hariharan
Jacob Haimes
ELM
26
0
0
18 Apr 2025
The Digital Cybersecurity Expert: How Far Have We Come?
Dawei Wang
Geng Zhou
Xianglong Li
Yu Bai
Li Chen
Ting Qin
Jian Sun
D. Li
ELM
57
0
0
16 Apr 2025
Large Language Models are Unreliable for Cyber Threat Intelligence
Emanuele Mezzi
Fabio Massacci
Katja Tuma
31
0
0
29 Mar 2025
CyberLLMInstruct: A New Dataset for Analysing Safety of Fine-Tuned LLMs Using Cyber Security Data
Adel ElZemity
Budi Arief
Shujun Li
54
0
0
12 Mar 2025
AttackSeqBench: Benchmarking Large Language Models' Understanding of Sequential Patterns in Cyber Attacks
Javier Yong
Haokai Ma
Yunshan Ma
Anis Yusof
Zhenkai Liang
E. Chang
52
0
0
05 Mar 2025
OCCULT: Evaluating Large Language Models for Offensive Cyber Operation Capabilities
Michael Kouremetis
Marissa Dotter
Alex Byrne
Dan Martin
Ethan Michalak
Gianpaolo Russo
Michael Threet
Guido Zarrella
ELM
50
4
0
18 Feb 2025
SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity
Pengfei Jing
Mengyun Tang
Xiaorong Shi
Xing Zheng
Sen Nie
Shi Wu
Yong Yang
Xiapu Luo
ELM
43
1
0
30 Dec 2024
Multi-Agent Collaboration in Incident Response with Large Language Models
Zefang Liu
LLMAG
AI4CE
71
0
0
01 Dec 2024
Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence
Norbert Tihanyi
Tamás Bisztray
Richard A. Dubniczky
Rebeka Tóth
B. Borsos
...
Ryan Marinelli
Lucas C. Cordeiro
Merouane Debbah
Vasileios Mavroeidis
Audun Josang
16
4
0
20 Oct 2024
Advancing Cyber Incident Timeline Analysis Through Rule Based AI and Large Language Models
Fatma Yasmine Loumachi
Mohamed Chahine Ghanem
AI4CE
36
1
0
04 Sep 2024
Aligning Offline Metrics and Human Judgments of Value for Code Generation Models
Victor C. Dibia
Adam Fourney
Gagan Bansal
Forough Poursabzi-Sangdeh
Han Liu
Saleema Amershi
ALM
OffRL
33
12
0
29 Oct 2022