67

Learning the Language of NVMe Streams for Ransomware Detection

Main:10 Pages
10 Figures
Bibliography:3 Pages
20 Tables
Appendix:12 Pages
Abstract

We apply language modeling techniques to detect ransomware activity in NVMe command sequences. We design and train two types of transformer-based models: the Command-Level Transformer (CLT) performs in-context token classification to determine whether individual commands are initiated by ransomware, and the Patch-Level Transformer (PLT) predicts the volume of data accessed by ransomware within a patch of commands. We present both model designs and the corresponding tokenization and embedding schemes and show that they improve over state-of-the-art tabular methods by up to 24% in missed-detection rate, 66% in data loss prevention, and 84% in identifying data accessed by ransomware.

View on arXiv
Comments on this paper