Malware Classification Based on Image Segmentation

Executable programs are highly structured files that can be recognized by operating systems and loaded into memory, analyzed for their dependencies, allocated resources, and ultimately executed. Each section of an executable program possesses distinct file and semantic boundaries, resembling puzzle pieces with varying shapes, textures, and sizes. These individualistic sections, when combined in diverse manners, constitute a complete executable program. This paper proposes a novel approach for the visualization and classification of malware. Specifically, we segment the grayscale images generated from malware binary files based on the section categories, resulting in multiple sub-images of different classes. These sub-images are then treated as multi-channel images and input into a deep convolutional neural network for malware classification. Experimental results demonstrate that images of different malware section classes exhibit favorable classification characteristics. Additionally, we discuss how the width alignment of malware grayscale images can influence the performance of the model.
View on arXiv