Outlier Detection Using Vector Cosine Similarity by Adding a Dimension

Digital Signal Processing and Signal Processing Education Workshop (SPSPE), 2024
Zhongyang Shen
Main: 4 Pages
3 Figures
Bibliography: 1 Page
5 Tables
Abstract

We propose a new outlier detection method for multi-dimensional data. The method detects outliers using vector cosine similarity on a new dataset constructed by appending one extra dimension, set to zero, to every point of the original data. When a point in the new dataset is selected as the measured point, an observation point is created to serve as the origin of the vectors; it matches the measured point in every coordinate except the new dimension, where it takes a non-zero value. Vectors are then formed from the observation point to the measured point and to the other points in the dataset. Comparing the cosine similarities of these vectors identifies abnormal data. An optimized implementation (MDOD) is available on PyPI.
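
The construction described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration and not the MDOD package's API: the function name cosine_outlier_scores, the offset parameter h, and the choice to score each point by the mean cosine similarity over all other points are assumptions made here for demonstration, since the abstract does not specify how the similarities are aggregated.

```python
import numpy as np


def cosine_outlier_scores(X, h=1.0):
    """Score each row of X; a lower score suggests a more outlying point.

    Hypothetical helper: `h` (the non-zero value in the added dimension)
    and the mean aggregation are assumptions, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    if h == 0:
        raise ValueError("h must be non-zero")
    n = X.shape[0]
    if n < 2:
        raise ValueError("need at least two points")

    # New dataset: the original data plus one extra dimension of zeros.
    Xa = np.hstack([X, np.zeros((n, 1))])

    scores = np.empty(n)
    for i in range(n):
        # Observation point: same as the measured point except in the
        # new dimension, where it takes the non-zero value h.
        obs = Xa[i].copy()
        obs[-1] = h

        # Vector from the observation point to the measured point.
        v = Xa[i] - obs
        # Vectors from the observation point to every other point.
        W = np.delete(Xa, i, axis=0) - obs

        # Cosine similarity between v and each row of W. Every row of W
        # has -h in its last coordinate, so no norm is zero.
        cos = (W @ v) / (np.linalg.norm(W, axis=1) * np.linalg.norm(v))

        # Assumed aggregation: average similarity to the other points.
        scores[i] = cos.mean()
    return scores


if __name__ == "__main__":
    # Four clustered points and one distant point (made-up toy data).
    X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]])
    scores = cosine_outlier_scores(X)
    print("index of lowest score:", int(np.argmin(scores)))
```

Under this construction the cosine between the two vectors reduces to |h| / sqrt(d^2 + h^2), where d is the distance between the measured point and the other point in the original space. Nearby points therefore give similarities close to 1 and distant points give similarities close to 0, which is why a low aggregate score is treated as a sign of an outlier in this sketch.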
