arXiv:2411.11421

Enabling DBSCAN for Very Large-Scale High-Dimensional Spaces

18 November 2024
Yongyu Wang
Abstract

DBSCAN is one of the most important non-parametric unsupervised data analysis tools. By applying DBSCAN to a dataset, two key analytical results can be obtained: (1) clustering data points based on density distribution and (2) identifying outliers in the dataset. However, the time complexity of the DBSCAN algorithm is $O(n^2 \beta)$, where $n$ is the number of data points and $\beta = O(D)$, with $D$ representing the dimensionality of the data space. As a result, DBSCAN becomes computationally infeasible when both $n$ and $D$ are large. In this paper, we propose a DBSCAN method based on spectral data compression, capable of efficiently processing datasets with a large number of data points ($n$) and high dimensionality ($D$). By preserving only the most critical structural information during the compression process, our method effectively removes substantial redundancy and noise. Consequently, the solution quality of DBSCAN is significantly improved, enabling more accurate and reliable results.
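To make the $O(n^2 \beta)$ cost and the two outputs (clusters and outliers) concrete, the following is a minimal sketch of plain, brute-force DBSCAN. It is not the spectral-compression method proposed in the paper, whose details the abstract does not give. The names `region_query`, `dbscan`, `eps`, and `min_pts` are illustrative choices, and Euclidean distance is an assumption. Each neighborhood query scans all $n$ points and each distance costs $O(D)$, and a query is issued for every point, which is where the quadratic term comes from.

```python
import math
from collections import deque

NOISE = -1  # label used for outliers


def region_query(points, i, eps):
    """Indices of all points within eps of points[i].

    Brute force: scans all n points, each distance costs O(D).
    """
    p = points[i]
    return [j for j, q in enumerate(points) if math.dist(p, q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep growing
    return labels


# Two dense groups of four points each, plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
# → [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

In this toy input the two dense groups become clusters 0 and 1, and the isolated point is labeled as an outlier. Since one `region_query` call is made per point, the total work is on the order of $n^2$ distance computations of cost $O(D)$ each, which matches the bound quoted in the abstract.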
