v1v2 (latest)

qc-kmeans: A Quantum Compressive K-Means Algorithm for NISQ Devices

26 October 2025

Pedro Chumpitaz-Flores

My Duong

Ying Mao

Kaixun Hua

ArXiv (abs)PDF HTML Github

Main:9 Pages

4 Figures

Bibliography:1 Pages

3 Tables

Abstract

Clustering on NISQ hardware is constrained by data loading and limited qubits. We present \textbf{qc-kmeans}, a hybrid compressive $k$ -means that summarizes a dataset with a constant-size Fourier-feature sketch and selects centroids by solving small per-group QUBOs with shallow QAOA circuits. The QFF sketch estimator is unbiased with mean-squared error $O(\varepsilon^2)$ for $B,S=\Theta(\varepsilon^{-2})$ , and the peak-qubit requirement $q_{\text{peak}}=\max\{D,\lceil \log_2 B\rceil + 1\}$ does not scale with the number of samples. A refinement step with elitist retention ensures non-increasing surrogate cost. In Qiskit Aer simulations (depth $p{=}1$ ), the method ran with $\le 9$ qubits on low-dimensional synthetic benchmarks and achieved competitive sum-of-squared errors relative to quantum baselines; runtimes are not directly comparable. On nine real datasets (up to $4.3\times 10^5$ points), the pipeline maintained constant peak-qubit usage in simulation. Under IBM noise models, accuracy was similar to the idealized setting. Overall, qc-kmeans offers a NISQ-oriented formulation with shallow, bounded-width circuits and competitive clustering quality in simulation.

View on arXiv

Comments on this paper