
Mini-Batch Kernel $k$-means

8 October 2024
Ben Jourdan, Gregory Schwartzman
arXiv:2410.05902
Abstract

We present the first mini-batch kernel $k$-means algorithm, offering an order of magnitude improvement in running time compared to the full batch algorithm. A single iteration of our algorithm takes $\widetilde{O}(kb^2)$ time, significantly faster than the $O(n^2)$ time required by the full batch kernel $k$-means, where $n$ is the dataset size and $b$ is the batch size. Extensive experiments demonstrate that our algorithm consistently achieves a 10-100x speedup with minimal loss in quality, addressing the slow runtime that has limited kernel $k$-means adoption in practice. We further complement these results with a theoretical analysis under an early stopping condition, proving that with a batch size of $\widetilde{\Omega}(\max\{\gamma^{4}, \gamma^{2}\} \cdot \epsilon^{-2})$, the algorithm terminates in $O(\gamma^2/\epsilon)$ iterations with high probability, where $\gamma$ bounds the norm of points in feature space and $\epsilon$ is a termination threshold. Our analysis holds for any reasonable center initialization, and when using $k$-means++ initialization, the algorithm achieves an approximation ratio of $O(\log k)$ in expectation. For normalized kernels, such as the Gaussian or Laplacian kernel, it holds that $\gamma = 1$. Taking $\epsilon = O(1)$ and $b = \Theta(\log n)$, the algorithm terminates in $O(1)$ iterations, with each iteration running in $\widetilde{O}(k)$ time.
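
To make the kernel-trick computation behind these running times concrete, below is a minimal NumPy sketch of a mini-batch kernel $k$-means loop. It is an illustrative reconstruction from the abstract, not the paper's algorithm: the Gaussian kernel, the uniform random seeding, the function names, and the choice to represent each cluster by the full list of points assigned to it so far are all assumptions made here for readability.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * sq)

def minibatch_kernel_kmeans(X, k, b, iters, gamma=1.0, seed=0):
    """Illustrative mini-batch kernel k-means sketch.

    Each cluster c is kept implicitly as a list of assigned point
    indices; the squared feature-space distance to its mean is
        ||phi(x) - mu_c||^2
          = K(x, x) - (2/|c|) * sum_y K(x, y)
            + (1/|c|^2) * sum_{y,z} K(y, z).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Seed with k distinct random points as singleton clusters.
    # (The paper's O(log k) guarantee assumes k-means++ seeding.)
    clusters = [[i] for i in rng.choice(n, size=k, replace=False)]

    for _ in range(iters):
        batch = rng.choice(n, size=b, replace=False)
        Xb = X[batch]
        kxx = np.ones(b)  # K(x, x) = 1 for the Gaussian kernel
        d = np.empty((b, k))
        for c, idx in enumerate(clusters):
            Xc = X[idx]
            Kxc = rbf_kernel(Xb, Xc, gamma)  # (b, |c|) cross terms
            Kcc = rbf_kernel(Xc, Xc, gamma)  # (|c|, |c|) center norm
            d[:, c] = kxx - 2.0 * Kxc.mean(axis=1) + Kcc.mean()
        # Fold each batch point into its nearest implicit center.
        for j, c in enumerate(np.argmin(d, axis=1)):
            clusters[c].append(int(batch[j]))
    return clusters
```

In the spirit of the abstract's parameter setting, one might call this with a batch size proportional to $\log n$, e.g. `minibatch_kernel_kmeans(X, k=10, b=10 * int(np.log(len(X))), iters=20)`. Note, however, that because the per-cluster index lists grow without bound here, this sketch does not attain the paper's $\widetilde{O}(kb^2)$ per-iteration time; achieving that bound would require keeping each cluster's representative set small.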
