
Efficient Data-Driven Leverage Score Sampling Algorithm for the Minimum Volume Covering Ellipsoid Problem in Big Data

Abstract

The Minimum Volume Covering Ellipsoid (MVCE) problem, characterised by $n$ observations in $d$ dimensions where $n \gg d$, can be computationally very expensive in the big data regime. We apply methods from randomised numerical linear algebra to develop a data-driven leverage score sampling algorithm for solving MVCE, and establish theoretical error bounds and a convergence guarantee. Assuming the leverage scores follow a power law decay, we show that the computational complexity of computing the approximation for MVCE is reduced from $\mathcal{O}(nd^2)$ to $\mathcal{O}(nd + \text{poly}(d))$, which is a significant improvement in big data problems. Numerical experiments demonstrate the efficacy of our new algorithm, showing that it substantially reduces computation time and yields near-optimal solutions.
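
To make the general idea concrete, the following is a minimal sketch of leverage score sampling followed by an MVCE solve on the subsample. It is not the paper's algorithm: the function names (`leverage_scores`, `centered_mvce`, `sampled_mvce`), the origin-centred formulation, the Khachiyan-style first-order solver, and the choice to sample with replacement and then deduplicate are all assumptions made for illustration. In particular, the sketch computes exact leverage scores through a thin QR factorisation, which itself costs $\mathcal{O}(nd^2)$, so it does not reproduce the complexity reduction claimed in the abstract.

```python
import numpy as np

def leverage_scores(X):
    """Exact row leverage scores of X (n x d) via a thin QR factorisation.

    Assumption: exact scores are used for clarity; this step alone is O(n d^2).
    """
    Q, _ = np.linalg.qr(X, mode="reduced")
    return np.sum(Q**2, axis=1)

def centered_mvce(X, tol=1e-2, max_iter=10000):
    """Approximate origin-centred MVCE {x : x^T A x <= 1} of the rows of X.

    Uses a Khachiyan-style update on the weights u (an assumed solver,
    not necessarily the one used in the paper).
    """
    n, d = X.shape
    u = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        M = X.T @ (u[:, None] * X)
        g = np.einsum("ij,ij->i", X @ np.linalg.inv(M), X)
        j = np.argmax(g)
        if g[j] <= (1.0 + tol) * d:
            break
        step = (g[j] / d - 1.0) / (g[j] - 1.0)
        u *= 1.0 - step
        u[j] += step
    # Rescale by the largest value so that every input row is covered.
    M = X.T @ (u[:, None] * X)
    Minv = np.linalg.inv(M)
    g = np.einsum("ij,ij->i", X @ Minv, X)
    return Minv / g.max()

def sampled_mvce(X, m, rng, tol=1e-2):
    """Draw m rows with probability proportional to leverage, then solve MVCE.

    Returns the shape matrix A and the indices of the distinct sampled rows.
    The rows are used unscaled, since covering depends on the actual points.
    """
    p = leverage_scores(X)
    p = p / p.sum()
    idx = np.unique(rng.choice(X.shape[0], size=m, replace=True, p=p))
    return centered_mvce(X[idx], tol=tol), idx
```

A call such as `A, idx = sampled_mvce(X, m, np.random.default_rng(0))` returns an ellipsoid that covers the sampled rows by construction. Whether it also covers, or nearly covers, the remaining rows depends on the sample size and the distribution of the leverage scores, which is the question the paper's error bounds address.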
