Embedding Compression via Spherical Coordinates
Han Xiao
Main: 5 pages; Bibliography: 2 pages; Appendix: 5 pages; 3 figures; 13 tables
Abstract
We present an $\epsilon$-bounded compression method for unit-norm embeddings that achieves $1.5\times$ compression, 25% better than the best prior lossless method. The method exploits that spherical coordinates of high-dimensional unit vectors concentrate around $\pi/2$, causing IEEE 754 exponents to collapse to a single value and high-order mantissa bits to become predictable, enabling entropy coding of both. Reconstruction error is bounded by float32 machine epsilon ($\approx 1.19 \times 10^{-7}$), making reconstructed values indistinguishable from originals at float32 precision. Evaluation across 26 configurations spanning text, image, and multi-vector embeddings confirms consistent compression improvement with zero measurable retrieval degradation on BEIR benchmarks.
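The exponent-collapse effect described in the abstract can be illustrated with a small experiment. The sketch below is not the paper's implementation: it assumes the standard hyperspherical angle convention, uses a normalised Gaussian vector as a stand-in for a real embedding, and every function name is hypothetical. It compares how many distinct float32 exponent values appear in the Cartesian coordinates against how many appear in the angles.

```python
import math
import random
import struct
from collections import Counter


def random_unit_vector(dim, rng):
    """Draw a vector uniformly from the unit sphere (normalised Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def to_spherical_angles(v):
    """Return the dim-1 hyperspherical angles of a vector (assumed convention)."""
    # tail[i] = sqrt(v[i]^2 + ... + v[-1]^2), accumulated back to front.
    tail = [0.0] * len(v)
    acc = 0.0
    for i in range(len(v) - 1, -1, -1):
        acc += v[i] * v[i]
        tail[i] = math.sqrt(acc)
    angles = []
    for i in range(len(v) - 2):
        ratio = v[i] / tail[i] if tail[i] > 0.0 else 1.0
        # Clamp against rounding so acos stays in its domain.
        angles.append(math.acos(max(-1.0, min(1.0, ratio))))
    # The last angle covers the full circle, so it uses atan2.
    angles.append(math.atan2(v[-1], v[-2]))
    return angles


def float32_exponent(x):
    """Biased 8-bit exponent field of x after rounding to float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 23) & 0xFF


rng = random.Random(0)
vec = random_unit_vector(1024, rng)
angles = to_spherical_angles(vec)

coord_exponents = Counter(float32_exponent(x) for x in vec)
angle_exponents = Counter(float32_exponent(a) for a in angles)
top_exponent, top_count = angle_exponents.most_common(1)[0]

print("distinct exponents, Cartesian:", len(coord_exponents))
print("distinct exponents, angles:   ", len(angle_exponents))
print("most common angle exponent:", top_exponent,
      "share:", top_count / len(angles))
```

Any angle in the interval [1, 2) radians, which contains $\pi/2$, has the biased float32 exponent 127, so angles that concentrate there share one exponent value. The last few angles depend on only a handful of coordinates and concentrate less, which is why a small number of other exponents can still appear.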
