
Embedding Compression via Spherical Coordinates

Han Xiao
Main: 5 Pages
3 Figures
Bibliography: 2 Pages
13 Tables
Appendix: 5 Pages
Abstract

We present an $\epsilon$-bounded compression method for unit-norm embeddings that achieves $1.5\times$ compression, 25% better than the best prior lossless method. The method exploits the fact that spherical coordinates of high-dimensional unit vectors concentrate around $\pi/2$, causing IEEE 754 exponents to collapse to a single value and high-order mantissa bits to become predictable, enabling entropy coding of both. Reconstruction error is bounded by float32 machine epsilon ($1.19 \times 10^{-7}$), making reconstructed values indistinguishable from originals at float32 precision. Evaluation across 26 configurations spanning text, image, and multi-vector embeddings confirms consistent compression improvement with zero measurable retrieval degradation on BEIR benchmarks.
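
The concentration effect described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the standard hyperspherical angle convention, uses a random Gaussian direction as a stand-in for a real embedding, and omits the entropy coder entirely. The function names `to_spherical`, `from_spherical`, and `float32_exponents` are hypothetical. Any float32 value in $[1, 2)$ has biased exponent 127, and $\pi/2 \approx 1.57$ lies in that interval, so angles clustered near $\pi/2$ share one exponent.

```python
import numpy as np


def to_spherical(x):
    """Map a unit vector of length n to n-1 hyperspherical angles."""
    x = np.asarray(x, dtype=np.float64)
    # tail[k] = Euclidean norm of x[k:], built from a reversed cumulative sum.
    tail = np.sqrt(np.cumsum((x ** 2)[::-1])[::-1])
    # Angle k lies in [0, pi]: it compares x[k] against the norm of x[k+1:].
    angles = np.arctan2(tail[1:], x[:-1])
    # The final angle keeps the sign of the last coordinate, range (-pi, pi].
    angles[-1] = np.arctan2(x[-1], x[-2])
    return angles


def from_spherical(angles):
    """Inverse map: n-1 angles back to a unit vector of length n."""
    angles = np.asarray(angles, dtype=np.float64)
    # x[k] = sin(a[0]) * ... * sin(a[k-1]) * cos(a[k]); the last entry has no cosine.
    sin_prod = np.concatenate(([1.0], np.cumprod(np.sin(angles))))
    cos_ext = np.concatenate((np.cos(angles), [1.0]))
    return sin_prod * cos_ext


def float32_exponents(values):
    """Biased 8-bit IEEE 754 exponent field of each value stored as float32."""
    bits = np.asarray(values, dtype=np.float32).view(np.uint32)
    return (bits >> 23) & 0xFF


# Assumption: a random Gaussian direction stands in for a real embedding.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
x /= np.linalg.norm(x)

angles = to_spherical(x)
share_127 = np.mean(float32_exponents(angles) == 127)
distinct_cartesian = np.unique(float32_exponents(x)).size
roundtrip_err = np.max(np.abs(from_spherical(angles) - x))

print(f"median angle: {np.median(angles):.4f} (pi/2 = {np.pi / 2:.4f})")
print(f"share of angles with exponent 127: {share_127:.3f}")
print(f"distinct exponents among Cartesian coordinates: {distinct_cartesian}")
print(f"round-trip max error (float64): {roundtrip_err:.2e}")
```

In this sketch the Cartesian coordinates spread over several exponent values, while nearly all angles land in $[1, 2)$. The exceptions are the last few angles, which depend on only a handful of remaining coordinates and so are less concentrated; the final angle ranges over $(-\pi, \pi]$.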
