AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval

Graph-based approximate nearest neighbor search (ANNS) algorithms are effective for large-scale vector retrieval. Among such methods, DiskANN achieves good recall-speed tradeoffs using both DRAM and storage. DiskANN adopts product quantization (PQ) to reduce memory usage, but that usage is still proportional to the scale of the dataset. In this paper, we propose All-in-Storage ANNS with Product Quantization (AiSAQ), which offloads compressed vectors to the SSD index. Our method achieves 10 MB memory usage in query search with billion-scale datasets without critical latency degradation. AiSAQ also reduces the index load time for query search preparation, which enables fast switching between multiple billion-scale indices. This method can be applied to retrievers of retrieval-augmented generation (RAG) and be scaled out with multiple-server systems for emerging datasets. Our DiskANN-based implementation is available on GitHub.
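To illustrate why keeping PQ codes in DRAM scales with the dataset while the PQ codebook does not, here is a rough back-of-the-envelope sketch. The function names and the parameter choices (32 one-byte subspaces per vector, 128-dimensional float32 vectors, 256 centroids per subspace) are illustrative assumptions, not values taken from the paper, and the sketch does not attempt to account for the 10 MB figure reported above.

```python
def pq_codes_bytes(num_vectors: int, num_subspaces: int) -> int:
    """Bytes needed to hold one PQ code per vector.

    Assumes 1 byte per subspace (i.e. at most 256 centroids per subspace).
    This term grows linearly with the number of vectors.
    """
    return num_vectors * num_subspaces


def pq_codebook_bytes(dim: int, centroids_per_subspace: int = 256,
                      bytes_per_float: int = 4) -> int:
    """Bytes needed for the PQ centroid tables.

    Each of the M subspaces stores `centroids_per_subspace` centroids of
    dim / M floats, so the total is centroids * dim floats regardless of M
    and regardless of how many vectors are indexed.
    """
    return centroids_per_subspace * dim * bytes_per_float


if __name__ == "__main__":
    # Hypothetical settings chosen only for illustration.
    dim, num_subspaces = 128, 32
    for n in (10**6, 10**9):
        codes_gb = pq_codes_bytes(n, num_subspaces) / 1e9
        print(f"{n:>13,} vectors: PQ codes = {codes_gb:.3f} GB")
    print(f"codebook = {pq_codebook_bytes(dim) / 1024:.0f} KiB (scale-independent)")
```

Under these assumptions, the per-vector codes are the only term that grows with the dataset, which is the part an all-in-storage design moves out of DRAM.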
@article{tatsuno2025_2404.06004,
  title={AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval},
  author={Kento Tatsuno and Daisuke Miyashita and Taiga Ikeda and Kiyoshi Ishiyama and Kazunari Sumiyoshi and Jun Deguchi},
  journal={arXiv preprint arXiv:2404.06004},
  year={2025}
}