Chunking Attacks on File Backup Services using Content-Defined Chunking

Boris Alexeev
Colin Percival
Yan X Zhang
Abstract

File backup services and similar systems often use content-defined chunking (CDC) algorithms, especially those based on rolling hash techniques, to split files into chunks in a way that allows for data deduplication. These chunking algorithms often depend on per-user parameters in an attempt to avoid leaking information about the data being stored. We present attacks that extract these chunking parameters, and we discuss protocol-agnostic attacks and the resulting loss of security once the parameters are compromised (including the case where these parameters are not set up at all, which is often permitted as an option). Our parameter-extraction attacks are themselves protocol-specific, but their ideas generalize to many potential CDC schemes.
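
As a rough illustration of the kind of scheme the abstract describes, the following is a minimal sketch of content-defined chunking with a keyed, Gear-style rolling hash. It is not the algorithm of any particular backup service or of the paper; the names `make_table` and `cdc_chunks`, the key string, the mask, and the minimum chunk size are all assumptions chosen for the example.

```python
import hashlib
import random

def make_table(key):
    # Hypothetical per-user parameter: 256 pseudo-random 32-bit values
    # derived from a secret key, one per possible byte value.
    return [int.from_bytes(hashlib.sha256(key + bytes([b])).digest()[:4], "big")
            for b in range(256)]

def cdc_chunks(data, table, mask=0x3FF, min_size=64):
    # Gear-style rolling hash: shift left, add the table entry for the new byte.
    # A boundary is declared when the low bits of the hash are all zero,
    # so boundaries depend on the most recent bytes rather than on offsets.
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + table[b]) & 0xFFFFFFFF
        if i + 1 - start >= min_size and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks

# Demo: prepend one byte and count how many chunks are still identical.
rng = random.Random(0)
data = bytes(rng.getrandbits(8) for _ in range(50000))
table = make_table(b"hypothetical-user-key")
original = cdc_chunks(data, table)
shifted = cdc_chunks(b"X" + data, table)
shared = len(set(original) & set(shifted))
print(len(original), len(shifted), shared)
```

Because each boundary depends only on the bytes just before it, an insertion near the start of a file typically changes only the nearby chunks, which is what makes deduplication possible. In this sketch the key-derived table plays the role of the per-user parameters: a different key moves the boundaries for the same data.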

@article{alexeev2025_2504.02095,
  title={Chunking Attacks on File Backup Services using Content-Defined Chunking},
  author={Boris Alexeev and Colin Percival and Yan X Zhang},
  journal={arXiv preprint arXiv:2504.02095},
  year={2025}
}