HashTag Erasure Codes: From Theory to Practice
Conventional erasure codes such as Reed-Solomon (RS) save storage space, but at the cost of higher repair bandwidth and more complex computations than replication. Minimum-Storage Regenerating (MSR) codes have emerged as a viable alternative to RS codes, as they minimize the repair bandwidth while still being optimal in terms of reliability and storage. Although several MSR code constructions exist, so far they have not been implemented in practice. One of the main reasons is that existing MSR code constructions require a much larger number of I/O operations than RS codes. In this paper, we analyze MDS codes that are simultaneously optimal in terms of storage, reliability, I/O operations, and repair bandwidth for single and multiple failures of the systematic nodes. Due to the resemblance between the hashtag sign \# and the construction procedure of these codes, we call them \emph{HashTag Erasure Codes (HTECs)}. HTECs provide the lowest data-read and data-transfer costs for an arbitrary sub-packetization level among all existing solutions for distributed storage. The repair process is linear and highly parallel. Additionally, we show that HTECs are the first high-rate MDS codes that reduce the repair bandwidth for multiple failures.