
The Capacity of Private Information Retrieval from Byzantine and Colluding Databases

Abstract

We consider the problem of single-round private information retrieval (PIR) from $N$ replicated databases. We consider the case when $B$ databases are outdated (unsynchronized), or even worse, adversarial (Byzantine), and therefore can return incorrect answers. In the PIR problem with Byzantine databases (BPIR), a user wishes to retrieve a specific message from a set of $M$ messages with zero error, irrespective of the actions performed by the Byzantine databases. We consider the $T$-privacy constraint in this paper, where any $T$ databases can collude and exchange the queries submitted by the user. We derive the information-theoretic capacity of this problem, which is the maximum number of \emph{correct symbols} that can be retrieved privately (under the $T$-privacy constraint) for every symbol of the downloaded data. We determine the exact BPIR capacity to be $C=\frac{N-2B}{N}\cdot\frac{1-\frac{T}{N-2B}}{1-\left(\frac{T}{N-2B}\right)^M}$ if $2B+T < N$. This capacity expression shows that the effect of Byzantine databases on the retrieval rate is equivalent to removing $2B$ databases from the system, with a penalty factor of $\frac{N-2B}{N}$, which signifies that even though the number of databases needed for PIR is effectively $N-2B$, the user still needs to access all $N$ databases. The result shows that for the unsynchronized PIR problem, if the user does not have any knowledge about the fraction of the messages that are mis-synchronized, the single-round capacity is the same as the BPIR capacity. Our achievable scheme extends the optimal achievable scheme for the robust PIR (RPIR) problem to correct the \emph{errors} introduced by the Byzantine databases, as opposed to the \emph{erasures} in the RPIR problem. Our converse proof uses the idea of the cut-set bound in the network coding problem against adversarial nodes.
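As a quick numerical illustration of the capacity expression, the following minimal sketch evaluates $C$ in exact rational arithmetic. The function name `bpir_capacity` and its argument checks are assumptions made for this example and are not taken from the paper; in particular, the abstract states the formula only for $2B+T<N$, so the sketch simply rejects any other input rather than guessing a value.

```python
from fractions import Fraction


def bpir_capacity(n: int, b: int, t: int, m: int) -> Fraction:
    """Evaluate C = (N-2B)/N * (1 - rho) / (1 - rho^M), with rho = T/(N-2B).

    Hypothetical helper for illustration only. The abstract gives this
    expression for 2B + T < N, so other inputs raise ValueError.
    """
    if b < 0 or t < 1 or m < 1:
        raise ValueError("expected B >= 0, T >= 1, M >= 1")
    if 2 * b + t >= n:
        raise ValueError("formula is stated only for 2B + T < N")

    effective = n - 2 * b              # databases left after 'removing' 2B
    rho = Fraction(t, effective)       # T / (N - 2B), strictly below 1 here
    penalty = Fraction(effective, n)   # (N - 2B) / N
    return penalty * (1 - rho) / (1 - rho ** m)


if __name__ == "__main__":
    # N = 5 databases, B = 1 Byzantine, T = 1 (no collusion), M = 2 messages
    print(bpir_capacity(5, 1, 1, 2))
```

For $N=5$, $B=1$, $T=1$, $M=2$ this gives $\frac{3}{5}\cdot\frac{2/3}{8/9}=\frac{9}{20}$. Setting $B=0$ removes the penalty factor and leaves the expression with $N-2B=N$, for example $\frac{2}{3}$ for $N=2$, $T=1$, $M=2$.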
