
Secure MapReduce Power Iteration in the Cloud

Abstract

With the development and wide deployment of web services, mobile applications, and sensor networks, data are now collected from many distributed sources to form big datasets. This paradigm poses a number of challenges for data storage and analysis. Typical data collectors have limited capacity to store and process large volumes of data, and the collected data may be highly sensitive, requiring secure storage and processing. Processing and analyzing such large-scale data may also require a significant investment in computing infrastructure, which can be prohibitively expensive for many users. With these problems in mind, we envision a cloud-based data storage and processing framework that enables users to handle big datasets economically and securely. Specifically, we develop a cloud-based MapReduce implementation of the power iteration algorithm in which the source matrix and intermediate computational values remain confidential. Our approach uses an iterative processing model in which the user interacts with the vast computing resources and encrypted data in the cloud until a converged solution is obtained. The security of this approach is guaranteed using Paillier encryption and a random perturbation technique. We carefully analyze its resilience to attacks. Our experimental results show that the proposed method scales to big matrices while requiring low client-side costs.
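For orientation, the underlying power iteration can be sketched in plain, unencrypted form as below. This is only a minimal illustration of the standard algorithm, not the secure protocol described in the abstract: it omits the MapReduce distribution, the Paillier encryption, and the random perturbation entirely. The function name, parameters, and the eigenvalue-based stopping rule are assumptions made for this example.

```python
import math


def power_iteration(matrix, max_iters=1000, tol=1e-10):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix.

    Hypothetical helper for illustration; `matrix` is a list of rows.
    """
    n = len(matrix)
    # Start from a unit vector with equal components (an arbitrary choice).
    vec = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(max_iters):
        # Matrix-vector product: the step a MapReduce job would distribute.
        product = [sum(a * x for a, x in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(p * p for p in product))
        if norm == 0.0:
            raise ValueError("iterate collapsed to the zero vector")
        # Rayleigh quotient of the current unit vector.
        new_eigenvalue = sum(v * p for v, p in zip(vec, product))
        converged = abs(new_eigenvalue - eigenvalue) < tol
        vec = [p / norm for p in product]
        eigenvalue = new_eigenvalue
        if converged:
            break
    return eigenvalue, vec


# Usage: a diagonal matrix whose dominant eigenvalue is 2.
lam, v = power_iteration([[2.0, 0.0], [0.0, 1.0]])
```

In the setting the abstract describes, the matrix-vector product would presumably be the part carried out in the cloud over encrypted data, while the client handles the lightweight normalization and convergence check; that division of labor is an assumption based on the abstract's wording.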
