A Reliable Effective Terascale Linear Learning System
Journal of machine learning research (JMLR), 2011
Abstract
We present a system and a set of techniques for learning linear predictors with convex losses on terascale datasets, with trillions of features (the number of non-zero entries in the data matrix), billions of training examples and millions of parameters in an hour using a cluster of 1000 machines. One of the core techniques used is a new communication infrastructure---often referred to as AllReduce---implemented for compatibility with MapReduce clusters. The communication infrastructure appears broadly reusable for many other tasks. We also show the effectiveness of a hybrid online-batch approach for optimization in distributed settings.
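The AllReduce operation mentioned above has simple semantics: every node contributes a local vector (e.g. a gradient computed on its shard of the data) and every node receives the elementwise sum. A minimal single-process simulation of those semantics, assuming a hypothetical `allreduce` helper (the paper's actual implementation runs over a spanning tree of cluster nodes, which this sketch does not model):

```python
def allreduce(local_vectors):
    """Simulate AllReduce: each worker contributes a vector and all
    workers receive the same elementwise sum. A real implementation
    reduces up and broadcasts down a spanning tree of the cluster."""
    dim = len(local_vectors[0])
    total = [sum(v[i] for v in local_vectors) for i in range(dim)]
    # Every worker ends up holding an identical copy of the sum.
    return [list(total) for _ in local_vectors]

# Four simulated workers, each holding a local gradient over 2 parameters.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
summed = allreduce(grads)
# After the call, all four workers hold the same global gradient sum.
```

In the hybrid online-batch approach, each node could run online (stochastic) updates locally, then use such an AllReduce step to aggregate gradients or weights for a batch-style global update.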
