Load Balancing for Skewed Streams on Heterogeneous Cluster

Primitive partitioning strategies for streaming applications operate efficiently under two very strict assumptions: the resources are homogeneous and the messages are drawn from a uniform key distribution. These assumptions are often not true for the real-world use cases. Dealing with heterogeneity and non-uniform workload requires inferring the resource capacities and input distribution at run time. However, gathering these statistics and finding an optimal placement often become a challenge when microsecond latency is desired. In this paper, we address the load balancing problem for streaming engines running on a heterogeneous cluster and processing skewed workload. In doing so, we propose a novel partitioning strategy called Consistent Grouping (CG) that is inspired by traditional consistent hashing. CG is a lightweight distributed strategy that enables each processing element instance (PEIs) to process the workload according to its capacity. The main idea behind CG is the notion of equal-sized virtual workers at the sources, which are assigned to workers based on their capacities. We provide a theoretical analysis of the proposed algorithm and show via extensive empirical evaluation that the proposed scheme outperforms the state-of-the-art approaches. In particular, CG achieves 3.44x superior performance in terms of latency compared to key grouping, which is the state-of-the-art grouping strategy for stateful streaming applications.
View on arXiv