
Stream Clipper: Scalable Submodular Maximization on Stream

Abstract

We propose a streaming submodular maximization algorithm, "stream clipper", that in practice performs as well as the offline greedy algorithm on document and video summarization. It adds each element from the stream either to a solution set $S$ or to an extra buffer $B$ based on two adaptive thresholds, and improves $S$ with a final greedy step that starts from $S$ and adds elements from $B$. During this process, elements can be swapped out of $S$ if doing so yields an improvement. The thresholds adapt according to whether current memory utilization exceeds a budget; e.g., the algorithm increases the lower threshold and removes from the buffer $B$ the elements that fall below the new lower threshold. While our worst-case approximation factor is $1/2$ (as in previous work, and matching the tight bound), we show that there are data-dependent conditions under which our bound falls within the range $[1/2, 1-1/e]$. In news and video summarization experiments, the algorithm consistently outperforms other streaming methods and, while using significantly less computation and memory, performs similarly to the offline greedy algorithm.
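The two-threshold idea described above can be illustrated with a small Python sketch. This is not the paper's algorithm: the function names, the set-coverage objective, the fixed upper threshold, the multiplicative `growth` rule for the lower threshold, and the buffer budget are all assumptions chosen for illustration, and the swapping step is omitted.

```python
from typing import Callable, Dict, Iterable, List, Set


def coverage(sets: Dict[int, Set[int]]) -> Callable[[List[int]], int]:
    """Build a toy submodular objective: f(S) = size of the union of chosen sets."""
    def f(S: List[int]) -> int:
        covered: Set[int] = set()
        for i in S:
            covered |= sets[i]
        return len(covered)
    return f


def two_threshold_stream(stream: Iterable[int], f: Callable[[List[int]], int],
                         k: int, upper: float, lower: float,
                         buffer_budget: int, growth: float = 1.2) -> List[int]:
    """Illustrative two-threshold streaming selection (assumed rules, no swapping)."""
    if lower <= 0 or growth <= 1:
        raise ValueError("need lower > 0 and growth > 1 so eviction terminates")
    S: List[int] = []  # solution set
    B: List[int] = []  # buffer of "maybe later" elements
    for v in stream:
        gain = f(S + [v]) - f(S)
        if len(S) < k and gain >= upper:
            S.append(v)  # clearly valuable: take it immediately
        elif gain >= lower:
            B.append(v)  # borderline: defer the decision
            # Over budget: raise the lower threshold and evict buffered
            # elements whose current marginal gain falls below it.
            while len(B) > buffer_budget:
                lower *= growth
                B = [u for u in B if f(S + [u]) - f(S) >= lower]
        # otherwise the element is discarded
    # Final greedy step: start from S and add the best buffered elements.
    while len(S) < k and B:
        best = max(B, key=lambda u: f(S + [u]) - f(S))
        if f(S + [best]) - f(S) <= 0:
            break
        S.append(best)
        B.remove(best)
    return S


# Toy data: each stream item covers a set of "concepts".
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 2, 3, 4, 5, 6}, 4: {7, 8}}
f = coverage(sets)
summary = two_threshold_stream(range(5), f, k=2, upper=4, lower=1, buffer_budget=2)
print(summary, f(summary))  # prints: [3, 4] 8
```

In this toy run, item 3 clears the upper threshold and enters the solution directly, while item 4 survives in the buffer after the lower threshold is raised and is added by the final greedy step.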
