Parallel resampling in the particle filter

Abstract

Modern parallel computing devices, such as the graphics processing unit (GPU), have gained significant traction in scientific and statistical computing. They are particularly well-suited to data-parallel algorithms such as the particle filter, used in signal processing, object tracking and statistical inference. The particle filter carries a set of weighted particles through repeated propagation, weighting and resampling steps. The propagation and weighting steps are straightforward to parallelise, as they require only independent operations on each particle. The resampling step is more difficult, as it may require a collective operation, such as a sum, across particle weights. Focusing on this resampling step, we analyse a number of commonly-used algorithms (multinomial, stratified and systematic resamplers), as well as two rarely-used alternatives that do not involve a collective operation (Metropolis and rejection resamplers). We find that, in certain circumstances, the Metropolis and rejection resamplers can perform significantly faster on the GPU, and to a lesser extent on the CPU, than the commonly-used approaches. Moreover, in single precision, the commonly-used approaches are numerically biased for upwards of hundreds of thousands of particles, while the alternatives are not. This is particularly important given the significantly greater single- than double-precision throughput of modern devices, and the consequent temptation to use single precision with a great number of particles. Finally, we provide a number of auxiliary functions useful for implementation, such as for the permutation of ancestry vectors to enable in-place propagation.
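To make the contrast concrete, here is a minimal Python sketch of a systematic resampler, which needs a collective pass over the weights (a sum and a cumulative walk), alongside a Metropolis resampler, which touches only pairwise weight ratios. The function names, the fixed chain length `B`, and the use of Python's `random` module are illustrative assumptions, not the paper's implementation.

```python
import random

def systematic_resample(weights):
    """Systematic resampling: requires a collective operation over
    the weights (the sum, then a walk along the cumulative weights),
    which is the part that is hard to parallelise."""
    n = len(weights)
    step = sum(weights) / n           # collective operation: a sum
    u = random.uniform(0.0, step)     # single uniform offset
    ancestors = []
    i, cum = 0, weights[0]
    for j in range(n):
        target = u + j * step
        while cum < target:           # advance along cumulative weights
            i += 1
            cum += weights[i]
        ancestors.append(i)
    return ancestors

def metropolis_resample(weights, B=20):
    """Metropolis resampling: each particle runs an independent
    Metropolis chain over ancestor indices, so no collective
    operation across the weights is needed. B is an assumed fixed
    chain length; larger B reduces the bias of the chain."""
    n = len(weights)
    ancestors = []
    for i in range(n):
        k = i
        for _ in range(B):
            j = random.randrange(n)   # propose a uniform random index
            # accept with probability min(1, w_j / w_k), written
            # without division to tolerate zero weights
            if random.random() * weights[k] <= weights[j]:
                k = j
        ancestors.append(k)
    return ancestors
```

Each Metropolis chain reads only two weights per step and is independent of the other chains, which maps naturally onto one GPU thread per particle; the systematic resampler, by contrast, hinges on the cumulative sum of all weights.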
