We introduce a new algorithm to learn on the fly the parameter value $\theta_\star$ of a parametric model $\{f_\theta,\ \theta\in\Theta\}$ from a sequence of independent copies of a random variable $Y$. The main idea of the proposed approach is to define a sequence $(\pi_t)_{t\geq 1}$ of probability distributions on $\Theta$ which (i) is shown to concentrate on $\theta_\star$ as $t\to\infty$ and (ii) can be estimated in an online fashion by means of a standard particle filter (PF) algorithm. The sequence $(\pi_t)_{t\geq 1}$ depends on a learning rate $(h_t)_{t\geq 1}$: the slower $(h_t)_{t\geq 1}$ converges to zero, the greater is the ability of the PF approximation of $\pi_t$ to escape from a local optimum of the objective function, but the slower is the rate at which $\pi_t$ concentrates on $\theta_\star$. To reconcile the ability to escape from a local optimum with fast convergence towards $\theta_\star$, we exploit the acceleration property of averaging, well known in the stochastic gradient descent literature, by letting the proposed estimator $\bar{\theta}_t$ of $\theta_\star$ be an average of the successive PF estimates. Our numerical experiments suggest that $\bar{\theta}_t$ converges to $\theta_\star$ at the optimal rate in challenging models, even in situations where $\pi_t$ concentrates on this parameter value at a slower rate. We illustrate the practical usefulness of the proposed optimization algorithm for online parameter learning and for computing the maximum likelihood estimator.
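To make the mechanics concrete, below is a minimal Python sketch of this type of scheme on a toy Gaussian mean-estimation problem: a particle cloud on the parameter space is jittered by a kernel whose width decays at a rate set by the learning rate, reweighted by the likelihood of the new observation, resampled as in a standard particle filter, and the running average of the particle means serves as the estimator. The Gaussian model, the jitter kernel, and the $t^{-\alpha}$ schedule are illustrative assumptions, not the specific construction of the paper.

```python
# Illustrative sketch only: a particle-filter-style online estimator of the mean
# theta_star of a 1-D Gaussian from i.i.d. observations, with averaging of the
# particle means.  The jitter kernel, likelihood, and t^{-alpha} learning-rate
# schedule are toy choices, not the paper's construction.
import numpy as np

rng = np.random.default_rng(0)

theta_star = 2.0   # unknown parameter value to be learned
N = 500            # number of particles on the parameter space Theta
alpha = 0.6        # learning rate: slower decay -> easier escape, slower concentration
T = 5_000          # number of online observations

particles = rng.normal(0.0, 5.0, size=N)  # initial particle cloud on Theta
running_sum = 0.0                          # accumulates particle means for averaging

for t in range(1, T + 1):
    y_t = rng.normal(theta_star, 1.0)      # new independent copy of Y

    # 1) Move step: jitter the particles; the kernel width shrinks like t^{-alpha},
    #    so the approximated distribution concentrates as t grows.
    particles = particles + rng.normal(0.0, t ** (-alpha), size=N)

    # 2) Weight step: reweight each particle by the likelihood f(y_t; theta).
    log_w = -0.5 * (y_t - particles) ** 2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()

    # 3) Resample step (multinomial), as in a standard particle filter.
    particles = rng.choice(particles, size=N, replace=True, p=w)

    # 4) Estimates: raw particle mean and its running average (the averaged
    #    estimate plays the role of the iterate-averaged estimator).
    theta_hat_t = particles.mean()
    running_sum += theta_hat_t
    theta_bar_t = running_sum / t

print(f"particle mean : {theta_hat_t:.3f}")
print(f"averaged mean : {theta_bar_t:.3f}  (target {theta_star})")
```

In this toy setting the averaged estimate is typically noticeably more stable than the raw particle mean at any fixed time, which is the practical motivation for averaging in the abstract above.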