Particle Filtering Methods for Stochastic Optimization with Application to Large-Scale Empirical Risk Minimization

Abstract

There has been recent interest in developing statistical filtering methods for stochastic optimization (FSO) by leveraging a probabilistic perspective on incremental proximity methods (IPMs). Existing FSO methods are derived from the Kalman filter (KF) and the extended KF (EKF). Unlike classical stochastic optimization methods such as stochastic gradient descent (SGD) and typical IPMs, such KF-type algorithms possess a desirable property: they do not require pre-scheduling of the learning rate for convergence. On the other hand, they have inherent limitations inherited from the KF mechanism. It is widely accepted that the class of particle filters (PFs) remarkably outperforms the KF and its variants on nonlinear and/or non-Gaussian statistical filtering tasks. It is therefore natural to ask whether FSO methods can benefit from PF theory to circumvent the limitations of KF-type IPMs. We provide an affirmative answer to this question by developing three PF-based stochastic optimization (PFSO) algorithms. For performance evaluation, we apply them to a least-squares fitting problem on a simulated data set and to the empirical risk minimization (ERM) problem in binary classification on real data sets. Experimental results demonstrate that our algorithms remarkably outperform existing methods in terms of numerical stability, convergence speed, and flexibility in handling different types of loss functions.
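To give a concrete sense of the idea described above, the following is a minimal sketch (not the paper's actual algorithms) of how a particle filter can be repurposed as a stochastic optimizer for the least-squares fitting problem mentioned in the abstract. Particles represent candidate parameter vectors; each iteration jitters them (a state-transition step), weights them by a likelihood proportional to the exponentiated negative mini-batch loss, and resamples. All function names, the jitter schedule, and the mini-batch size are illustrative assumptions, not details from the paper.

```python
import numpy as np

def pfso_least_squares(X, y, n_particles=500, n_iters=50, step=0.1, seed=None):
    """Illustrative particle-filter-style stochastic optimizer (not the
    paper's method): jitter -> weight by mini-batch likelihood -> resample."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    particles = rng.standard_normal((n_particles, d))  # initial parameter guesses
    for t in range(n_iters):
        # State transition: add annealed Gaussian jitter to explore.
        particles += step * (0.95 ** t) * rng.standard_normal(particles.shape)
        # Weight particles by exponentiated negative mini-batch squared loss.
        idx = rng.choice(n, size=min(32, n), replace=False)
        resid = X[idx] @ particles.T - y[idx][:, None]   # (batch, n_particles)
        loss = np.mean(resid ** 2, axis=0)
        w = np.exp(-(loss - loss.min()))                 # stabilized weights
        w /= w.sum()
        # Resample particles in proportion to their weights.
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]
    return particles.mean(axis=0)
```

Note that, as the abstract emphasizes for KF-type FSO methods, no learning-rate schedule for a gradient step is required; the only tuning here is the jitter scale, which plays a different role than an SGD step size.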
