Sparse Representations of Positive Functions via First and Second-Order
Pseudo-Mirror Descent
We consider expected risk minimization when the range of the estimator is required to be nonnegative, motivated by the settings of maximum likelihood estimation (MLE) and trajectory optimization. To facilitate nonlinear interpolation, we hypothesize that the search is conducted over a Reproducing Kernel Hilbert Space (RKHS). To solve this problem, we develop first- and second-order variants of stochastic mirror descent employing (i) pseudo-gradients and (ii) complexity-reducing projections. Compressive projections in the first-order scheme are executed via kernel orthogonal matching pursuit (KOMP), which overcomes the fact that the vanilla RKHS parameterization grows unbounded with time. Moreover, pseudo-gradients are needed when stochastic estimates of the gradient of the expected cost are only computable up to some numerical error, as arises in, e.g., integral approximations. The second-order scheme develops a Hessian inverse approximation via recursively averaged pseudo-gradient outer products. For the first-order scheme, we establish tradeoffs between the accuracy of convergence in mean and the projection budget parameter under constant step-size and compression budget, as well as non-asymptotic bounds on the model complexity. Analogous convergence results are established for the second-order scheme under an additional eigenvalue decay condition on the Hessian of the optimal RKHS element. Experiments demonstrate favorable performance on inhomogeneous Poisson process intensity estimation in practice.
View on arXiv
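The paper's algorithms are not reproduced here, but the following minimal Python sketch illustrates the flavor of the first-order scheme: each stochastic pseudo-gradient step appends one kernel atom at the current sample, an exponential link stands in for a mirror map that keeps the estimate positive, and a crude norm-thresholding prune replaces KOMP to keep the dictionary size bounded. The class and function names, the Gaussian kernel, the link function, and the pruning rule are all illustrative assumptions, not the paper's specification.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points (illustrative choice)."""
    x, y = np.atleast_1d(x), np.atleast_1d(y)
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * bandwidth ** 2))

class FirstOrderPseudoMirrorDescentSketch:
    """Kernel expansion f(x) = sum_i w_i k(c_i, x); positivity is imposed
    through an exponential link, so the returned estimate is exp(f(x))."""

    def __init__(self, step_size=0.1, compression_budget=1e-3, bandwidth=1.0):
        self.centers = []            # retained kernel dictionary elements
        self.weights = []            # expansion coefficients
        self.eta = step_size         # constant step-size
        self.eps = compression_budget
        self.bw = bandwidth

    def evaluate(self, x):
        f = sum(w * gaussian_kernel(c, x, self.bw)
                for c, w in zip(self.centers, self.weights))
        return np.exp(f)             # nonnegative by construction

    def step(self, x_t, pseudo_grad):
        # Stochastic pseudo-gradient step: by the reproducing property, the
        # functional update adds one kernel atom centered at the sample.
        self.centers.append(np.atleast_1d(x_t))
        self.weights.append(-self.eta * pseudo_grad)
        self._compress()

    def _compress(self):
        # Simplified stand-in for KOMP: drop atoms whose individual RKHS-norm
        # contribution falls below the compression budget, so the model
        # complexity (dictionary size) stays bounded.
        kept = [(c, w) for c, w in zip(self.centers, self.weights)
                if abs(w) * np.sqrt(gaussian_kernel(c, c, self.bw)) > self.eps]
        self.centers = [c for c, _ in kept]
        self.weights = [w for _, w in kept]
```

As a usage pattern, one would stream samples, form a (possibly biased) pseudo-gradient of the instantaneous cost at each sample, and call `step`; the tradeoff described in the abstract then appears as a balance between `step_size`/`compression_budget` and the resulting accuracy and dictionary size.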