arXiv:2211.11003

Unadjusted Hamiltonian MCMC with Stratified Monte Carlo Time Integration

20 November 2022
Nawaf Bou-Rabee
Milo Marsden
Abstract

A novel randomized time integrator is suggested for unadjusted Hamiltonian Monte Carlo (uHMC) in place of the usual Verlet integrator; namely, a stratified Monte Carlo (sMC) integrator which involves a minor modification to Verlet, and hence, is easy to implement. For target distributions of the form $\mu(dx) \propto e^{-U(x)} dx$ where $U: \mathbb{R}^d \to \mathbb{R}_{\ge 0}$ is both $K$-strongly convex and $L$-gradient Lipschitz, and initial distributions $\nu$ with finite second moment, coupling proofs reveal that an $\varepsilon$-accurate approximation of the target distribution $\mu$ in $L^2$-Wasserstein distance $\boldsymbol{\mathcal{W}}^2$ can be achieved by the uHMC algorithm with sMC time integration using $O\left((d/K)^{1/3} (L/K)^{5/3} \varepsilon^{-2/3} \log( \boldsymbol{\mathcal{W}}^2(\mu, \nu) / \varepsilon)^+\right)$ gradient evaluations; whereas without additional assumptions the corresponding complexity of the uHMC algorithm with Verlet time integration is in general $O\left((d/K)^{1/2} (L/K)^2 \varepsilon^{-1} \log( \boldsymbol{\mathcal{W}}^2(\mu, \nu) / \varepsilon)^+ \right)$. Duration randomization, which has a similar effect as partial momentum refreshment, is also treated. In this case, without additional assumptions on the target distribution, the complexity of duration-randomized uHMC with sMC time integration improves to $O\left(\max\left((d/K)^{1/4} (L/K)^{3/2} \varepsilon^{-1/2},(d/K)^{1/3} (L/K)^{4/3} \varepsilon^{-2/3} \right) \right)$ up to logarithmic factors. The improvement due to duration randomization turns out to be analogous to that of time integrator randomization.
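
To get a feel for how the three bounds in the abstract scale, the following minimal Python sketch evaluates only their leading factors. It is an illustration, not part of the paper: the function names are hypothetical, and the hidden constants in the $O(\cdot)$ notation and all logarithmic factors are dropped, so the numbers are meaningful only relative to one another.

```python
import math


def smc_bound(d, K, L, eps):
    """Leading factor of the uHMC + sMC bound: (d/K)^(1/3) (L/K)^(5/3) eps^(-2/3).

    Assumption: constants and the log factor from the abstract are omitted.
    """
    return (d / K) ** (1 / 3) * (L / K) ** (5 / 3) * eps ** (-2 / 3)


def verlet_bound(d, K, L, eps):
    """Leading factor of the uHMC + Verlet bound: (d/K)^(1/2) (L/K)^2 eps^(-1)."""
    return (d / K) ** (1 / 2) * (L / K) ** 2 * eps ** (-1)


def duration_randomized_smc_bound(d, K, L, eps):
    """Leading factor of the duration-randomized uHMC + sMC bound (a max of two terms)."""
    first = (d / K) ** (1 / 4) * (L / K) ** (3 / 2) * eps ** (-1 / 2)
    second = (d / K) ** (1 / 3) * (L / K) ** (4 / 3) * eps ** (-2 / 3)
    return max(first, second)


if __name__ == "__main__":
    # Hypothetical problem parameters chosen only for illustration.
    d, K, L, eps = 1000, 1.0, 10.0, 0.01
    print("Verlet:                 ", verlet_bound(d, K, L, eps))
    print("sMC:                    ", smc_bound(d, K, L, eps))
    print("sMC + random duration:  ", duration_randomized_smc_bound(d, K, L, eps))
```

Dividing the Verlet factor by the sMC factor gives $(d/K)^{1/6} (L/K)^{1/3} \varepsilon^{-1/3}$, so under these simplifications the gap between the two bounds widens as the dimension, the condition number $L/K$, or the required accuracy increases.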
