Policy Gradient Optimal Correlation Search for Variance Reduction in Monte Carlo simulation and Maximum Optimal Transport

Abstract

We propose a new algorithm for variance reduction when estimating $f(X_T)$, where $X$ is the solution to some stochastic differential equation and $f$ is a test function. The new estimator is $(f(X^1_T) + f(X^2_T))/2$, where $X^1$ and $X^2$ have the same marginal law as $X$ but are pathwise correlated so as to reduce the variance. The optimal correlation function $\rho$ is approximated by a deep neural network and is calibrated along the trajectories of $(X^1, X^2)$ by policy gradient and reinforcement learning techniques. Finding an optimal coupling given marginal laws has links with maximum optimal transport.
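To make the form of the estimator concrete, here is a minimal sketch, not the paper's method. It assumes a geometric Brownian motion for $X$, an Euler scheme, a call-type test function, and a hand-picked constant correlation standing in for the learned network $\rho$. All names and parameter values (`simulate_pair`, `pair_samples`, `mu`, `sigma`, the strike of 1.0) are hypothetical choices for illustration. The second Brownian increment is built as `r * z + sqrt(1 - r^2) * z_perp`, so each path keeps the marginal law of $X$ whatever the value of the correlation.

```python
import math
import random
import statistics


def simulate_pair(rho, rng, x0=1.0, mu=0.05, sigma=0.2, T=1.0, n_steps=50):
    """Euler scheme for two GBM paths driven by correlated Brownian increments.

    rho(t, x1, x2) returns a correlation in [-1, 1]; here it is a plain
    Python callable, a stand-in (assumption) for the neural network.
    """
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    x1 = x2 = x0
    for k in range(n_steps):
        r = rho(k * dt, x1, x2)
        z = rng.gauss(0.0, 1.0)
        z_perp = rng.gauss(0.0, 1.0)
        dw1 = sqrt_dt * z
        # dw2 is N(0, dt) for any r, so X^2 keeps the same marginal law as X^1.
        dw2 = sqrt_dt * (r * z + math.sqrt(max(0.0, 1.0 - r * r)) * z_perp)
        x1 += mu * x1 * dt + sigma * x1 * dw1
        x2 += mu * x2 * dt + sigma * x2 * dw2
    return x1, x2


def pair_samples(rho, f, n_pairs, seed):
    """Draw n_pairs samples of the pair estimator (f(X^1_T) + f(X^2_T)) / 2."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_pairs):
        x1, x2 = simulate_pair(rho, rng)
        samples.append(0.5 * (f(x1) + f(x2)))
    return samples


def call_payoff(x):
    # Hypothetical test function f: a call payoff with strike 1.0.
    return max(x - 1.0, 0.0)


# Two fixed correlation choices: independent paths versus antithetic paths.
independent = pair_samples(lambda t, a, b: 0.0, call_payoff, n_pairs=2000, seed=0)
antithetic = pair_samples(lambda t, a, b: -1.0, call_payoff, n_pairs=2000, seed=0)

var_independent = statistics.variance(independent)
var_antithetic = statistics.variance(antithetic)
print("variance, rho = 0 :", var_independent)
print("variance, rho = -1:", var_antithetic)
```

The constant choice $\rho = -1$ is the classical antithetic coupling, which helps here because the payoff is monotone in the driving noise. The abstract's proposal is to replace the constant by a state-dependent function learned along the trajectories, which this sketch does not attempt.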
