Policy Gradient Approach to Compilation of Variational Quantum Circuits

Quantum (Quantum), 2021

19 November 2021

David A. Herrera-Martí


Main: 7 pages. Figures: 6. Bibliography: 4 pages. Appendix: 5 pages.

Abstract

We propose a method for finding approximate compilations of quantum circuits, based on techniques from policy gradient reinforcement learning. The choice of a stochastic policy allows us to rephrase the optimization problem in terms of probability distributions, rather than variational parameters. This implies that searching for the optimal configuration is done by optimizing over the distribution parameters, rather than over the circuit free angles. The upshot of this is that we can always compute a gradient, provided that the policy is differentiable. We show numerically that this approach is more competitive than those using gradient-free methods, even in the presence of depolarizing noise, and argue analytically why this is the case. Another interesting feature of this approach to variational compilation is that it does not need a separate register and long-range interactions to estimate the end-point fidelity. We expect these techniques to be relevant for training variational circuits in other contexts.
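The idea of optimizing over distribution parameters rather than circuit angles can be illustrated with a toy sketch. The example below is not the author's implementation: it assumes a single qubit, a two-angle ansatz, a Gaussian policy with fixed width `sigma`, and the standard score-function (REINFORCE) gradient estimator with a mean-reward baseline. The reward is assumed to be the probability of returning to |0> after applying the target followed by the inverse of the sampled circuit, computed here exactly with NumPy rather than estimated from measurement shots. All function names and hyperparameters are hypothetical.

```python
import numpy as np

def ry(t):
    # Single-qubit rotation about the Y axis.
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(t):
    # Single-qubit rotation about the Z axis.
    return np.array([[np.exp(-1j * t / 2), 0], [0, np.exp(1j * t / 2)]])

def circuit(theta):
    # Hypothetical two-angle ansatz: Ry followed by Rz.
    return rz(theta[1]) @ ry(theta[0])

def reward(theta, target):
    # Probability of measuring |0> after applying target, then circuit^dagger.
    amp = (circuit(theta).conj().T @ target)[0, 0]
    return float(np.abs(amp) ** 2)

def train(target, steps=300, batch=64, sigma=0.3, lr=0.2, seed=0):
    rng = np.random.default_rng(seed)
    mu = np.zeros(2)  # policy parameters: mean of the Gaussian over angles
    for _ in range(steps):
        # Sample circuit angles from the policy N(mu, sigma^2 I).
        thetas = mu + sigma * rng.standard_normal((batch, 2))
        rewards = np.array([reward(t, target) for t in thetas])
        # Score-function estimate of d E[reward] / d mu, with the batch
        # mean as a variance-reducing baseline. Only reward values are
        # needed, never a derivative of the circuit itself.
        advantage = rewards - rewards.mean()
        grad = (advantage[:, None] * (thetas - mu)).mean(axis=0) / sigma**2
        mu = mu + lr * grad  # gradient ascent on the expected reward
    return mu

# Target generated from the same ansatz, so an exact compilation exists.
target = circuit(np.array([1.0, 0.5]))
mu = train(target)
final_fidelity = reward(mu, target)
print("learned mean angles:", mu)
print("fidelity at the mean:", final_fidelity)
```

The update only differentiates the log-density of the Gaussian policy, which is why a gradient is available whenever the policy is differentiable, regardless of how the reward depends on the angles. Keeping `sigma` fixed is a simplification made for this sketch; a learned width is a common alternative.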
