Efficient Computation of Expectations under Spanning Tree Distributions

Transactions of the Association for Computational Linguistics (TACL), 2020

29 August 2020

Abstract

We propose a general framework for computing expectations in edge-factored, non-projective spanning-tree models. Our algorithms exploit a fundamental connection between gradients and expectations, which allows us to derive efficient algorithms. We motivate the development of our framework with several cautionary tales of previous research, which has developed numerous inefficient algorithms for computing expectations and their gradients. We demonstrate how our framework efficiently computes several quantities with known algorithms, including the Shannon entropy, the expected attachment score, and the generalized expectation criterion. As a bonus, we give algorithms for quantities that are missing in the literature, including the gradient of entropy, the KL divergence, and the gradient of the KL divergence. In all cases, our approach matches the efficiency of existing algorithms and, in several cases, reduces the runtime complexity by a factor of the sentence length. We validate our framework through rigorous proofs of correctness and efficiency.
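To make the connection between gradients and expectations concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a multi-root, edge-factored model over trees rooted at node 0, with `w[i][j]` the non-negative weight of the edge from head `i` to dependent `j`. The normalizer Z is taken from the Matrix-Tree theorem, and the marginal probability of an edge (the expectation of its indicator) is recovered as `w[i][j] * dZ/dw[i][j] / Z`. All function names are hypothetical. The derivative is computed by a deliberately naive one-determinant-per-edge difference, chosen only so that the identity can be checked exactly against brute-force enumeration.

```python
from fractions import Fraction
from itertools import product


def det(matrix):
    """Determinant by Gaussian elimination; exact when entries are Fractions."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result  # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def partition(w):
    """Total weight Z of all spanning trees rooted at node 0 (Matrix-Tree theorem)."""
    n = len(w)
    # In-degree Laplacian with the root's row and column removed.
    lap = [[Fraction(0)] * (n - 1) for _ in range(n - 1)]
    for j in range(1, n):
        for i in range(n):
            if i == j:
                continue
            lap[j - 1][j - 1] += w[i][j]
            if i != 0:
                lap[i - 1][j - 1] -= w[i][j]
    return det(lap)


def edge_marginals(w):
    """Edge marginals via the gradient identity w_ij * dZ/dw_ij / Z."""
    n = len(w)
    z = partition(w)
    marginals = {}
    for i in range(n):
        for j in range(1, n):
            if i == j:
                continue
            bumped = [row[:] for row in w]
            bumped[i][j] += 1
            # Z is linear in each single weight, so this difference
            # is the exact partial derivative, not an approximation.
            dz = partition(bumped) - z
            marginals[(i, j)] = w[i][j] * dz / z
    return marginals


def brute_force_marginals(w):
    """Reference answer: enumerate every head assignment and keep the trees."""
    n = len(w)
    mass = {(i, j): Fraction(0) for i in range(n) for j in range(1, n) if i != j}
    total = Fraction(0)
    for heads in product(range(n), repeat=n - 1):
        head = dict(zip(range(1, n), heads))  # head[j] is the parent of j
        if not _is_tree(head, n):
            continue
        weight = Fraction(1)
        for j, i in head.items():
            weight *= w[i][j]
        total += weight
        for j, i in head.items():
            mass[(i, j)] += weight
    return {edge: m / total for edge, m in mass.items()}


def _is_tree(head, n):
    """True if every node reaches the root by following heads (no cycles)."""
    for start in head:
        node, steps = start, 0
        while node != 0:
            node = head[node]
            steps += 1
            if steps > n:
                return False
    return True
```

Calling `edge_marginals(w)` and `brute_force_marginals(w)` on the same small weight matrix should give identical dictionaries. The sketch spends one determinant per edge, which is exactly the kind of cost an efficient gradient computation avoids.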
