Coupling without Communication and Drafter-Invariant Speculative Decoding

Abstract

Suppose Alice has a distribution $P$ and Bob has a distribution $Q$. Alice wants to draw a sample $a \sim P$ and Bob a sample $b \sim Q$ such that $a = b$ with as high a probability as possible. It is well known that, by sampling from an optimal coupling between the distributions, Alice and Bob can achieve $\Pr[a = b] = 1 - D_{TV}(P,Q)$, where $D_{TV}(P,Q)$ is the total variation distance between $P$ and $Q$. What if Alice and Bob must solve this same problem \emph{without communicating at all}? Perhaps surprisingly, with access to public randomness, they can still achieve $\Pr[a = b] \geq \frac{1 - D_{TV}(P,Q)}{1 + D_{TV}(P,Q)} \geq 1 - 2D_{TV}(P,Q)$ using a simple protocol based on the Weighted MinHash algorithm. This bound was shown to be optimal in the worst case by [Bavarian et al., 2020]. In this work, we revisit the communication-free coupling problem. We provide a simpler proof of the optimality result from [Bavarian et al., 2020]. We show that, while the worst-case success probability of Weighted MinHash cannot be improved, an equally simple protocol based on Gumbel sampling offers a Pareto improvement: for every pair of distributions $P, Q$, Gumbel sampling achieves an equal or higher value of $\Pr[a = b]$ than Weighted MinHash. Importantly, this improvement translates to practice. We demonstrate an application of communication-free coupling to \emph{speculative decoding}, a recent method for accelerating autoregressive large language models [Leviathan, Kalman, Matias, ICML 2023]. We show that communication-free protocols can be used to construct \emph{drafter-invariant speculative decoding} schemes, which have the desirable property that their output is fixed given a fixed random seed, regardless of which drafter is used for speculation. In experiments on a language generation task, Gumbel sampling outperforms Weighted MinHash. Code is available at this https URL.
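
To make the Gumbel sampling protocol concrete, here is a minimal sketch in Python with NumPy. It is not the authors' released code: the function names (`gumbel_coupled_sample`, `total_variation`, `match_rate`) and the example distributions are assumptions chosen for illustration. Alice and Bob each apply the standard Gumbel-max trick to their own distribution, but both read the same public uniform random numbers, so no message is ever exchanged.

```python
import numpy as np


def gumbel_coupled_sample(probs, shared_uniforms):
    """Return argmax_i (log probs[i] + G_i), with G_i = -log(-log(u_i)).

    By the Gumbel-max trick the returned index is distributed according
    to `probs`. The noise comes only from `shared_uniforms`, which is the
    public randomness visible to both parties.
    """
    probs = np.asarray(probs, dtype=float)
    u = np.asarray(shared_uniforms, dtype=float)
    with np.errstate(divide="ignore"):  # log(0) = -inf is acceptable here
        scores = np.log(probs) - np.log(-np.log(u))
    return int(np.argmax(scores))


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * float(np.abs(np.asarray(p, float) - np.asarray(q, float)).sum())


def match_rate(p, q, trials=20000, seed=0):
    """Empirical estimate of Pr[a = b] under the shared-randomness protocol."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        u = rng.random(len(p))            # public randomness for this round
        a = gumbel_coupled_sample(p, u)   # Alice uses only P and u
        b = gumbel_coupled_sample(q, u)   # Bob uses only Q and u
        hits += (a == b)
    return hits / trials


if __name__ == "__main__":
    # Hypothetical example distributions over three outcomes.
    P = [0.5, 0.3, 0.2]
    Q = [0.4, 0.4, 0.2]
    d = total_variation(P, Q)
    print("empirical Pr[a = b]:", match_rate(P, Q))
    print("communication-free bound (1 - d) / (1 + d):", (1 - d) / (1 + d))
    print("optimal coupling 1 - d:", 1 - d)
```

In this sketch the empirical agreement rate should fall between the communication-free guarantee $\frac{1 - D_{TV}(P,Q)}{1 + D_{TV}(P,Q)}$ and the optimal-coupling value $1 - D_{TV}(P,Q)$, up to sampling noise. Fixing `seed` fixes the public randomness, which is the same mechanism that makes the output of a drafter-invariant scheme depend only on the seed and the target distribution.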
