Efficient Distance Approximation for Structured High-Dimensional Distributions via Learning

Abstract

We design efficient distance approximation algorithms for several classes of structured high-dimensional distributions. Specifically, we show algorithms for the following problems:
- Given sample access to two Bayesian networks $P_1$ and $P_2$ over known directed acyclic graphs $G_1$ and $G_2$ having $n$ nodes and bounded in-degree, approximate $d_{tv}(P_1,P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time.
- Given sample access to two ferromagnetic Ising models $P_1$ and $P_2$ on $n$ variables with bounded width, approximate $d_{tv}(P_1,P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time.
- Given sample access to two $n$-dimensional Gaussians $P_1$ and $P_2$, approximate $d_{tv}(P_1,P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time.
- Given access to observations from two causal models $P$ and $Q$ on $n$ variables that are defined over known causal graphs, approximate $d_{tv}(P_a,Q_a)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples, where $P_a$ and $Q_a$ are the interventional distributions obtained by the intervention $do(A=a)$ on $P$ and $Q$ respectively for a particular variable $A$.

Our results are the first efficient distance approximation algorithms for these well-studied problems. They are derived using a simple and general connection to distribution learning algorithms. The distance approximation algorithms imply new efficient algorithms for {\em tolerant} testing of closeness of the above-mentioned structured high-dimensional distributions.
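The abstract does not spell out how learning leads to distance approximation, so the following is only an illustrative sketch under stated assumptions, not the paper's algorithm. It assumes that two models with efficiently computable probability mass functions are already available (for example, as the output of some learning step) and estimates $d_{tv}(P,Q)$ by Monte Carlo using the standard identity $d_{tv}(P,Q) = E_{x \sim P}[\max(0, 1 - Q(x)/P(x))]$. Product distributions over $\{0,1\}^n$ (Bayesian networks with an empty graph) stand in for the models, and all function names are hypothetical.

```python
import itertools
import random


def product_pmf(probs, x):
    """Probability of bit-vector x when coordinate i is Bernoulli(probs[i])."""
    result = 1.0
    for p, bit in zip(probs, x):
        result *= p if bit else 1.0 - p
    return result


def sample_product(probs, rng):
    """Draw one sample from the product distribution defined by probs."""
    return tuple(1 if rng.random() < p else 0 for p in probs)


def estimate_tv(probs_p, probs_q, num_samples, rng):
    """Monte Carlo estimate of d_tv(P, Q) = E_{x~P}[max(0, 1 - Q(x)/P(x))].

    Assumption: both pmfs can be evaluated exactly. Every sampled x has
    P(x) > 0, so the ratio is well defined.
    """
    total = 0.0
    for _ in range(num_samples):
        x = sample_product(probs_p, rng)
        ratio = product_pmf(probs_q, x) / product_pmf(probs_p, x)
        total += max(0.0, 1.0 - ratio)
    return total / num_samples


def exact_tv(probs_p, probs_q):
    """Brute-force d_tv over all 2^n points; feasible only for small n."""
    n = len(probs_p)
    return 0.5 * sum(
        abs(product_pmf(probs_p, x) - product_pmf(probs_q, x))
        for x in itertools.product((0, 1), repeat=n)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    p = [0.2, 0.5, 0.7, 0.9]
    q = [0.3, 0.5, 0.6, 0.8]
    print("estimate:", estimate_tv(p, q, 20000, rng))
    print("exact:   ", exact_tv(p, q))
```

Each summand lies in $[0,1]$, so by a Hoeffding bound on the order of $1/\epsilon^2$ samples suffice for additive error $\epsilon$ in this idealized setting where the pmfs are known exactly. The brute-force `exact_tv` is included only as a sanity check for small $n$.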
