On the Parameterized Complexity of Polytree Learning
- TPM
A Bayesian network is a directed acyclic graph that represents statistical dependencies between variables of a joint probability distribution. A fundamental task in data science is to learn a Bayesian network from observed data. \textsc{Polytree Learning} is the problem of learning an optimal Bayesian network that fulfills the additional property that its underlying undirected graph is a forest. In this work, we revisit the complexity of \textsc{Polytree Learning}. We show that \textsc{Polytree Learning} can be solved in $3^n \cdot |I|^{\mathcal{O}(1)}$ time, where $n$ is the number of variables and $|I|$ is the total instance size. Moreover, we consider the influence of the number of variables $d$ that might receive a nonempty parent set in the final DAG on the complexity of \textsc{Polytree Learning}. We show that \textsc{Polytree Learning} has no $f(d) \cdot |I|^{\mathcal{O}(1)}$-time algorithm, unlike Bayesian network learning, which can be solved in $2^d \cdot |I|^{\mathcal{O}(1)}$ time. We show that, in contrast, if $d$ and the maximum parent set size are bounded, then we can obtain efficient algorithms.
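To make the structural constraint concrete, the following is a minimal sketch, not taken from the paper, of a check that a given arc set forms a polytree. The function name `is_polytree` and the encoding of vertices as integers `0..num_vertices-1` are assumptions made for illustration. The check uses a union-find structure on the underlying undirected graph: an arc set is accepted exactly when no arc joins two vertices that are already connected, which also rules out directed cycles.

```python
def is_polytree(num_vertices, arcs):
    """Return True if the underlying undirected graph of `arcs` is a forest.

    `arcs` is an iterable of (u, v) pairs meaning u -> v. This encoding is
    an illustrative assumption, not the paper's notation.
    """
    # Union-find parent pointers: every vertex starts in its own component.
    parent = list(range(num_vertices))

    def find(x):
        # Follow parent pointers to the root, halving paths along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in arcs:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            # u and v are already connected when arc directions are ignored,
            # so this arc would close an undirected cycle.
            return False
        parent[root_u] = root_v  # merge the two components
    return True
```

For example, the arcs `(0, 2)` and `(1, 2)` form a polytree, whereas adding `(0, 1)` to the path `(0, 1), (1, 2)` plus `(0, 2)` yields a DAG that is not a polytree, because the underlying undirected graph contains a triangle.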