
Causal Discovery Using Proxy Variables

Abstract

Discovering causal relations is fundamental to reasoning and intelligence. In particular, observational causal discovery algorithms estimate the cause-effect relation between two random entities X and Y, given n samples from P(X, Y). In this paper, we develop a framework to estimate the cause-effect relation between two static entities x and y: for instance, an art masterpiece x and its fraudulent copy y. To this end, we introduce the notion of proxy variables, which allow the construction of a pair of random entities (A, B) from the pair of static entities (x, y). Then, estimating the cause-effect relation between A and B using an observational causal discovery algorithm leads to an estimation of the cause-effect relation between x and y. For example, our framework detects the causal relation between unprocessed photographs and their modifications, and orders in time a set of shuffled frames from a video. As our main case study, we introduce a human-elicited dataset of 10,000 causally linked pairs of words from natural language. Our methods discover 75% of these causal relations. Finally, we discuss the role of proxy variables in machine learning, as a general tool to incorporate static knowledge into prediction tasks.
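
The construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes that x, y, and each proxy sample w_j are vectors of the same dimension, and that the random pair is obtained by the projections a_j = <w_j, x> and b_j = <w_j, y>. The names `proxy_samples`, `estimate_direction`, and `score` are hypothetical, and `score` stands in for any observational causal discovery algorithm that returns a positive value when it favors A -> B.

```python
import numpy as np


def proxy_samples(x, y, proxies):
    """Turn a static pair (x, y) into n samples of a random pair (A, B).

    Assumption: x, y, and every row w_j of `proxies` are d-dimensional
    vectors, and the projection onto w_j is the chosen proxy map.
    """
    proxies = np.asarray(proxies, dtype=float)   # shape (n, d)
    a = proxies @ np.asarray(x, dtype=float)     # a_j = <w_j, x>
    b = proxies @ np.asarray(y, dtype=float)     # b_j = <w_j, y>
    return a, b


def estimate_direction(x, y, proxies, score):
    """Apply a causal discovery score to the proxy samples.

    `score(a, b)` is a hypothetical callable: a positive value is read
    as evidence for A -> B, anything else as evidence for B -> A.
    """
    a, b = proxy_samples(x, y, proxies)
    return "x -> y" if score(a, b) > 0 else "y -> x"


if __name__ == "__main__":
    x = [1.0, 2.0]
    y = [3.0, 4.0]
    proxies = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    a, b = proxy_samples(x, y, proxies)
    print(a, b)
    # Placeholder score with no causal meaning; it only shows the plumbing.
    print(estimate_direction(x, y, proxies, score=lambda a, b: 1.0))
```

The score is passed in as a callable so that the proxy construction stays separate from whichever causal discovery algorithm is applied to the resulting samples of (A, B).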
