Agglomerative Hierarchical Clustering for Selecting Valid Instrumental Variables
We propose a novel procedure that combines agglomerative hierarchical clustering with a test of overidentifying restrictions to select valid instrumental variables (IVs) from a large set of candidate IVs. Some of these IVs may be invalid in the sense that they fail the exclusion restriction. We show that if the largest group of IVs is valid, our method achieves oracle properties. Compared with existing methods, our method can deal with weak instruments, multiple endogenous regressors and heterogeneous treatment effects. In simulations, we show that our method outperforms the two closest methods, the Hard Thresholding method (HT) and the Confidence Interval method (CIM), when all the instruments are strong. Our method also works well when some of the candidate instruments are weak, again outperforming HT and CIM. We apply our method to the estimation of the effect of immigration on wages in the US.
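The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading, for a single endogenous regressor. It assumes that IV-specific estimates are formed as ratios of reduced-form to first-stage coefficients, that these estimates are clustered with Ward linkage, and that the largest cluster is tested with a Sargan statistic while the remaining candidates enter as controls. The function names, the simulated data, and the default significance level `0.1 / log(n)` are all illustrative assumptions, not details taken from the text.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import chi2


def sargan_pvalue(y, d, Z, selected):
    """Sargan test treating Z[:, selected] as IVs and the rest as controls."""
    n = Z.shape[0]
    df = int(selected.sum()) - 1
    if df < 1:
        return 1.0  # just identified: nothing to test
    X = np.column_stack([d, Z[:, ~selected]])
    # 2SLS: project the regressors on all candidates, then regress y on the fit.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    coef = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    u = y - X @ coef
    u_fit = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]
    stat = n * (u_fit @ u_fit) / (u @ u)  # n times uncentered R^2
    return chi2.sf(stat, df)


def select_valid_ivs(y, d, Z, alpha=None):
    """Return a boolean mask of the candidate IVs judged valid (a sketch)."""
    n, L = Z.shape
    if alpha is None:
        alpha = 0.1 / np.log(n)  # assumed default, not from the abstract
    # IV-specific estimates: reduced-form over first-stage coefficients.
    Gamma = np.linalg.lstsq(Z, y, rcond=None)[0]
    gamma = np.linalg.lstsq(Z, d, rcond=None)[0]
    beta_j = Gamma / gamma
    tree = linkage(beta_j.reshape(-1, 1), method="ward")
    selected = np.ones(L, dtype=bool)
    # Walk down the dendrogram: 1 cluster, 2 clusters, ...
    for k in range(1, L + 1):
        labels = fcluster(tree, t=k, criterion="maxclust")
        largest = np.bincount(labels).argmax()
        selected = labels == largest
        if sargan_pvalue(y, d, Z, selected) > alpha:
            break  # largest cluster passes the overidentification test
    return selected


# Simulated example (all values hypothetical): 6 valid and 4 invalid IVs.
rng = np.random.default_rng(0)
n, L, beta = 5000, 10, 0.5
direct = np.array([0.0] * 6 + [1.0, 1.0, 2.0, 2.0])  # exclusion violations
Z = rng.standard_normal((n, L))
v = rng.standard_normal(n)
u = 0.5 * v + rng.standard_normal(n)  # endogeneity via shared error
d = Z @ np.ones(L) + v
y = beta * d + Z @ direct + u

print(select_valid_ivs(y, d, Z))
```

In this simulated design the valid IVs form the largest group, which is the condition the abstract ties to oracle properties. Weak instruments, multiple endogenous regressors and heterogeneous treatment effects are not handled by this sketch.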