Causal Inference with Invalid Instruments: Exploring Nonlinear Treatment Models with Machine Learning
- CML

We discuss causal inference for observational studies with possibly invalid instrumental variables. We propose a novel methodology called two-stage curvature identification (TSCI), which explores the nonlinear treatment model with machine learning and adjusts for different forms of violating the instrumental variable assumptions. The success of TSCI requires the instrumental variable's effect on treatment to differ from its violation form. A novel bias correction step is implemented to remove bias resulting from potentially high complexity of machine learning. Our proposed TSCI estimator is shown to be asymptotically unbiased and normal even if the machine learning algorithm does not consistently estimate the treatment model. We design a data-dependent method to choose the best among several candidate violation forms. We apply TSCI to study the effect of education on earnings.
View on arXiv