Some pitfalls in causal inference with spatial observations
Abstract:
The strong trends and autocorrelation that characterize spatial data can make causal inference techniques prone to spurious significance, regardless of the validity of their identification strategy. We demonstrate this using two diagnostics based on spatial noise: a placebo test that gives both a treatment significance level and a Monte Carlo test of the validity of the inference procedure, and a synthetic outcome test of the null that the outcome is noise independent of the treatment. While the current practice of clustering longitudinal data by unit leads to inflated t-statistics, accurate results are easily obtained by adding a spatiotemporal basis that optimally controls for long-range correlations and unit-time interactions, and by applying large-cluster inference. Similar procedures apply to instrumental variables, where the noise diagnostics for the first-stage regression offer a direct test of instrument strength. The problems of spatial regression discontinuities appear, however, to be intractable. Some benchmark studies are analyzed.
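
The spurious-significance problem can be illustrated with a toy Monte Carlo. The sketch below is not the procedure of this paper: the exponential covariance, the length scale of 0.5, the random site locations, and the helper names `spatial_noise` and `ols_t_stats` are all assumptions chosen for illustration. It regresses one simulated spatial noise field on another, independent one and records how often a conventional OLS t-test rejects at the nominal 5% level, compared with the same exercise on independent (non-spatial) noise.

```python
import numpy as np


def spatial_noise(coords, length_scale, n_draws, rng):
    """Draw Gaussian fields with covariance exp(-distance / length_scale).

    Assumption: an exponential kernel stands in for "spatial noise" here;
    it is an illustrative choice, not the paper's noise model.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Small diagonal jitter keeps the Cholesky factorization stable.
    cov = np.exp(-dist / length_scale) + 1e-8 * np.eye(len(coords))
    chol = np.linalg.cholesky(cov)
    return chol @ rng.standard_normal((len(coords), n_draws))


def ols_t_stats(x, y):
    """Slope t-statistics for y ~ 1 + x, one regression per column.

    Uses classical (i.i.d.) standard errors, which ignore spatial dependence.
    """
    n = x.shape[0]
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    sxx = (xc ** 2).sum(axis=0)
    beta = (xc * yc).sum(axis=0) / sxx
    resid = yc - beta * xc
    sigma2 = (resid ** 2).sum(axis=0) / (n - 2)
    return beta / np.sqrt(sigma2 / sxx)


rng = np.random.default_rng(0)
n_sites, n_sims = 200, 500
coords = rng.uniform(size=(n_sites, 2))  # hypothetical sites in the unit square

# Outcome and placebo "treatment" are independent spatial noise fields,
# so the true effect is zero in every simulation.
outcome = spatial_noise(coords, 0.5, n_sims, rng)
placebo = spatial_noise(coords, 0.5, n_sims, rng)
rate_spatial = np.mean(np.abs(ols_t_stats(placebo, outcome)) > 1.96)

# Benchmark: the same test on independent, non-spatial noise.
iid_outcome = rng.standard_normal((n_sites, n_sims))
iid_placebo = rng.standard_normal((n_sites, n_sims))
rate_iid = np.mean(np.abs(ols_t_stats(iid_placebo, iid_outcome)) > 1.96)

print(f"rejection rate, spatial noise: {rate_spatial:.2f}")
print(f"rejection rate, i.i.d. noise:  {rate_iid:.2f}")
```

If the inference procedure were valid, both rejection rates would sit near 0.05. In this toy setting, a rate well above that for the spatial fields reflects the autocorrelation alone, since the two fields are independent by construction.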