How Serious Is Omitted-Unobservable Bias in Causal Inference? Gauging It with the Observables
Email: econometrics666@sina.cn
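Today the causal inference study group introduces three papers. All three revolve around causal inference, and all are by Altonji, Elder, and coauthors. In policy evaluation and other regression settings, especially with cross-sectional data, there are usually unobservable variables that affect both the key explanatory variable X and the outcome variable Y. This biases the coefficient estimate: X is then correlated with the error term, so changes in Y cannot be attributed directly to changes in X.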
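To make the bias concrete, here is a small simulation sketch. It is not taken from the papers: the data-generating process, the function name `simulate_ovb`, and all parameter values are assumptions chosen for illustration. An unobserved variable U drives both X and Y, and the regression that omits U overstates the effect of X.

```python
import numpy as np

def simulate_ovb(n=100_000, beta=1.0, confounding=0.8, seed=0):
    """Simulate Y = beta*X + U + noise, where the unobserved U also drives X.

    Returns the OLS slope on X when U is omitted and when U is included.
    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                     # unobserved confounder
    x = confounding * u + rng.normal(size=n)   # X depends on U
    y = beta * x + u + rng.normal(size=n)      # Y depends on X and on U

    # Short regression: Y on X only (U sits in the error term).
    X_short = np.column_stack([np.ones(n), x])
    b_short = np.linalg.lstsq(X_short, y, rcond=None)[0][1]

    # Long regression: Y on X and U (infeasible in practice, U is unobserved).
    X_long = np.column_stack([np.ones(n), x, u])
    b_long = np.linalg.lstsq(X_long, y, rcond=None)[0][1]
    return b_short, b_long

b_short, b_long = simulate_ovb()
print(f"slope omitting U: {b_short:.3f}, slope controlling for U: {b_long:.3f}")
```

Under these assumed parameters the omitted-variable bias is Cov(X, U) / Var(X) = 0.8 / 1.64, roughly 0.49, so the short regression should give a slope near 1.49 while the true effect is 1.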
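The usual response is to look for proxy variables for those unobservables. We cannot control for everything, but controlling for as much as possible makes the conclusions more convincing to referees. The papers introduced today take a different angle: they offer methods for gauging how serious the bias from unobservables actually is. If that bias is small, or if further bias from unobservables is unlikely given the observables already controlled for, then the estimated effect of X on Y is closer to a causal relationship.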
The first paper begins by showing that the instrumental variables commonly used in this setting have serious problems.
Several previous studies have relied on religious affiliation and the proximity to Catholic schools as exogenous sources of variation for identifying the effect of Catholic schooling on a wide variety of outcomes. Using three separate approaches, we examine the validity of these instrumental variables. We find that none of the candidate instruments is a useful source of identification of the Catholic school effect, at least in currently available data sets.
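The second paper emphasizes that regressions on the observable variables can be used to assess how serious the estimation bias from unobservables is.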
In this paper we measure the effect of Catholic high school attendance on educational attainment and test scores. Because we do not have a good instrumental variable for Catholic school attendance, we develop new estimation methods based on the idea that the amount of selection on the observed explanatory variables in a model provides a guide to the amount of selection on the unobservables. We also propose an informal way to assess selectivity bias based on measuring the ratio of selection on unobservables to selection on observables that would be required if one is to attribute the entire effect of Catholic school attendance to selection bias. We use our methods to estimate the effect of attending a Catholic high school on a variety of outcomes. Our main conclusion is that Catholic high schools substantially increase the probability of graduating from high school and, more tentatively, attending college. We find little evidence of an effect on test scores.
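The informal ratio described in this abstract can be sketched roughly as follows. This is a simplified reading, not the authors' implementation: the function name `aet_ratio` is hypothetical, the index X'γ is taken from the unrestricted OLS regression for simplicity, and the paper's own estimator differs in its details. The idea is to compute the bias that would arise if the standardized selection on unobservables equaled the standardized selection on observables, then ask how many times larger the selection on unobservables would have to be to account for the whole estimated effect.

```python
import numpy as np

def aet_ratio(y, d, X):
    """Rough sketch of the informal selection ratio (hypothetical helper).

    y: outcome, d: 0/1 treatment as a float array, X: (n, k) observables.
    Returns (alpha_hat, implied_bias, ratio).
    """
    n = len(y)
    ones = np.ones(n)

    # Regress y on the treatment and the observables.
    Z = np.column_stack([ones, d, X])
    coef = np.linalg.lstsq(Z, y, rcond=None)[0]
    alpha_hat, gamma_hat = coef[1], coef[2:]
    resid = y - Z @ coef
    index = X @ gamma_hat                      # observable index X'gamma

    # Selection on observables: treated-control gap in the index,
    # scaled by the variance of the index.
    gap = index[d == 1].mean() - index[d == 0].mean()
    sel_obs = gap / index.var()

    # Residual of the treatment after partialling out the observables.
    W = np.column_stack([ones, X])
    d_tilde = d - W @ np.linalg.lstsq(W, d, rcond=None)[0]

    # Bias implied if selection on unobservables equals selection on observables.
    implied_bias = d.var() / d_tilde.var() * sel_obs * resid.var()
    return alpha_hat, implied_bias, alpha_hat / implied_bias
```

A ratio well above 1 would suggest that selection on unobservables has to be much stronger than selection on observables to explain away the estimate; a ratio below 1 means comparatively modest selection on unobservables would suffice.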
The third paper distills the test from the second paper into a general econometric method.
We develop new estimation methods for estimating causal effects based on the idea that the amount of selection on the observed explanatory variables in a model provides a guide to the amount of selection on the unobservables. We discuss two approaches, one of which involves the use of a factor model as a way to infer properties of unobserved covariates from the observed covariates. We construct an interval estimator that asymptotically covers the true value of the causal effect, and we propose related confidence regions that cover the true value with fixed probability.
The following QJE paper also uses the econometric method above.