A Common Type of Wrong Instrumental Variable: Group Averages

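In journals both in China and abroad, we often see authors address endogeneity by using the group mean $\bar{X}_{i,-j}$ (computed excluding individual $j$) as an instrumental variable for the variable $X_{i,j}$:

$$\bar{X}_{i,-j} = \frac{1}{n_i - 1} \sum_{k \neq j} X_{i,k}$$

where $i$ indexes the group and $n_i$ is the number of observations in group $i$.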
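The rationale usually offered is twofold: (1) an individual's characteristic is influenced by the average or aggregate characteristic of the other individuals in the same group, so the instrument satisfies the relevance condition; (2) the average or aggregate characteristic of the other individuals in the group does not directly affect the individual's outcome, so the instrument satisfies the exogeneity condition.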
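The leave-one-out group mean described above can be sketched in a few lines. This is a minimal illustration in plain Python; the function name `leave_one_out_group_mean` and the toy ROA data are hypothetical and do not come from any of the papers discussed.

```python
from collections import defaultdict


def leave_one_out_group_mean(values, groups):
    """Return, for each observation, the mean of its group excluding itself.

    Observations that are alone in their group get None, because the
    leave-one-out mean is undefined there.
    """
    totals = defaultdict(float)  # group -> sum of values
    counts = defaultdict(int)    # group -> number of observations
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1

    result = []
    for v, g in zip(values, groups):
        n = counts[g]
        # Subtract the observation's own value before averaging.
        result.append((totals[g] - v) / (n - 1) if n > 1 else None)
    return result


# Hypothetical toy data: ROA for five firms in two industries.
roa = [0.10, 0.20, 0.30, 0.05, 0.15]
industry = ["A", "A", "A", "B", "B"]
instrument = leave_one_out_group_mean(roa, industry)
```

Each firm's instrument value depends only on the other firms in its industry, which is what makes the construction look exogenous at first glance.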
Two examples:

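This paper uses the mean net positive tone of the social responsibility reports of the other firms in the same industry and year, TONE_mean_t, as an instrumental variable in a 2SLS regression to address endogeneity. The regression results are reported in Table 5. This instrument satisfies the relevance and exogeneity requirements. On relevance, firms in the same industry face similar external environments and industry characteristics, so the tones of their social responsibility reports are correlated to some degree; moreover, column (1) of Table 5 shows that the coefficient on the industry mean of report tone, TONE_mean_t, is positive and significant at the 1% level, so the relevance condition is met. In addition, there is no evidence that the report tone of other firms in the industry affects the firm's own stock price crash risk, so the exogeneity condition is met.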

(Excerpted from a paper in 《审计与经济研究》, Journal of Audit & Economics)

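Firms in the same city and industry may compete for executive talent in the local managerial labor market. Whether a firm offers its management an incentive compensation plan depends on the incentive pay offered by its local competitors, while the incentive pay offered by competitors should not directly affect the firm's own innovation. Following Fisman & Svensson (2007), we use the region-industry averages of CEO shareholding (CEO Share), incentive pay (Incentive), profit-based incentives (Profit Inc), and sales-based incentives (Sales Inc) as instruments for the corresponding variables.
(Excerpted from a paper in 《经济研究》, Economic Research Journal)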
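Instruments of this kind are everywhere, and they are so widely circulated that beginners copy them in turn. The justifications sound plausible, but they do not hold up and would only persuade a non-specialist. The question was settled some time ago: see Gormley and Matsa (2014), "Common Errors: How to (and Not to) Control for Unobserved Heterogeneity," published in The Review of Financial Studies (RFS). A few passages from the paper are excerpted below for reference.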

Using independent variable group averages as instrumental variables. Independent variables’ group averages are also sometimes used as instrumental variables. Specifically, the researcher instruments for a potentially endogenous regressor, $X_{i,j}$, using the regressor’s group average, $\bar{X}_{i,-j}$, calculated excluding the observation at hand. The typical justification for such instruments is that the group average of X is correlated with $X_{i,j}$ but is not otherwise related to the dependent variable, $y_{i,j}$. For example, a researcher estimating the impact of ROA on leverage but concerned that financial constraints introduce a simultaneity bias might propose using industry ROA to instrument for firm ROA.

Using group averages of the independent variables as instrumental variables, however, leads to inconsistent estimates in the presence of unobserved group-level heterogeneity, as in Equation (1). The instrument violates the exclusion restriction whenever the unobserved heterogeneity, $f_i$, is correlated with the independent variable, $X_{i,j}$, because $\bar{X}_{i,-j}$ is then necessarily also correlated with $f_i$. As noted earlier, such correlations are pervasive in practice. In this example, unobserved industry investment opportunities likely affect both ROA and leverage, making the proposed IV estimator inconsistent.

Unlike the other applications discussed in this section, the problem with the IV estimation cannot be solved by adding fixed effects to the estimating equation. Although fixed effects control for the unobserved heterogeneity, $f_i$, in the second stage estimation, the fixed effects reintroduce the endogeneity problem in the first stage estimation. Recall that the instrument, $\bar{X}_{i,-j}$, is just the group mean excluding the observation at hand. After controlling for industry fixed effects, the instrument becomes $-X_{i,j}$, which is perfectly correlated with the endogenous regressor, $X_{i,j}$. Put differently, the instrument exploits strictly industry-level variation, which is not well-identified in the presence of industry fixed effects. For a group average instrument to be valid, the independent variable, $X_{i,j}$, must be correlated with its group mean and the underlying economic source of this correlation must be unrelated to $f_i$ (the part of the industry variation that affects $y_{i,j}$). Although it is possible that there exist scenarios where these conditions hold, examples are rare. Researchers should not assume these conditions hold absent a strong economic justification.

(The passages above are excerpted from Gormley and Matsa (2014).)
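The inconsistency argument in the excerpt can be made concrete with a short derivation. This is only a sketch. It assumes that the model the excerpt calls Equation (1) has the form $y_{i,j} = \beta X_{i,j} + f_i + \varepsilon_{i,j}$, with $i$ indexing groups and $j$ indexing observations within a group (the excerpt itself does not reproduce the equation). It also assumes a single regressor and a single instrument, the leave-one-out group mean $\bar{X}_{i,-j}$.

```latex
\begin{aligned}
\hat{\beta}^{IV}
  &= \frac{\widehat{\operatorname{cov}}(\bar{X}_{i,-j},\, y_{i,j})}
          {\widehat{\operatorname{cov}}(\bar{X}_{i,-j},\, X_{i,j})}, \\[4pt]
\operatorname{plim}\, \hat{\beta}^{IV}
  &= \beta
   + \frac{\operatorname{cov}(\bar{X}_{i,-j},\, f_i)}
          {\operatorname{cov}(\bar{X}_{i,-j},\, X_{i,j})}
   + \frac{\operatorname{cov}(\bar{X}_{i,-j},\, \varepsilon_{i,j})}
          {\operatorname{cov}(\bar{X}_{i,-j},\, X_{i,j})}.
\end{aligned}
```

Even if the last term is zero, the middle term generally is not. If $X_{i,j}$ is correlated with $f_i$, the other members of group $i$ share the same $f_i$, so their average is correlated with it too, and the estimator does not converge to $\beta$.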

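For readers who find the original passages dense, here is a rough paraphrase. Why are group means not recommended as instruments? Because a group-mean instrument usually fails the exogeneity requirement, which makes the IV estimator inconsistent. Suppose we want to study the effect of firm ROA on leverage. We then have to deal with endogeneity arising from omitted variables and reverse causality. If we use the industry mean of ROA, $\bar{X}_{i,-j}$ (computed excluding firm $j$), as an instrument for the ROA of firm $j$ in industry $i$, $X_{i,j}$, a fatal problem arises. The ROA of the firms in an industry is typically affected by the industry fixed effect $f_i$, so the regressor $X_{i,j}$ is correlated with $f_i$, and the industry mean $\bar{X}_{i,-j}$ is then necessarily correlated with $f_i$ as well. When the industry fixed effect is not controlled for, it sits in the error term, so the industry mean of ROA is correlated with the error term and fails the exogeneity requirement for an instrument.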
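A small simulation illustrates the point. It is a sketch under assumptions of my own, not a replication of any paper: the data-generating process, the parameter values, and the names `simulate` and `cov` are all hypothetical. The outcome is y = beta * x + f + eps, the unobserved group effect f also drives x, and eps is correlated with the idiosyncratic part of x, which is the endogeneity the instrument is supposed to fix.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)


def simulate(beta=1.0, n_groups=2000, group_size=10, rho=0.5, seed=0):
    """Simulate y = beta*x + f + eps and return (OLS, IV) slope estimates.

    f is an unobserved group effect that also drives x; the instrument is
    the leave-one-out group mean of x. All parameters are illustrative.
    """
    rng = random.Random(seed)
    x, y, z = [], [], []
    for _ in range(n_groups):
        f = rng.gauss(0, 1)                        # unobserved group heterogeneity
        v = [rng.gauss(0, 1) for _ in range(group_size)]
        xs = [f + vj for vj in v]                  # x is correlated with f
        total = sum(xs)
        for xj, vj in zip(xs, v):
            eps = rho * vj + rng.gauss(0, 1)       # eps correlated with x
            x.append(xj)
            y.append(beta * xj + f + eps)
            z.append((total - xj) / (group_size - 1))  # leave-one-out mean
    ols = cov(x, y) / cov(x, x)
    iv = cov(z, y) / cov(z, x)  # IV estimator with one instrument, one regressor
    return ols, iv


ols_hat, iv_hat = simulate()
print(f"true beta = 1.0, OLS = {ols_hat:.2f}, IV = {iv_hat:.2f}")
```

In this particular design the IV estimate converges to beta + 1 = 2, while OLS converges to 1.75, so the group-mean instrument ends up further from the truth than doing nothing. That ranking is specific to these assumed parameters, but the inconsistency of the IV estimate is not.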
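Adding industry fixed effects to the model does not solve the problem either. The industry mean of ROA, $\bar{X}_{i,-j}$ (computed excluding firm $j$), is almost collinear with the industry fixed effect; the only difference is that its calculation leaves out $X_{i,j}$ (firm $j$'s own ROA). Once industry fixed effects are controlled for, the only new information $\bar{X}_{i,-j}$ provides is $-X_{i,j}$, that is, firm $j$'s own ROA with the sign flipped, which coincides with the endogenous variable $X_{i,j}$. In that case $\bar{X}_{i,-j}$ serves no purpose as an instrument; it is completely useless.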
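Why fixed effects do not rescue the instrument can be checked with a few lines of algebra. This is a sketch that introduces some notation for illustration: $\bar{X}_{i,-j}$ is the leave-one-out mean for observation $j$ in group $i$, $\bar{X}_i$ is the full group mean including observation $j$, and $n_i$ is the group size.

```latex
\begin{aligned}
\bar{X}_{i,-j} &= \frac{n_i \bar{X}_i - X_{i,j}}{n_i - 1}, \\[4pt]
\frac{1}{n_i} \sum_{j} \bar{X}_{i,-j}
  &= \frac{n_i \bar{X}_i - \bar{X}_i}{n_i - 1} = \bar{X}_i, \\[4pt]
\bar{X}_{i,-j} - \bar{X}_i
  &= -\frac{X_{i,j} - \bar{X}_i}{n_i - 1}.
\end{aligned}
```

Group fixed effects remove group means, so what is left of the instrument is an exact negative multiple of the within-group variation in $X_{i,j}$ itself. With equal group sizes the IV estimate then coincides with the fixed-effects OLS estimate, and whatever endogeneity remains in the within-group variation of $X_{i,j}$ is left untouched.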

References

[1] Todd A. Gormley, David A. Matsa. Common Errors: How to (and Not to) Control for Unobserved Heterogeneity[J]. The Review of Financial Studies, 2014, 27(2): 617–661.

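[2] Qiu Jiaping (邱嘉平). 因果推断实用计量方法 (Practical Econometric Methods for Causal Inference)[M]. Shanghai: Shanghai University of Finance and Economics Press, 2020.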
