October 9 Lecture Notes


Intermediate Econometrics (7): Multiple Regression, Part 2
October 9, 2006

Chapter Outline

- Motivation for Multiple Regression
- Mechanics and Interpretation of Ordinary Least Squares
- The Expected Values of the OLS Estimators
- The Variance of the OLS Estimators
- Efficiency of OLS: The Gauss-Markov Theorem

Lecture Outline

- Assumptions MLR.1 - MLR.4
- The Unbiasedness of the OLS Estimators
- Overspecification and Underspecification of Models
- Omitted Variable Bias
- Sampling Variance of the OLS Slope Estimators

The Expected Value of the OLS Estimators

We now turn to the statistical properties of OLS as an estimator of the parameters of an underlying population model.

Statistical properties are properties of an estimator under repeated random sampling. It is not meaningful to talk about the statistical properties of a set of estimates obtained from a single sample. ("As distance tests a horse's strength, so time reveals a person's heart.")

Assumption MLR.1 (Linear in Parameters)

In the population model, the dependent variable y is related to the independent variables x1, ..., xk and the error u as

    y = b0 + b1*x1 + b2*x2 + ... + bk*xk + u

where b1, b2, ..., bk are the unknown parameters of interest and u is an unobservable random error, or random disturbance, term.

The population model is also called the true model, to allow for the possibility that the model we actually estimate differs from (3.31).
Assumption MLR.2 (Random Sampling)

We can use a random sample of size n from the population, {(xi1, xi2, ..., xik; yi): i = 1, ..., n}, where i indexes observations and j = 1, ..., k indexes the jth regressor. Sometimes we write the model as

    yi = b0 + b1*xi1 + b2*xi2 + ... + bk*xik + ui

Assumption MLR.3 (Zero Conditional Mean)

    E(u | x1, x2, ..., xk) = 0

When this assumption holds, we say all of the explanatory variables are exogenous; when it fails, we say the explanatory variables are endogenous. We will pay particular attention to the case in which Assumption MLR.3 fails because of omitted variables.

Assumption MLR.4 (No Perfect Collinearity)

In the sample, none of the independent variables is constant, and there are no exact linear relationships among the independent variables. When one regressor is an exact linear combination of the other regressor(s), we say the model suffers from perfect collinearity.

Examples of perfect collinearity:

- y = b0 + b1*x1 + b2*x2 + b3*x3 + u, with x2 = 3*x3
- y = b0 + b1*log(inc) + b2*log(inc^2) + u, since log(inc^2) = 2*log(inc)
- y = b0 + b1*x1 + b2*x2 + b3*x3 + b4*x4 + u, with x1 + x2 + x3 + x4 = 1

Perfect collinearity also occurs when n < k + 1, that is, when there are fewer observations than parameters to estimate.

Why does perfect collinearity matter?

- First, consider it from the perspective of ceteris paribus analysis.
- Second, consider it from the perspective of "partialling out"; in particular, look at the residual form of the formula for the b estimates.
- Third, consider it from the implementation of the OLS estimators.
The denominator of the OLS estimator is zero under perfect collinearity, so the OLS estimates cannot be computed.

Theorem 3.1 (Unbiasedness of OLS)

Under Assumptions MLR.1 through MLR.4, the OLS estimators are unbiased estimators of the population parameters, that is,

    E(bj_hat) = bj,  j = 0, 1, ..., k

Unbiasedness is a property of an estimator, not of an estimate. An estimator is a procedure that produces an estimate from any given sample; what we evaluate is the quality of the procedure. It is therefore not correct to say "5 percent is an unbiased estimate of the return to education."

Too Many or Too Few Variables?

What happens if we include variables in our specification that do not belong? A model is overspecified when one or more of the independent variables is included in the model even though it has no partial effect on y in the population. Overspecification has no effect on the unbiasedness of the OLS estimators, but it can have undesirable effects on their variances.

What if we exclude a variable from our specification that does belong? If a variable that actually belongs in the true model is omitted, we say the model is underspecified. OLS will then usually be biased. Deriving the bias caused by omitting an important variable is an example of misspecification analysis.
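The unbiasedness result in Theorem 3.1 can be illustrated with a small Monte Carlo sketch: draw many random samples from a fixed population model and average the OLS estimates across samples. All numbers below (true parameters, sample size, number of replications) are illustrative choices, not values from the lecture.

```python
# Monte Carlo sketch of Theorem 3.1: under MLR.1-MLR.4, the OLS estimates
# averaged over many random samples recover the true population parameters.
import numpy as np

rng = np.random.default_rng(0)
b_true = np.array([1.0, 2.0, -0.5])   # hypothetical b0, b1, b2
n, reps = 100, 5000

estimates = np.empty((reps, 3))
for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    u = rng.normal(size=n)            # E(u | x1, x2) = 0  (MLR.3)
    y = b_true[0] + b_true[1]*x1 + b_true[2]*x2 + u
    X = np.column_stack([np.ones(n), x1, x2])
    estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0]

print(estimates.mean(axis=0))         # close to [1.0, 2.0, -0.5]
```

Note that each single-sample estimate scatters around the truth; it is the average over repeated samples that matches b_true, which is exactly what "unbiasedness is a property of the estimator, not an estimate" means.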
Omitted Variable Bias

Suppose the true model is y = b0 + b1*x1 + b2*x2 + u, but we estimate the simple regression of y on x1 only. The resulting slope estimator b1_tilde satisfies E(b1_tilde) = b1 + b2*delta1, where delta1 is the slope from the sample regression of x2 on x1. The term b2*delta1 is the omitted variable bias.

Omitted Variable Bias: Summary

Two cases in which the bias is zero:

- b2 = 0, that is, x2 does not really belong in the model
- x1 and x2 are uncorrelated in the sample

If the correlations of x2 with x1 and of x2 with y have the same direction, the bias is positive; if they have opposite directions, the bias is negative.

Summary of the direction of the bias:

                    Corr(x1, x2) > 0    Corr(x1, x2) < 0
    b2 > 0          positive bias       negative bias
    b2 < 0          negative bias       positive bias

In general, b2 is unknown; moreover, when a variable is omitted, it is usually because that variable is unobserved, so we cannot know the sign of Corr(x1, x2) exactly either. What to do? We rely on economic theory and intuition to make an educated guess about the sign of the bias.

Example: Hourly Wage Equation

Suppose the model log(wage) = b0 + b1*educ + b2*abil + u is estimated with abil omitted. What is the direction of the bias in b1? Since ability generally has a positive partial effect on y, and ability and years of education are positively correlated, we expect b1 to be biased upward.
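The upward bias in the wage example can be sketched numerically. The data-generating process below is hypothetical: the coefficients, the strength of the educ-abil correlation, and the error scale are all assumed for illustration, mimicking only the structure of the example.

```python
# Omitted variable bias sketch: true model log(wage) = b0 + b1*educ + b2*abil + u,
# estimated with abil omitted. Since b2 > 0 and Corr(educ, abil) > 0, the
# short regression's slope is biased upward: it centers on b1 + b2*delta1.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
b0, b1, b2 = 0.5, 0.08, 0.10              # hypothetical values

abil = rng.normal(size=n)
educ = 12 + 2*abil + rng.normal(size=n)   # educ positively correlated with abil
logwage = b0 + b1*educ + b2*abil + rng.normal(scale=0.3, size=n)

# Short regression: omit abil
X_short = np.column_stack([np.ones(n), educ])
b_short = np.linalg.lstsq(X_short, logwage, rcond=None)[0]

# delta1: slope from regressing the omitted variable abil on educ
delta1 = np.cov(abil, educ)[0, 1] / np.var(educ)
print(b_short[1], b1 + b2*delta1)         # short slope ~ b1 + b2*delta1 > b1
```

With a large sample the short-regression slope sits almost exactly at b1 + b2*delta1, above the true b1, matching the sign rule in the summary above.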
The More General Case

Technically, it is more difficult to derive the sign of omitted variable bias when there are multiple regressors. But remember: if an omitted variable has a partial effect on y and is correlated with at least one of the regressors, then the OLS estimators of all the coefficients will generally be biased.

An Example

In wage1.dta, Corr(wage, construc) = 0.004. "reg wage construc" gives an estimated coefficient of 0.07, with a t statistic of 0.09. But "reg wage construc educ" gives an estimated coefficient on construc of 0.63, with a t statistic of 0.89.

Variance of the OLS Estimators

We now know that the sampling distribution of our estimator is centered around the true parameter. Next we want to know how spread out this distribution is. This matters in practice: a larger variance means a less precise estimator, hence wider confidence intervals and less accurate hypothesis tests. We need an additional assumption.

Assumption MLR.5 (Homoskedasticity)

    Var(u | x1, x2, ..., xk) = s2

This means that the variance of the error term u, conditional on the explanatory variables, is the same for all combinations of values of the explanatory variables. If the assumption fails, we say the model exhibits heteroskedasticity.
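The difference between the two cases can be made concrete with simulated errors. The two error structures below are illustrative assumptions, not taken from the lecture: one has constant conditional variance, the other has a variance that grows with x.

```python
# Sketch of Assumption MLR.5: under homoskedasticity Var(u|x) is constant,
# while under heteroskedasticity it changes with x. Comparing the spread of
# u within low-x and high-x groups makes the difference visible.
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
x = rng.uniform(1, 5, size=n)

u_homo = rng.normal(scale=1.0, size=n)        # Var(u|x) = 1 for every x
u_hetero = rng.normal(scale=x, size=n)        # Var(u|x) = x**2, grows with x

lo, hi = x < 2, x > 4
print(u_homo[lo].std(), u_homo[hi].std())     # roughly equal
print(u_hetero[lo].std(), u_hetero[hi].std()) # clearly different
```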
Variance of OLS (continued)

Let x stand for (x1, x2, ..., xk). Assuming Var(u | x) = s2 also implies that Var(y | x) = s2. Assumptions MLR.1-MLR.5 are collectively known as the Gauss-Markov assumptions.

Theorem 3.2 (Sampling Variances of the OLS Slope Estimators)

Under the Gauss-Markov assumptions,

    Var(bj_hat) = s2 / [SSTj * (1 - Rj^2)],  j = 1, ..., k

where SSTj is the total sample variation in xj and Rj^2 is the R-squared from regressing xj on all the other independent variables.

Interpreting Theorem 3.2

The variances of the estimated slope coefficients are influenced by three factors:

- the error variance,
- the total sample variation, and
- linear relationships among the independent variables.

(1) The Error Variance

A larger s2 implies a larger variance for the OLS estimators. A larger s2 means more "noise" in the equation, which makes it more difficult to extract the exact partial effect of a regressor on the regressand. Introducing more regressors can reduce the error variance, but this is not always possible, nor always desirable. Note that s2 does not depend on the sample size.

(2) The Total Sample Variation

A larger SSTj implies a smaller variance for the estimator, and vice versa. Everything else being equal, more sample variation in xj is always preferred. One way to get more sample variation is to increase the sample size; this component of the parameter variance does depend on the sample size.
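The variance formula of Theorem 3.2 can be checked by simulation: holding the regressors fixed (the formula is conditional on X) and redrawing the errors, the sampling variance of a slope estimate should match s2 / (SSTj * (1 - Rj^2)). The design below (sample size, correlation between the regressors, true coefficients) is an illustrative assumption.

```python
# Simulation check of Theorem 3.2 for the slope on x1:
# Var(b1_hat) = sigma^2 / (SST_1 * (1 - R_1^2)).
import numpy as np

rng = np.random.default_rng(3)
n, reps, sigma = 50, 20_000, 1.0
# Fix the regressors across replications; only the errors are redrawn.
x1 = rng.normal(size=n)
x2 = 0.8*x1 + 0.6*rng.normal(size=n)          # x2 correlated with x1
X = np.column_stack([np.ones(n), x1, x2])

slopes = np.empty(reps)
for r in range(reps):
    y = 1.0 + 2.0*x1 - 0.5*x2 + rng.normal(scale=sigma, size=n)
    slopes[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]

sst1 = np.sum((x1 - x1.mean())**2)            # total sample variation in x1
# R_1^2: R-squared from regressing x1 on the other regressor(s)
Z = np.column_stack([np.ones(n), x2])
fit = Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
r1_sq = 1 - np.sum((x1 - fit)**2) / sst1

theo = sigma**2 / (sst1 * (1 - r1_sq))
print(slopes.var(), theo)                     # approximately equal
```

Raising sigma, shrinking sst1, or pushing Rj^2 toward 1 in this setup inflates the simulated variance in exactly the three ways the interpretation describes.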
(3) Multicollinearity

A larger Rj^2 implies a larger variance for the estimator. A large Rj^2 means the other regressors can explain much of the variation in xj. When Rj^2 is very close to 1, xj is highly correlated with the other regressors; this is called multicollinearity. Severe multicollinearity means the variance of the estimated parameter will be very large.

Multicollinearity is a data problem. It can be reduced by appropriately dropping certain variables, by collecting more data, and so on. Note that a high degree of correlation between certain independent variables can be irrelevant to how well we can estimate the other parameters in the model.

Summary

Important points of this lecture:

- the Gauss-Markov assumptions;
- the consequences of overspecification and underspecification;
- what omitted variable bias is;
- the three components of the variance of an estimated parameter, and how they affect its magnitude.