How to Do xtabond2
Working Paper Number 103
December 2006

How to Do xtabond2: An Introduction to “Difference” and “System” GMM in Stata
By David Roodman

Electronic copy of this paper is available at: http://ssrn.com/abstract=982943

Abstract

The Arellano-Bond (1991) and Arellano-Bover (1995)/Blundell-Bond (1998) linear generalized method of moments (GMM) estimators are increasingly popular. Both are general estimators designed for situations with “small T, large N” panels, meaning few time periods and many individuals; with independent variables that are not strictly exogenous, meaning correlated with past and possibly current realizations of the error; with fixed effects; and with heteroskedasticity and autocorrelation within individuals. This pedagogic paper first introduces linear GMM. Then it shows how limited time span and the potential for fixed effects and endogenous regressors drive the design of the estimators of interest, offering Stata-based examples along the way. Next it shows how to apply these estimators with xtabond2. It also explains how to perform the Arellano-Bond test for autocorrelation in a panel after other Stata commands, using abar.

The Center for Global Development is an independent think tank that works to reduce global poverty and inequality through rigorous research and active engagement with the policy community. Use and dissemination of this Working Paper is encouraged; however, reproduced copies may not be used for commercial purposes. Further usage is permitted under the terms of the Creative Commons License. The views expressed in this paper are those of the author and should not be attributed to the directors or funders of the Center for Global Development. www.cgdev.org

How to Do xtabond2: An Introduction to “Difference” and “System” GMM in Stata[1]

David Roodman
December 2006, revised January 2007

[1] Research Fellow, Center for Global Development. I thank Manuel Arellano, Christopher Baum, Michael Clemens, Francisco Ciocchini, Decio Coviello, Mead Over, and Mark Schaffer for comments. And I thank all the users whose feedback has led to steady improvement in xtabond2. Address for correspondence: droodman@cgdev.org

Foreword

The Center for Global Development builds its policy recommendations on both theoretical and empirical analysis. The empirical analysis often draws on historical, non-experimental data, drawing inferences that rely partly on good judgment and partly on the considered application of the most sophisticated and appropriate econometric techniques available. This paper provides an introduction to econometric techniques that are specifically designed to extract causal lessons from data on a large number of individuals (whether countries, firms, or people), each of which is observed only a few times, such as annually over five or ten years. These techniques were developed in the 1990s by authors such as Manuel Arellano, Richard Blundell, and Olympia Bover, and have been widely applied to estimate everything from the impact of foreign aid to the importance of financial sector development to the effects of AIDS deaths on households.
The present paper contributes to this literature pedagogically, by providing an original synthesis and exposition of the literature on these “dynamic panel estimators,” and practically, by presenting the first implementation of some of these techniques in Stata, a statistical software package widely used in the research community. Stata is designed to encourage users to develop new commands for it, which other users can then use or even modify. David Roodman’s xtabond2, introduced here, is now one of the most frequently downloaded user-written Stata commands in the world. Stata’s partially open-source architecture has encouraged the growth of a vibrant worldwide community of researchers, which benefits not only from improvements made to Stata by the parent corporation, but also from the voluntary contributions of other users. Stata is arguably one of the best examples of a combination of private for-profit incentives and voluntary open-source incentives in the joint creation of a global public good. The Center for Global Development is pleased to contribute this paper and two commands, called xtabond2 and abar, to the research community.

Nancy Birdsall
President
Center for Global Development

Abstract

The “difference” and “system” generalized method of moments (GMM) estimators, developed by Holtz-Eakin, Newey, and Rosen (1988), Arellano and Bond (1991), Arellano and Bover (1995), and Blundell and Bond (1998), are increasingly popular. Both are general estimators designed for situations with “small T, large N” panels, meaning few time periods and many individuals; with independent variables that are not strictly exogenous, meaning correlated with past and possibly current realizations of the error; with fixed effects; and with heteroskedasticity and autocorrelation within individuals. This pedagogic paper first introduces linear GMM. Then it shows how limited time span and the potential for fixed effects and endogenous regressors drive the design of the estimators of interest, offering Stata-based examples along the way. Next it shows how to apply these estimators with xtabond2. It also explains how to perform the Arellano-Bond test for autocorrelation in a panel after other Stata commands, using abar. The paper closes with some tips for proper use.

1 Introduction

The Arellano-Bond (1991) and Arellano-Bover (1995)/Blundell-Bond (1998) dynamic panel estimators are increasingly popular. Both are general estimators designed for situations with 1) “small T, large N” panels, meaning few time periods and many individuals; 2) a linear functional relationship; 3) a single left-hand-side variable that is dynamic, depending on its own past realizations; 4) independent variables that are not strictly exogenous, meaning correlated with past and possibly current realizations of the error; 5) fixed individual effects; and 6) heteroskedasticity and autocorrelation within individuals, but not across them. Arellano-Bond estimation starts by transforming all regressors, usually by differencing, and uses the Generalized Method of Moments (Hansen 1982), and so is called “difference GMM.” (As we will discuss, the forward orthogonal deviations transform, proposed by Arellano and Bover (1995), is sometimes performed instead of differencing.) The Arellano-Bover/Blundell-Bond estimator augments Arellano-Bond by making an additional assumption, that first differences of instrumenting variables are uncorrelated with the fixed effects. This allows the introduction of more instruments, and can dramatically improve efficiency. It builds a system of two equations—the original equation as well as the transformed one—and is known as “system GMM.”
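Schematically, and with notation chosen here only for illustration (the formal setup comes in Sections 2 and 3), the kind of model just described might be written as

\[
y_{it} = \alpha\, y_{i,t-1} + x_{it}'\beta + \mu_i + \varepsilon_{it}, \qquad i = 1,\dots,N, \quad t = 1,\dots,T,
\]

where $\mu_i$ is a fixed individual effect, $x_{it}$ may be correlated with past and possibly current realizations of the error, and $\varepsilon_{it}$ may be heteroskedastic and serially correlated within individual $i$ but is assumed uncorrelated across individuals.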
The program xtabond2 implements these estimators. It has some important advantages over Stata’s built-in xtabond. It implements system GMM. It can make the Windmeijer (2005) finite-sample correction to the reported standard errors in two-step estimation, without which those standard errors tend to be severely downward biased. It offers forward orthogonal deviations, an alternative to differencing that preserves sample size in panels with gaps. And it allows finer control over the instrument matrix.

Interestingly, though the Arellano and Bond paper is now seen as the source of an estimator, it is entitled “Some Tests of Specification for Panel Data.” The instrument sets and use of GMM that largely define difference GMM originated with Holtz-Eakin, Newey, and Rosen (1988). One of Arellano and Bond’s contributions is a test for autocorrelation appropriate for linear GMM regressions on panels, which is especially important when lags are used as instruments. xtabond2, like xtabond, automatically reports this test. But since ordinary least squares (OLS) and two-stage least squares (2SLS) are special cases of linear GMM, the Arellano-Bond test has wider applicability. The post-estimation command abar, also introduced in this paper, makes the test available after regress, ivreg, ivreg2, newey, and newey2.
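To preview the flavor of both commands before the formal development, here is a minimal sketch of how they might be called. The dataset and variable names (id, year, y, x1, x2) are placeholders chosen for illustration, and the options shown are only one reasonable combination; Section 4 documents the full syntax.

    tsset id year                      // declare panel (id) and time (year) dimensions
    * System GMM: GMM-style instruments built from the lagged dependent variable,
    * x1 and x2 treated as standard instruments; twostep robust requests two-step
    * estimation with Windmeijer-corrected standard errors.
    xtabond2 y L.y x1 x2, gmm(L.y) iv(x1 x2) twostep robust small
    * (Adding the noleveleq option would request difference rather than system GMM.)
    * The Arellano-Bond autocorrelation test after a non-GMM command, via abar
    * (the lags() option is assumed here; see Section 4):
    regress y L.y x1 x2
    abar, lags(2)

Section 4 explains each option with examples, and Section 5 offers tips for good practice.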
One disadvantage of difference and system GMM is that they are complicated and can easily generate invalid estimates. Implementing them with a Stata command stuffs them into a black box, creating the risk that users, not understanding the estimators’ purpose, design, and limitations, will unwittingly misuse them. This paper aims to prevent that. Its approach is therefore pedagogic. Section 2 introduces linear GMM. Section 3 describes the problem these estimators are meant to solve, and shows how that drives their design. A few of the more complicated derivations in those sections are intentionally incomplete since their purpose is to build intuitions; the reader must refer to the original papers for details. Section 4 explains the xtabond2 and abar syntaxes, with examples. Section 5 concludes with a few tips for good practice.

2 Linear GMM[1]

[1] For another introduction to GMM, see Baum, Schaffer, and Stillman (2003). For a full account, see Ruud (2000, chs. 21–22). Both sources greatly influence this account.

2.1 The GMM estimator

The classic linear estimators, Ordinary Least Squares (OLS) and Two-Stage Least Squares (2SLS), can be thought of in several ways, the most intuitive being suggested by the estimators’ names. OLS minimizes the sum of the squared errors. 2SLS can be implemented via OLS regressions in two stages. But there is another, more unified way to view these estimators. In OLS, identification can be said to flow from the assumption that the regressors are orthogonal to the errors; in other words, the inner products, or moments, of the regressors with the errors are set to 0. In the more general 2SLS framework, which distinguishes between regressors and instruments while allowing the two categories to overlap (variables in both categories are included exogenous regressors), the estimation problem is to choose coefficients on the regressors so that the moments of the errors with the instruments are again 0. However, an ambiguity arises in conceiving of 2SLS as a matter of satisfying such moment conditions. What if there are more instruments than regressors? If equations (moment conditions) outnumber variables (parameters), the conditions cannot be expected to hold perfectly in finite samples even if they are true asymptotically. This is the sort of problem we are interested in. To be precise, we want to fit the model:

\[
y = x'\beta + \varepsilon, \qquad \mathrm{E}[z\varepsilon] = 0, \qquad \mathrm{E}[\varepsilon \mid z] = 0,
\]

where $\beta$ is a column of coefficients, $y$ and $\varepsilon$ are random variables, $x = [x_1 \ldots x_k]'$ is a column of $k$ regressors, $z = [z_1 \ldots z_j]'$ is a column of $j$ instruments, $x$ and $z$ may share elements, and $j \geq k$. We use $X$, $Y$, and $Z$ to represent matrices of $N$ observations for $x$, $y$, and $z$, and define $E = Y - X\beta$. Given an estimate $\hat\beta$, the empirical residuals are $\hat E = [\hat e_1 \ldots \hat e_N]' = Y - X\hat\beta$. We make no assumption at this point about $\mathrm{E}[EE' \mid Z] \equiv \Omega$ except that it exists.

The challenge in estimating this model is that while all the instruments are theoretically orthogonal to the error term ($\mathrm{E}[z\varepsilon] = 0$), trying to force the corresponding vector of empirical moments, $\mathrm{E}_N[z\varepsilon] = \frac{1}{N} Z'\hat E$, to zero creates a system with more equations than variables if instruments outnumber parameters. The specification is overidentified. Since we cannot expect to satisfy all the moment conditions at once, the problem is to satisfy them all as well as possible, in some sense, that is, to minimize the magnitude of the vector $\mathrm{E}_N[z\varepsilon]$.

In the Generalized Method of Moments, one defines that magnitude through a generalized metric, based on a positive semi-definite quadratic form. Let $A$ be the matrix for such a quadratic form. Then the metric is:

\[
\left\| \mathrm{E}_N[z\varepsilon] \right\|_A = \left\| \frac{1}{N} Z'\hat E \right\|_A \equiv N \left( \frac{1}{N} Z'\hat E \right)' A \left( \frac{1}{N} Z'\hat E \right) = \frac{1}{N} \hat E' Z A Z' \hat E. \tag{1}
\]

To derive the implied GMM estimate, call it $\hat\beta_A$, we solve the minimization problem $\hat\beta_A = \operatorname{argmin}_{\hat\beta} \left\| Z'\hat E \right\|_A$, whose solution is determined by $0 = \frac{d}{d\hat\beta} \left\| Z'\hat E \right\|_A$. Expanding this derivative with the chain rule gives:

\[
0 = \frac{d}{d\hat\beta} \left\| Z'\hat E \right\|_A
  = \frac{d}{d\hat E} \left\| Z'\hat E \right\|_A \frac{d\hat E}{d\hat\beta}
  = \frac{d}{d\hat E} \left( \frac{1}{N} \hat E' \left( Z A Z' \right) \hat E \right) \frac{d(Y - X\hat\beta)}{d\hat\beta}
  = \frac{2}{N} \hat E' Z A Z' \left( -X \right).
\]

The last step uses the matrix identities $d(Ab)/db = A$ and $d(b'Ab)/db = 2b'A$, where $b$ is a column vector and $A$ a symmetric matrix. Dropping the factor of $-2/N$ and transposing,

\[
0 = \hat E' Z A Z' X = \left( Y - X\hat\beta_A \right)' Z A Z' X = Y' Z A Z' X - \hat\beta_A' X' Z A Z' X
\;\Rightarrow\; X' Z A Z' X \hat\beta_A = X' Z A Z' Y
\;\Rightarrow\; \hat\beta_A = \left( X' Z A Z' X \right)^{-1} X' Z A Z' Y. \tag{2}
\]

This is the GMM estimator implied by $A$. It is linear in $Y$. The estimator is consistent, meaning that it converges in probability to $\beta$ as sample size goes to infinity (Hansen 1982). But it is not in general unbiased, as subsection 2.6 discusses, because in finite samples the instruments are not in general perfectly uncorrelated with the endogenous components of the instrumented regressors (correlation coefficients between finite samples of uncorrelated variables are usually not exactly 0). For future reference, we note that the error of the estimator is the corresponding projection of the true model errors:

\[
\hat\beta_A - \beta
= \left( X' Z A Z' X \right)^{-1} X' Z A Z' \left( X\beta + E \right) - \beta
= \left( X' Z A Z' X \right)^{-1} X' Z A Z' X \beta + \left( X' Z A Z' X \right)^{-1} X' Z A Z' E - \beta
= \left( X' Z A Z' X \right)^{-1} X' Z A Z' E. \tag{3}
\]
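As a concrete illustration of (2), the estimator can be computed in a few lines of Stata's matrix language, Mata. This is only a sketch: the matrices Y, X, Z, and A are assumed to be already defined in Mata, and the code is not drawn from xtabond2 itself.

    mata:
        // Linear GMM estimator implied by a weighting matrix A, as in (2):
        //   beta_A = (X'Z A Z'X)^(-1) X'Z A Z'Y
        XZ = X'Z                                  // k x j cross-product of regressors and instruments
        beta_A = invsym(XZ*A*XZ') * (XZ*A*(Z'Y))  // k x 1 coefficient vector
    end

Setting $A$ as in (4) below turns this same formula into efficient GMM, as in (5).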
2.2 Efficiency

It can be seen from (2) that multiplying $A$ by a non-zero scalar would not change $\hat\beta_A$. But up to a factor of proportionality, each choice of $A$ implies a different linear, consistent estimator of $\beta$. Which $A$ should the researcher choose? Setting $A = I$, the identity matrix, is intuitive, generally inefficient, and instructive. By (1) it would yield an equal-weighted Euclidean metric on the moment vector. To see the inefficiency, consider what happens if there are two mean-zero instruments, one drawn from a variable with variance 1, the other from a variable with variance 1,000. Moments based on the second would easily dominate under equal weighting, wasting the information in the first. Or imagine a cross-country growth regression instrumenting with two highly correlated proxies for the poverty level. The marginal information content in the second would be minimal, yet including it in the moment vector would essentially double the weight of poverty relative to other instruments. Notice that in both these examples, the inefficiency would theoretically be signaled by high variance or covariance among moments. This suggests that making $A$ scalar is inefficient unless the moments $\frac{1}{N} z_i' E$ have equal variance and are uncorrelated—that is, if $\mathrm{Var}[Z'E]$ is itself scalar. This is in fact the case, as will be seen.[2]

[2] This argument is identical to that for the design of Generalized Least Squares, except that GLS is derived with reference to the errors $E$ where GMM is derived with reference to the moments $Z'E$.

But that negative conclusion hints at the general solution. For efficiency, $A$ must in effect weight moments in inverse proportion to their variances and covariances. In the first example above, such reweighting would appropriately deemphasize the high-variance instrument. In the second, it would efficiently down-weight one or both of the poverty proxies. In general, for efficiency, we weight by the inverse of the variance matrix of the moments:

\[
A_{\mathrm{EGMM}} = \mathrm{Var}[Z'E]^{-1} = \left( Z' \mathrm{Var}[E \mid Z] Z \right)^{-1} = \left( Z' \Omega Z \right)^{-1}. \tag{4}
\]

The “EGMM” stands for “efficient GMM.” The EGMM estimator minimizes

\[
\left\| Z'\hat E \right\|_{A_{\mathrm{EGMM}}} = \frac{1}{N} \left( Z'\hat E \right)' \mathrm{Var}[Z'E]^{-1} Z'\hat E.
\]

Substituting this choice of $A$ into (2) gives the direct formula for efficient GMM:

\[
\hat\beta_{\mathrm{EGMM}} = \left( X' Z \left( Z'\Omega Z \right)^{-1} Z' X \right)^{-1} X' Z \left( Z'\Omega Z \right)^{-1} Z' Y. \tag{5}
\]

Efficient GMM is not feasible, however, unless $\Omega$ is known.
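To see what (4) does in the simplest case, return to the first example above and suppose, purely for illustration, that the two moments are uncorrelated with variances proportional to 1 and 1,000. Then

\[
\mathrm{Var}[Z'E] \propto \mathrm{diag}(1,\,1000), \qquad A_{\mathrm{EGMM}} \propto \mathrm{diag}(1,\,1/1000),
\]

so that, by (1), the criterion is proportional to $m_1^2 + m_2^2/1000$, where $m_1$ and $m_2$ denote the two empirical moments. The noisy second moment is down-weighted by a factor of 1,000 rather than allowed to dominate, and the information in the first instrument is no longer wasted.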
Before we move to making the estimator feasible, we demonstrate its theoretical efficiency. Let $\mathcal{B}$ be the vector space of linear, scalar-valued functions of the random vector $Y$. This space contains all the coefficient estimates flowing from linear estimators based on $Y$. For example, if $c = (1 \; 0 \; 0 \ldots)$ then $c\hat\beta_A \in \mathcal{B}$ is the estimated coefficient for $x_1$ according to the GMM estimator implied by some $A$. We define an inner product on $\mathcal{B}$ by $\langle b_1, b_2 \rangle = \mathrm{Cov}[b_1, b_2]$; the corresponding metric is $\|b\|^2 = \mathrm{Var}[b]$. The assertion that (5) is efficient is equivalent to saying that for any row vector $c$, the variance of the corresponding combination of coefficients from an estimate, $\| c\hat\beta_A \|$, is smallest when $A = A_{\mathrm{EGMM}}$.

In order to demonstrate that, we first show that $\langle c\hat\beta_A, c\hat\beta_{A_{\mathrm{EGMM}}} \rangle$ is invariant in the choice of $A$. We start with the definition of the covariance matrix and substitute in with (3) and (4):

\[
\begin{aligned}
\left\langle c\hat\beta_A, c\hat\beta_{A_{\mathrm{EGMM}}} \right\rangle
&= \mathrm{Cov}\left[ c\hat\beta_A, c\hat\beta_{A_{\mathrm{EGMM}}} \right] \\
&= \mathrm{Cov}\left[ c \left( X'ZAZ'X \right)^{-1} X'ZAZ'Y, \; c \left( X'Z A_{\mathrm{EGMM}} Z'X \right)^{-1} X'Z A_{\mathrm{EGMM}} Z'Y \right] \\
&= c \left( X'ZAZ'X \right)^{-1} X'ZAZ' \, \mathrm{E}\left[ EE' \mid Z \right] Z \left( Z'\Omega Z \right)^{-1} Z'X \left( X'Z \left( Z'\Omega Z \right)^{-1} Z'X \right)^{-1} c' \\
&= c \left( X'ZAZ'X \right)^{-1} X'ZAZ' \Omega Z \left( Z'\Omega Z \right)^{-1} Z'X \left( X'Z \left( Z'\Omega Z \right)^{-1} Z'X \right)^{-1} c' \\
&= c \left( X'ZAZ'X \right)^{-1} X'ZAZ'X \left( X'Z \left( Z'\Omega Z \right)^{-1} Z'X \right)^{-1} c' \\
&= c \left( X'Z \left( Z'\Omega Z \right)^{-1} Z'X \right)^{-1} c'.
\end{aligned}
\]

This does not depend on $A$. As a result, for any $A$, $\left\langle c\hat\beta_{A_{\mathrm{EGMM}}}, c\left( \hat\beta_{A_{\mathrm{EGMM}}} - \hat\beta_A \right) \right\rangle = \left\langle c\hat\beta_{A_{\mathrm{EGMM}}}, c\hat\beta_{A_{\mathrm{EGMM}}} \right\rangle - \left\langle c\hat\beta_{A_{\mathrm{EGMM}}}, c\hat\beta_A \right\rangle = 0$. That is, the difference between any linear GMM estimator and the EGMM estimator is orthogonal to the latter. By the Pythagorean Theorem,

\[
\left\| c\hat\beta_A \right\|^2 = \left\| c\hat\beta_A - c\hat\beta_{A_{\mathrm{EGMM}}} \right\|^2 + \left\| c\hat\beta_{A_{\mathrm{EGMM}}} \right\|^2 \geq \left\| c\hat\beta_{A_{\mathrm{EGMM}}} \right\|^2,
\]

which suffices to prove the assertion. This result is akin to the fact that, if there is a ball in midair, the point on the ground closest to the ball (analogous to the efficient estimator) is the one such that the vector from the point to the ball is perpendicular to all vectors from the point to other spots on the ground (which are all inferior estimators of the ball’s position).

Perhaps greater insight comes from a visualization based on another derivation of efficient GMM. Under the assumptions in our model, a direct OLS estimate of $Y = X\beta + E$ is biased. However, taking $Z$-moments of both sides gives

\[
Z'Y = Z'X\beta + Z'E, \tag{6}
\]

which is asymptotically amenable to OLS, since the regressors, $Z'X$, are now orthogonal to the errors: $\mathrm{E}\left[ (Z'X)' Z'E \right] = (Z'X)' \mathrm{E}[Z'E \mid Z] = 0$ (Holtz-Eakin, Newey, and Rosen 1988). Still, though, OLS is not in general efficient on the transformed equation, since the errors are not i.i.d.—$\mathrm{Var}[Z'E] = Z'\Omega Z$, which cannot be assumed scalar. To solve this problem, we transform the equation again:

\[
\left( Z'\Omega Z \right)^{-1/2} Z'Y = \left( Z'\Omega Z \right)^{-1/2} Z'X\beta + \left( Z'\Omega Z \right)^{-1/2} Z'E. \tag{7}
\]

Defining $X^* = \left( Z'\Omega Z \right)^{-1/2} Z'X$, $Y^* = \left( Z'\Omega Z \right)^{-1/2} Z'Y$, and $E^* = \left( Z'\Omega Z \right)^{-1/2} Z'E$, the equation becomes

\[
Y^* = X^*\beta + E^*. \tag{8}
\]

Since

\[
\mathrm{Var}[E^* \mid Z] = \left( Z'\Omega Z \right)^{-1/2} Z' \mathrm{Var}[E \mid Z] Z \left( Z'\Omega Z \right)^{-1/2} = \left( Z'\Omega Z \right)^{-1/2} Z'\Omega Z \left( Z'\Omega Z \right)^{-1/2} = I,
\]

this version has spherical errors. So the Gauss-Markov Theorem guarantees the efficiency of OLS on (8), which is, by definition, Generalized Least Squares on (6):

\[
\hat\beta_{\mathrm{GLS}} = \left( X^{*\prime} X^* \right)^{-1} X^{*\prime} Y^*.
\]

Unwinding with the definitions of $X^*$ and $Y^*$ yields efficient GMM, just as in (5).

Efficient GMM, then, is GLS on $Z$-moments. Where GLS projects $Y$ into the column space of $X$, GMM estimators, efficient or otherwise, project $Z'Y$ into the column space of $Z'X$. These projections also map the variance ellipsoid of $Z'Y$, namely $Z'\Omega Z$, which is also the variance ellipsoid of the moments, into the column space of $Z'X$. If $Z'\Omega Z$ happens to be spherical, then the efficient projection is orthogonal, by Gauss-Markov, just as the shadow of a soccer ball is smallest when the sun is directly overhead. But if the variance ellipsoid of the moments is an American football pointing at an odd angle, as in the examples at the beginning of this subsection—if $Z'\Omega Z$ is not spherical—then the efficient projection, the one casting the smallest shadow, is angled. To make that optimal