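Econometrics
Lecture 7: Model Selection: Criteria and Tests
2011/6/7
Outline
1. Attributes of a "good" model
2. Specification errors: types and consequences
3. Specification errors: diagnosis and remedies
1. A "good" model: criteria (Harvey)
I. Parsimony: simpler is preferred to more complex
II. Identifiability: the parameter estimates exist and are unique
III. Goodness of fit: e.g. $R^2$
IV. Theoretical consistency: agreement with the underlying theory
V. Predictive power: compare predicted values with actual values
2. Model specification errors: types
Errors in the choice of explanatory variables
  Omitting relevant variables
  Including irrelevant variables
Errors in the choice of functional form
Measurement errors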
Example 7-1: U.S. expenditure on imported goods (Y) and personal disposable income (X)
obs     Y        X          obs     Y        X
1968    135.7    1551.3     1978    274.1    2167.4
1969    144.6    1599.8     1979    277.9    2212.6
1970    150.9    1668.1     1980    253.6    2214.3
1971    166.2    1728.4     1981    258.7    2248.6
1972    190.7    1797.4     1982    249.5    2261.5
1973    218.2    1916.3     1983    282.2    2331.9
1974    211.8    1896.9     1984    351.1    2469.8
1975    187.9    1931.7     1985    367.9    2542.8
1976    229.9    2001.0     1986    412.3    2640.9
1977    259.4    2066.6     1987    439.0    2686.3
Model 1: Y C X T
Dependent Variable: Y
Method: Least Squares
Included observations: 20

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -883.1170      116.1664      -7.602172      0.0000
X           0.647000       0.074536      8.680311       0.0000
@TREND      -23.19519      4.270415      -5.431601      0.0000

R-squared             0.977624     Mean dependent var       253.0800
Adjusted R-squared    0.974992     S.D. dependent var       85.78792
S.E. of regression    13.56647     Akaike info criterion    8.190561
Sum squared resid     3128.836     Schwarz criterion        8.339921
Log likelihood        -78.90561    Hannan-Quinn criter.     8.219718
F-statistic           371.3761     Durbin-Watson stat       1.363316
Prob(F-statistic)     0.000000
DL = 1.100; DU = 1.537

$$\hat{Y}_t = -883.117 + 0.647\,X_t - 23.195\,t$$
$$t = (-7.602)\quad(8.680)\quad(-5.432), \qquad \bar{R}^2 = 0.975$$
Model 2: Y C X
Dependent Variable: Y
Included observations: 20

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -261.0914      31.32660      -8.334495      0.0000
X           0.245231       0.014759      16.61575       0.0000

R-squared             0.938793     Mean dependent var       253.0800
Adjusted R-squared    0.935392     S.D. dependent var       85.78792
S.E. of regression    21.80559     Akaike info criterion    9.096849
Sum squared resid     8558.709     Schwarz criterion        9.196423
Log likelihood        -88.96849    F-statistic              276.0832
Durbin-Watson stat    0.595073     Prob(F-statistic)        0.000000
DL = 1.100; DU = 1.537

$$\hat{Y}_t = -261.09 + 0.245\,X_t$$
$$t = (-8.334)\quad(16.616), \qquad \bar{R}^2 = 0.935$$
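Omitting relevant variables: consequences
Model 1: $Y_t = \beta_0 + \beta_1 X_t + \beta_2 t + \mu_t$
Model 2: $Y_t = \alpha_0 + \alpha_1 X_t + v_t$
1. The parameter estimators are biased: $E(\hat\alpha_0) \neq \beta_0$; $E(\hat\alpha_1) \neq \beta_1$
2. The parameter estimators are inconsistent
3. The variances of the parameter estimators are incorrect
4. The estimator of the error variance is incorrect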
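Omitted variable bias: explanation
Model 1: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \mu$
Model 2: $Y = \alpha_0 + \alpha_1 X_1 + v$
OLS regression of Model 2 gives
$$\hat\alpha_1 = \frac{\sum x_{1i} y_i}{\sum x_{1i}^2}$$
where lowercase letters denote deviations from sample means, so that Model 1 in deviation form is
$$y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + (\mu_i - \bar\mu)$$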
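Bias and inconsistency of the parameter estimator:
$$\hat\alpha_1 = \frac{\sum x_{1i} y_i}{\sum x_{1i}^2}$$
Substituting $y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + (\mu_i - \bar\mu)$:
$$\hat\alpha_1 = \frac{\sum x_{1i}(\beta_1 x_{1i} + \beta_2 x_{2i} + \mu_i - \bar\mu)}{\sum x_{1i}^2} = \beta_1 + \beta_2\frac{\sum x_{1i}x_{2i}}{\sum x_{1i}^2} + \frac{\sum x_{1i}(\mu_i - \bar\mu)}{\sum x_{1i}^2}$$
$$E(\hat\alpha_1) = \beta_1 + \beta_2\frac{\sum x_{1i}x_{2i}}{\sum x_{1i}^2}, \qquad \operatorname{plim}\hat\alpha_1 = \beta_1 + \beta_2\frac{\operatorname{Cov}(X_1, X_2)}{\operatorname{Var}(X_1)}$$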
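A small simulation can make the bias formula concrete. The sketch below is illustrative only: the sample size, the coefficient values (`beta1`, `beta2`), the correlation parameter `rho`, and the function name `simulate_omitted_variable_bias` are assumptions chosen for the example, not values from the lecture. It generates data from the two-regressor model, fits the short regression of Y on X1 alone, and compares the estimated slope with $\beta_1 + \beta_2\,\mathrm{Cov}(X_1, X_2)/\mathrm{Var}(X_1)$.

```python
import numpy as np

def simulate_omitted_variable_bias(n=5000, beta1=2.0, beta2=3.0, rho=0.6, seed=0):
    """Return (short-regression slope, slope predicted by the bias formula)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    # X2 is built to be correlated with X1 (assumed design: Cov(X1, X2) = rho)
    x2 = rho * x1 + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
    mu = rng.normal(size=n)
    y = 1.0 + beta1 * x1 + beta2 * x2 + mu

    # Short regression of Y on X1 only, written in deviation form
    x1d = x1 - x1.mean()
    yd = y - y.mean()
    alpha1_hat = float(x1d @ yd / (x1d @ x1d))

    # beta1 + beta2 * Cov(X1, X2) / Var(X1), using sample moments
    predicted = beta1 + beta2 * np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)
    return alpha1_hat, float(predicted)

alpha1_hat, predicted = simulate_omitted_variable_bias()
print(f"short-regression slope: {alpha1_hat:.3f}")
print(f"beta1 + beta2*Cov/Var:  {predicted:.3f}")
```

With a positive `beta2` and positively correlated regressors, the short-regression slope overstates `beta1`; setting `rho=0` should remove most of the gap.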
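Incorrect variance of the parameter estimator: explanation
From $Y = \alpha_0 + \alpha_1 X_1 + v$:
$$\operatorname{Var}(\hat\alpha_1) = \frac{\sigma^2}{\sum x_{1i}^2}$$
From $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \mu$:
$$\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2 \sum x_{2i}^2}{\sum x_{1i}^2 \sum x_{2i}^2 - \left(\sum x_{1i}x_{2i}\right)^2} = \frac{\sigma^2}{\sum x_{1i}^2\,(1 - r_{x_1x_2}^2)}$$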
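Bias of the estimator of the error variance: explanation
Let $\tilde{Y}_i = \hat\alpha_0 + \hat\alpha_1 X_{1i}$ denote the fitted values of the misspecified model and $\hat{Y}_i$ the fitted values of the correctly specified model. Then
$$(n-2)\,\tilde\sigma^2 = RSS = \sum (Y_i - \hat\alpha_0 - \hat\alpha_1 X_{1i})^2 = \sum (Y_i - \tilde{Y}_i)^2 = \sum (Y_i - \hat{Y}_i + \hat{Y}_i - \tilde{Y}_i)^2$$
$$= \sum (Y_i - \hat{Y}_i)^2 + 2\sum (Y_i - \hat{Y}_i)(\hat{Y}_i - \tilde{Y}_i) + \sum (\hat{Y}_i - \tilde{Y}_i)^2$$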
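Comparison
$$\hat{t} = -26.82 + 0.017\,X_t$$
$$t = (-24.929)\quad(34.177), \qquad R^2 = 0.984$$

$$\hat{Y}_t = -883.117 + 0.647\,X_t - 23.195\,t$$
$$\text{std} = (116.17)\quad(0.075)\quad(4.270)$$
$$t = (-7.602)\quad(8.680)\quad(-5.432), \qquad \bar{R}^2 = 0.975, \qquad \hat\sigma = 13.57$$

$$\hat{Y}_t = -261.09 + 0.245\,X_t$$
$$\text{std} = (31.327)\quad(0.015)$$
$$t = (-8.334)\quad(16.616), \qquad \bar{R}^2 = 0.935, \qquad \hat\sigma = 21.81$$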
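As a rough consistency check, the omitted-variable formula links the three fitted equations of this comparison. The sketch below uses only the rounded coefficients reported on the slide; the symbol $\hat\delta_1$ for the slope of the auxiliary regression of $t$ on $X$ is introduced here for illustration and is not the lecture's notation.

```latex
\hat\alpha_1 \;=\; \hat\beta_1 + \hat\beta_2\,\hat\delta_1
\;\approx\; 0.647 + (-23.195)(0.017)
\;\approx\; 0.647 - 0.394
\;=\; 0.253
```

This is close to the slope of 0.245 in the regression of Y on X alone; the remaining gap is consistent with the auxiliary slope having been rounded to three decimals.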
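Overfitting: including irrelevant variables
Irrelevant variable bias
In specifying the model, unnecessary explanatory variables are included
Suppose the correct model is $Y = \beta_0 + \beta_1 X_1 + \mu$ (1)
but the model estimated is $Y = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \mu$ (2)
$X_2$: irrelevant variable
$Y = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \mu$
Under the CLRM assumptions, the OLS estimators are unbiased and consistent:
$E(\hat\alpha_0) = \beta_0$; $E(\hat\alpha_1) = \beta_1$; $E(\hat\alpha_2) = 0$
$\sigma^2$ is estimated correctly; the t and F tests remain valid
However, the coefficient estimators are inefficient: their variances are larger than under the correct model
As a result, the precision of inference falls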
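Overfitting: the OLS estimators no longer have minimum variance
From $Y = \beta_0 + \beta_1 X_1 + \mu$:
$$\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_{1i}^2}$$
From $Y = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \mu$:
$$\operatorname{Var}(\hat\alpha_1) = \frac{\sigma^2}{\sum x_{1i}^2\,(1 - r_{x_1x_2}^2)}$$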
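Consequences of specification errors
                                         Omitting relevant variables    Including irrelevant variables
                                         (underfitting)                 (overfitting)
Unbiasedness of parameter estimators     biased                         unbiased
Consistency of parameter estimators      inconsistent                   consistent
Std. errors of parameter estimators      incorrect                      larger
Error variance $\sigma^2$                incorrectly estimated          correctly estimated
Validity of hypothesis tests             invalid                        reduced
Confidence intervals                     -                              wider; H0 accepted more easily
Collinearity                             -                              more likely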
Example 7-2: cost (Y) and output (X) data
X     Y ($)
1     193
2     226
3     240
4     244
5     257
6     260
7     274
8     297
9     350
10    420
[Figure: scatter plot of Y against X]
Dependent Variable: Y
Included observations: 10

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           141.7667       6.375322      22.23678       0.0000
X           63.47766       4.778607      13.28372       0.0000
X^2         -12.96154      0.985665      -13.15005      0.0000
X^3         0.939588       0.059106      15.89677       0.0000

R-squared             0.998339     Mean dependent var       276.1000
Adjusted R-squared    0.997509     S.D. dependent var       65.81363
S.E. of regression    3.284911     Akaike info criterion    5.505730
Sum squared resid     64.74382     Schwarz criterion        5.626764
Log likelihood        -23.52865    Hannan-Quinn criter.     5.372956
F-statistic           1202.220     Durbin-Watson stat       2.700212
Prob(F-statistic)     0.000000

Dependent Variable: Y
Included observations: 10

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           222.3833       23.48780      9.468037       0.0000
X           -8.025000      9.809494      -0.818085      0.4403
X^2         2.541667       0.869084      2.924534       0.0222

R-squared             0.928389     Mean dependent var       276.1000
Adjusted R-squared    0.907928     S.D. dependent var       65.81363
S.E. of regression    19.97004     Akaike info criterion    9.069668
Sum squared resid     2791.617     Schwarz criterion        9.160444
Log likelihood        -42.34834    Hannan-Quinn criter.     8.970088
F-statistic           45.37496     Durbin-Watson stat       1.038487
Prob(F-statistic)     0.000098

Dependent Variable: Y
Included observations: 10

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           146.4167       11.60839      12.61301       0.0001
X           57.51612       13.08395      4.395928       0.0070
X^2         -10.78016      4.531492      -2.378942      0.0632
X^3         0.641511       0.605529      1.059423       0.3379
X^4         0.013549       0.027374      0.494963       0.6416

R-squared             0.998417     Mean dependent var       276.1000
Adjusted R-squared    0.997150     S.D. dependent var       65.81363
S.E. of regression    3.513394     Akaike info criterion    5.657895
Sum squared resid     61.71970     Schwarz criterion        5.809188
Log likelihood        -23.28948    Hannan-Quinn criter.     5.491928
F-statistic           788.2650     Durbin-Watson stat       2.935953
Prob(F-statistic)     0.000000
Comparison
Dependent Variable: Y
Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           146.4167       11.60839      12.61301       0.0001
X           57.51612       13.08395      4.395928       0.0070
X^2         -10.78016      4.531492      -2.378942      0.0632
X^3         0.641511       0.605529      1.059423       0.3379
X^4         0.013549       0.027374      0.494963       0.6416

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           141.7667       6.375322      22.23678       0.0000
X           63.47766       4.778607      13.28372       0.0000
X^2         -12.96154      0.985665      -13.15005      0.0000
X^3         0.939588       0.059106      15.89677       0.0000
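The comparison of the quadratic, cubic and quartic cost functions can be reproduced approximately with a short script. This is a sketch rather than the lecture's own procedure: it assumes the ten (X, Y) pairs of Example 7-2, and the helper name `fit_polynomial` is hypothetical. It fits each polynomial by least squares and reports the residual sum of squares and the adjusted $R^2$.

```python
import numpy as np

# Example 7-2 data: output X and cost Y
X = np.arange(1, 11, dtype=float)
Y = np.array([193, 226, 240, 244, 257, 260, 274, 297, 350, 420], dtype=float)

def fit_polynomial(x, y, degree):
    """OLS fit of y on 1, x, ..., x**degree; returns (RSS, adjusted R^2)."""
    design = np.vander(x, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    n, k = design.shape  # k counts the intercept
    adj_r2 = 1.0 - (rss / (n - k)) / (tss / (n - 1))
    return rss, adj_r2

results = {d: fit_polynomial(X, Y, d) for d in (2, 3, 4)}
for d, (rss, adj_r2) in results.items():
    print(f"degree {d}: RSS = {rss:.3f}, adjusted R^2 = {adj_r2:.6f}")
```

Adding the quartic term lowers the residual sum of squares slightly, as any extra regressor must, but the adjusted $R^2$ penalises the lost degree of freedom, which is the sense in which the extra term is not needed.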
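Wrong functional form
Wrong functional form bias
If the true model is $Y = A X_1^{\beta_1} X_2^{\beta_2} e^{\mu}$
but the model estimated is $Y = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + v$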
Example 7-3: U.S. expenditure on imported goods (Y) and personal disposable income (X)
Data: the same 1968-1987 observations on Y and X as in Example 7-1.
Dependent Variable: Y
Method: Least Squares
Sample: 1968 1987
Included observations: 20

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -883.1170      116.1664      -7.602172      0.0000
X           0.647000       0.074536      8.680311       0.0000
@TREND      -23.19519      4.270415      -5.431601      0.0000

R-squared             0.977624     Mean dependent var       253.0800
Adjusted R-squared    0.974992     S.D. dependent var       85.78792
S.E. of regression    13.56647     Akaike info criterion    8.190561
Sum squared resid     3128.836     Schwarz criterion        8.339921
Log likelihood        -78.90561    Hannan-Quinn criter.     8.219718
F-statistic           371.3761     Durbin-Watson stat       1.363316
Prob(F-statistic)     0.000000

Dependent Variable: LOG(Y)
Method: Least Squares
Sample: 1968 1987
Included observations: 20

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -23.77933      4.447935      -5.346149      0.0001
LOG(X)      3.897526       0.603115      6.462328       0.0000
@TREND      -0.052622      0.016684      -3.154026      0.0058

R-squared             0.976280     Mean dependent var       5.480365
Adjusted R-squared    0.973490     S.D. dependent var       0.335428
S.E. of regression    0.054614     Akaike info criterion    -2.839568
Sum squared resid     0.050706     Schwarz criterion        -2.690208
Log likelihood        31.39568     Hannan-Quinn criter.     -2.810411
F-statistic           349.8538     Durbin-Watson stat       1.290975
Prob(F-statistic)     0.000000
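Measurement errors: effects
                              Error in the dependent variable    Error in the independent variable
OLS estimators                unbiased                           biased and inconsistent
Variance of OLS estimators    unbiased                           biased
Reason                        absorbed into the error term       error term correlated with the regressor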
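The contrast between the two kinds of measurement error can be illustrated by simulation. The sketch below is an assumption-laden example, not part of the lecture: the true slope `beta`, the unit variances, and the function names are chosen for illustration, and the measurement errors are assumed to be classical (independent of everything else). Under those assumptions the slope with a mismeasured regressor should shrink towards zero by the factor Var(X)/(Var(X) + Var(error)), which is 1/2 here.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of the simple regression of y on x (with intercept)."""
    xd = x - x.mean()
    return float(xd @ (y - y.mean()) / (xd @ xd))

def measurement_error_demo(n=20000, beta=2.0, seed=1):
    """Return (slope with error in Y, slope with error in X)."""
    rng = np.random.default_rng(seed)
    x_true = rng.normal(0.0, 1.0, n)
    y_true = 1.0 + beta * x_true + rng.normal(0.0, 1.0, n)
    y_obs = y_true + rng.normal(0.0, 1.0, n)  # measurement error in Y
    x_obs = x_true + rng.normal(0.0, 1.0, n)  # measurement error in X
    return ols_slope(x_true, y_obs), ols_slope(x_obs, y_true)

slope_y_err, slope_x_err = measurement_error_demo()
print(f"error in Y: slope = {slope_y_err:.3f}")
print(f"error in X: slope = {slope_x_err:.3f}")
```

The first slope stays near the assumed true value because the error in Y is absorbed into the disturbance; the second is pulled towards zero because the observed regressor is correlated with the composite error.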
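3. Testing for specification errors
Testing whether the model contains irrelevant variables
Testing whether relevant variables have been omitted
Testing whether the functional form is misspecified
Testing for irrelevant variables
Basic idea
  If an irrelevant variable has been included in the model, the true value of its coefficient should be zero
  Test the significance of the coefficients of the suspect variables
t test: tests whether a single variable belongs in the model
F test: tests several variables jointly (restricted regression)
Use stepwise regression with caution
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \mu$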
Example 7-4: testing for irrelevant variables
Y: total grain output; X1: fertilizer; X2: sown area; X3: disaster-affected area; X4: farm machinery; X5: labor
Year    Y        X1        X2        X3         X4         X5
1983    38728    1659.8    114047    16209.3    18022      31645.1
1984    40731    1739.8    112884    15264      19497      31685
1985    37911    1775.8    108845    22705.3    20913      30351.5
1986    39151    1930.6    110933    23656      22950      30467
1987    40298    1999.3    111268    20393.7    24836      30870
1988    39408    2141.5    110123    23944.7    26575      31455.7
1989    40755    2357.1    112205    24448.7    28067      32440.5
1990    44624    2590.3    113466    17819.3    28708      33330.4
1991    43529    2805.1    112314    27814      29389      34186.3
1992    44266    2930.2    110560    25894.7    30308      34037
1993    45649    3151.9    110509    23133      31817      33258.2
1994    44510    3317.9    109544    31383      33802      32690.3
1995    46662    3593.7    110060    22267      36118      32334.5
1996    50454    3827.9    112548    21234      38546.9    32260.4
Parameter estimation and tests
$$\ln Y = \beta_0 + \beta_1 \ln\frac{X_1}{X_2 - X_3} + \beta_2 \ln(X_2 - X_3) + \beta_3 \ln\frac{X_4}{X_2 - X_3} + \beta_4 \ln\frac{X_5}{X_2 - X_3} + \mu_t$$

Dependent Variable: LOG(Y)
Method: Least Squares
Included observations: 14

Variable            Coefficient    Std. Error    t-Statistic    Prob.
C                   3.912759       2.060072      1.899331       0.0900
LOG(X1/(X2-X3))     0.406146       0.079776      5.091046       0.0007
LOG(X2-X3)          0.704222       0.183474      3.838276       0.0040
LOG(X4/(X2-X3))     -0.104135      0.093783      -1.110386      0.2956
LOG(X5/(X2-X3))     -0.052451      0.163558      -0.320687      0.7558

R-squared             0.975216     Mean dependent var       10.65680
Adjusted R-squared    0.964201     S.D. dependent var       0.083455
S.E. of regression    0.015790     Akaike info criterion    -5.186402
Sum squared resid     0.002244     Schwarz criterion        -4.958167
Log likelihood        41.30481     F-statistic              88.53554
Durbin-Watson stat    2.481731     Prob(F-statistic)        0.000000
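Omitted relevant variables and functional form misspecification: tests
Criteria for judging a model
1. $R^2$, adjusted $R^2$, the AIC criterion, the SC criterion
2. Signs of the estimated coefficients compared with expectations
3. The Durbin-Watson d statistic
4. Forecast errors
Diagnostic methods
Test 1: residual plots
Test 2: the regression specification error test (RESET)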
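Forecast errors
Root mean squared error (RMSE): $\sqrt{\frac{1}{h}\sum_{t=T+1}^{T+h}(\hat{y}_t - y_t)^2}$
Mean absolute error (MAE): $\frac{1}{h}\sum_{t=T+1}^{T+h}\left|\hat{y}_t - y_t\right|$
Mean absolute percentage error (MAPE): $\frac{1}{h}\sum_{t=T+1}^{T+h}\left|\frac{\hat{y}_t - y_t}{y_t}\right|$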
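Theil inequality coefficient
$$U = \frac{\sqrt{\frac{1}{n}\sum_{t=T+1}^{T+n}(\hat{y}_t - y_t)^2}}{\sqrt{\frac{1}{n}\sum_{t=T+1}^{T+n}\hat{y}_t^2} + \sqrt{\frac{1}{n}\sum_{t=T+1}^{T+n}y_t^2}}$$
where $n$ is the number of forecast periods (written $h$ in the forecast error measures).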
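The four forecast accuracy measures can be computed together. The function below is a minimal sketch following the formulas as written on these slides; the name `forecast_errors` and the dictionary keys are hypothetical, and the MAPE is returned as a fraction rather than a percentage.

```python
import numpy as np

def forecast_errors(actual, forecast):
    """RMSE, MAE, MAPE (as a fraction) and Theil's inequality coefficient."""
    y = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    err = f - y
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    mape = float(np.mean(np.abs(err / y)))  # requires all actual values to be nonzero
    theil_u = rmse / float(np.sqrt(np.mean(f ** 2)) + np.sqrt(np.mean(y ** 2)))
    return {"rmse": rmse, "mae": mae, "mape": mape, "theil_u": theil_u}

# Hypothetical two-period example
measures = forecast_errors(actual=[2.0, 4.0], forecast=[1.0, 5.0])
print(measures)
```

Theil's coefficient lies between 0 and 1, with 0 indicating a perfect fit. Software output for the "Mean Abs. Percent Error" is typically shown in percent, so the fraction returned here would need to be multiplied by 100 to compare.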
Forecast errors
Model $Y = \beta_0 + \beta_1 X + \beta_2 X^2$:
Forecast: YF
Actual: Y
Forecast sample: 1 10
Included observations: 10
Root Mean Squared Error         16.70813
Mean Absolute Error             15.25333
Mean Abs. Percent Error         5.784881
Theil Inequality Coefficient    0.029538
  Bias Proportion               0.000000
  Variance Proportion           0.018574
  Covariance Proportion         0.981426

Model $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3$:
Forecast: YF
Actual: Y
Forecast sample: 1 10
Included observations: 10
Root Mean Squared Error         2.544481
Mean Absolute Error             1.936410
Mean Abs. Percent Error         0.733252
Theil Inequality Coefficient    0.004495
  Bias Proportion               0.000000
  Variance Proportion           0.000416
  Covariance Proportion         0.999584
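Test 1: residual plots
(a) Trend: the specification may have omitted a variable that rises steadily over time
(b) Cyclical pattern: the specification may have omitted a variable that varies cyclically over time
In a simple regression, the true model is a power function, but a linear function was fitted instead.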
Example 7-3: comparison of residual plots
[Figure: two plots of LOG(Y) residuals, 1968-1987]
Left panel: LOG(Y) = -23.78 + 3.9*LOG(X) - 0.053*@TREND
Right panel: LOG(Y) = -9.87 + 2.01*LOG(X)
Linear or log-linear model: how to choose?
MWD test (1983)
MacKinnon, James G., White, Halbert and Davidson, Russell, 1983
H0: Y is a linear function of X
H1: ln Y is a linear function of X or ln X
Example 7-3: U.S. expenditure on imported goods (Y) and personal disposable income (X)
Data: the same 1968-1987 observations on Y and X as in Example 7-1.
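MWD test: steps
1. Estimate the linear model and obtain the fitted values $\hat{Y}_i$
2. Estimate the log-linear model and obtain the fitted values $\widehat{\ln Y}_i$
3. Compute the log difference $Z_{1i} = \ln \hat{Y}_i - \widehat{\ln Y}_i$
   OLS: regress $Y$ on $X$ and $Z_1$; if the coefficient on $Z_1$ is statistically significant, reject $H_0$
4. Compute the difference $Z_{2i} = \exp(\widehat{\ln Y}_i) - \hat{Y}_i$
   OLS: regress $\ln Y$ on $X$ or $\ln X$, and $Z_2$; if the coefficient on $Z_2$ is statistically significant, reject $H_1$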
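The MWD steps can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the lecture's own code: it uses the 1968-1987 import data of Example 7-1, takes the log-linear alternative to be ln Y on ln X, and the helper name `ols` is hypothetical. The t-statistics on `z1` and `z2` are the quantities the decision rule refers to.

```python
import numpy as np

# Example 7-1 / 7-3 data, 1968-1987
Y = np.array([135.7, 144.6, 150.9, 166.2, 190.7, 218.2, 211.8, 187.9, 229.9, 259.4,
              274.1, 277.9, 253.6, 258.7, 249.5, 282.2, 351.1, 367.9, 412.3, 439.0])
X = np.array([1551.3, 1599.8, 1668.1, 1728.4, 1797.4, 1916.3, 1896.9, 1931.7, 2001.0, 2066.6,
              2167.4, 2212.6, 2214.3, 2248.6, 2261.5, 2331.9, 2469.8, 2542.8, 2640.9, 2686.3])

def ols(y, *regressors):
    """OLS with an intercept; returns (coefficients, t-statistics, fitted values)."""
    design = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    resid = y - fitted
    n, k = design.shape
    sigma2 = resid @ resid / (n - k)
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef, coef / np.sqrt(np.diag(cov)), fitted

# Steps 1 and 2: fitted values from the linear and the log-linear model
b_lin, _, y_hat = ols(Y, X)
b_log, _, lny_hat = ols(np.log(Y), np.log(X))

# Step 3: add Z1 to the linear model (tests H0)
z1 = np.log(y_hat) - lny_hat
_, t_lin, _ = ols(Y, X, z1)

# Step 4: add Z2 to the log-linear model (tests H1)
z2 = np.exp(lny_hat) - y_hat
_, t_log, _ = ols(np.log(Y), np.log(X), z2)

print(f"t-statistic on Z1 (linear model):     {t_lin[2]:.3f}")
print(f"t-statistic on Z2 (log-linear model): {t_log[2]:.3f}")
```

Step 3 takes the logarithm of the linear model's fitted values, so it only works when every fitted value is positive, which holds for this sample.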
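MWD test: results and interpretation
Dependent Variable: Y
Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -302.9265      31.90918      -9.493397      0.0000
X           0.264753       0.014996      17.65510       0.0000
Z1          -324.5915      126.9500      -2.556845      0.0204
The coefficient on Z1 is significant at the 5% level, so H0 (the linear form) is rejected.

Dependent Variable: LOG(Y)
Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -10.34365      0.771390      -13.40910      0.0000
LOG(X)      2.072771       0.101114      20.49926       0.0000
Z2          -0.003392      0.002392      -1.418477      0.1741
The coefficient on Z2 is not significant at the 5% level, so H1 (the log-linear form) is not rejected.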
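Regression specification error test: RESET
RESET (regression error specification test)
Proposed by Ramsey in 1969
Basic idea:
If it is known in advance which variable has been omitted, it suffices to add that variable to the model, estimate, and test whether its coefficient is significantly different from zero;
If the omitted variable is unknown, find a proxy variable Z and carry out the same test
In the RESET test, powers of the fitted values $\hat{Y}$ of the dependent variable from the specified model serve as the proxy variable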
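RESET test: omitted variables
1. First estimate by OLS: $Y = \beta_0 + \beta_1 X + \mu$, and obtain the fitted values $\hat{Y}$ and the residuals $e$
2. Use the plot of the residuals $e$ against $\hat{Y}$ to decide which powers of $\hat{Y}$ to add, for example:
$$Y = \beta_0 + \beta_1 X + \beta_2 \hat{Y}^2 + \beta_3 \hat{Y}^3 + \mu$$
$$F = \frac{(RSS_R - RSS_U)/(k_U - k_R)}{RSS_U/(n - k_U - 1)} \sim F(k_U - k_R,\ n - k_U - 1)$$
Use the F test or t test to decide whether the proxy variables should be added
H0: the coefficients of the added variables are zero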
Example 7-5: China's merchandise imports and gross domestic product
Year    GDP        Imports M     Year    GDP        Imports M
1978    3624.1     108.9         1990    18547.9    637.9
1979    4038.2     156.7         1991    21617.8    533.5
1980    4517.8     200.2         1992    26638.1    805.9
1981    4862.4     220.2         1993    34634.4    1039.6
1982    5294.7     192.9         1994    46759.4    1156.1
1983    5934.5     213.9         1995    58478.1    1320.8
1984    7171       274.1         1996    67884.6    1388.3
1985    8964.4     422.5         1997    74462.6    1423.7
1986    10202.2    429.1         1998    78345.2    1402.4
1987    11962.5    432.1         1999    82067.4    1657
1988    14928.3    552.7         2000    89442.2    2250.9
1989    16909.2    591.4         2001    95933.3    2436.1
Dependent Variable: M
Included observations: 24

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           153.3591       46.64491      3.287798       0.0034
GDP         0.020381       0.001026      19.85915       0.0000

R-squared             0.947164     Mean dependent var       826.9542
Adjusted R-squared    0.944763     S.D. dependent var       667.4365
S.E. of regression    156.8649     Akaike info criterion    13.02830
Sum squared resid     541345.2     Schwarz criterion        13.12647
Log likelihood        -154.3396    F-statistic              394.3860
Durbin-Watson stat    0.691459     Prob(F-statistic)        0.000000
$\alpha = 0.05$: DL = 1.273, DU = 1.446
Residuals and fitted values: plot
[Figure: scatter plot of the residuals RE against the fitted values MF]
Ramsey RESET Test:
F-statistic             20.71742    Probability    0.000013
Log likelihood ratio    26.93387    Probability    0.000001

Test Equation:
Dependent Variable: M
Included observations: 24

Variable     Coefficient    Std. Error    t-Statistic    Prob.
C            -0.704383      47.51244      -0.014825      0.9883
GDP          0.071175       0.009055      7.860167       0.0000
FITTED^2     -0.002794      0.000455      -6.136033      0.0000
FITTED^3     8.57E-07       1.35E-07      6.369252       0.0000

R-squared             0.982799     Mean dependent var       826.9542
Adjusted R-squared    0.980219     S.D. dependent var       667.4365
S.E. of regression    93.87065     Akaike info criterion    12.07272
Sum squared resid     176234.0     Schwarz criterion        12.26907
Log likelihood        -140.8727    F-statistic              380.9182
Durbin-Watson stat    1.445272     Prob(F-statistic)        0.000000
$\alpha = 0.05$: DL = 1.101, DU = 1.656
Ramsey RESET Test:
F-statistic             19.14311    Probability    0.000006
Log likelihood ratio    33.40626    Probability    0.000000

Test Equation:
Dependent Variable: M
Included observations: 24

Variable     Coefficient    Std. Error    t-Statistic    Prob.
C            24.82371       43.87887      0.565733       0.5782
GDP          0.022645       0.021595      1.048586       0.3075
FITTED^2     0.001524       0.001826      0.834294       0.4145
FITTED^3     -2.03E-06      1.19E-06      -1.695425      0.1063
FITTED^4     6.34E-10       2.61E-10      2.425166       0.0254

R-squared             0.986865     Mean dependent var       826.9542
Adjusted R-squared    0.984100     S.D. dependent var       667.4365
S.E. of regression    84.16027     Akaike info criterion    11.88637
Sum squared resid     134576.1     Schwarz criterion        12.13180
Log likelihood        -137.6365    F-statistic              356.8875
Durbin-Watson stat    1.973415     Prob(F-statistic)        0.000000
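The RESET F-statistics reported for Example 7-5 can be approximated from the tabulated data. The sketch below is an assumption-based illustration, not the software's implementation: the function names `rss_and_fitted` and `reset_f` are hypothetical, and the regressor and fitted values are standardised before powers are taken. Standardising does not change the column space of the test regression (powers of a shifted and rescaled fitted value are combinations of the constant, GDP and the original powers), so the F-statistic is unaffected while the arithmetic is better conditioned.

```python
import numpy as np

# Example 7-5 data, 1978-2001
GDP = np.array([3624.1, 4038.2, 4517.8, 4862.4, 5294.7, 5934.5, 7171.0, 8964.4,
                10202.2, 11962.5, 14928.3, 16909.2, 18547.9, 21617.8, 26638.1, 34634.4,
                46759.4, 58478.1, 67884.6, 74462.6, 78345.2, 82067.4, 89442.2, 95933.3])
M = np.array([108.9, 156.7, 200.2, 220.2, 192.9, 213.9, 274.1, 422.5,
              429.1, 432.1, 552.7, 591.4, 637.9, 533.5, 805.9, 1039.6,
              1156.1, 1320.8, 1388.3, 1423.7, 1402.4, 1657.0, 2250.9, 2436.1])

def rss_and_fitted(y, design):
    """Least-squares fit; returns (residual sum of squares, fitted values)."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    return float(((y - fitted) ** 2).sum()), fitted

def reset_f(y, x, max_power=3):
    """RESET F-statistic adding fitted**2 ... fitted**max_power to y ~ 1 + x."""
    n = len(y)
    xs = (x - x.mean()) / x.std()            # rescaling x leaves the fit unchanged
    restricted = np.column_stack([np.ones(n), xs])
    rss_r, fitted = rss_and_fitted(y, restricted)
    z = (fitted - fitted.mean()) / fitted.std()
    powers = [z ** p for p in range(2, max_power + 1)]
    unrestricted = np.column_stack([restricted, *powers])
    rss_u, _ = rss_and_fitted(y, unrestricted)
    q = len(powers)                           # number of added regressors
    df = n - unrestricted.shape[1]            # residual degrees of freedom
    return ((rss_r - rss_u) / q) / (rss_u / df)

print(f"RESET F, powers 2-3: {reset_f(M, GDP, 3):.3f}")
print(f"RESET F, powers 2-4: {reset_f(M, GDP, 4):.3f}")
```

Both statistics follow directly from the F formula: with two added terms the numerator has 2 degrees of freedom and the denominator 24 - 4 = 20.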
RESET test: a diagnostic tool
Simple to apply: no alternative model needs to be specified
But: it does not help in choosing the correct model
Econometrics
Lecture 8: Multicollinearity
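The multiple linear regression model: assumptions
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \mu_i$
I. Explanatory variables
1. Nonstochastic (uncorrelated with the disturbance term)
2. No multicollinearity: the explanatory variables are not perfectly linearly related
II. Random error term
3. Zero mean: $E(\mu_i) = 0$
4. Homoskedasticity: $\operatorname{var}(\mu_i) = E(\mu_i^2) = \sigma^2$
5. No serial correlation: $\operatorname{cov}(\mu_i, \mu_j) = E(\mu_i \mu_j) = 0$
6. Normality: $\mu_i \sim N(0, \sigma^2)$
III. Model
7. Linear in the parameters
8. Correctly specified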
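Classical linear regression model (CLRM): violations
1. Multicollinearity: a problem with the explanatory variables
2. Heteroskedasticity: a problem with the error term
3. Serial correlation: a problem with the error term
Lecture 8: Multicollinearity
1. Multicollinearity: meaning
2. Multicollinearity: consequences
3. Multicollinearity: diagnosis
4. Multicollinearity: remedies
1. Collinearity
Perfect collinearity
Near collinearity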
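Multicollinearity: concept
Model
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \mu_i$, $i = 1, 2, \dots, n$
One of the basic assumptions: the X's are independent of one another
Multicollinearity
Two or more explanatory variables are linearly related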
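Collinearity (multicollinearity)
Perfect multicollinearity
There exist $c_i$, not all zero, such that
$c_1 X_{1i} + c_2 X_{2i} + \dots + c_k X_{ki} = 0$, $i = 1, 2, \dots, n$
Approximate (near) multicollinearity
$c_1 X_{1i} + c_2 X_{2i} + \dots + c_k X_{ki} + v_i = 0$, $i = 1, 2, \dots, n$
where the $c_i$ are not all zero and $v_i$ is a random variable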
Example 8-1: demand for a commodity
Y: quantity; X1: price (US$); X2: weekly income (US$); X3: weekly earnings (US$)
Y     X1    X2     X3
49    1     298    297.5
45    2     296    294.9
44    3     294    293.5
39    4     292    292.8
38    5     290    290.2
37    6     288    289.7
34    7     286    285.8
33    8     284    284.6
30    9     282    281.1
29    10    280    278.8
Correlation matrix
      x1          x2         x3
x1    1
x2    -1          1
x3    -0.98844    0.98844    1
Model 1
1) $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \mu_i$
Model 1 cannot be estimated:
X1 and X2 are perfectly collinear, so the parameter estimators do not exist
Drop X2 and run the regression

Dependent Variable: Y
Method: Least Squares
Sample: 1 10
Included observations: 10

Variable    Coefficient    Std. Error    t-Statistic    Prob.
X1          -2.157576      0.120300      -17.93502      0.0000
C           49.66667       0.746439      66.53811       0.0000

R-squared             0.975733     Mean dependent var       37.80000
Adjusted R-squared    0.972700     S.D. dependent var       6.613118
S.E. of regression    1.092675     Akaike info criterion    3.191992
Sum squared resid     9.551515     Schwarz criterion        3.252509
Log likelihood        -13.95996    F-statistic              321.6650
Durbin-Watson stat    2.051315     Prob(F-statistic)        0.000000
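Why Model 1 cannot be estimated, while a model with X3 in place of X2 can, is visible directly in the data. The following sketch uses the Example 8-1 observations; the variable names are illustrative, and the variance inflation factor $1/(1 - r^2)$ for the X1-X3 pair is computed here as an added illustration of how severe the near collinearity is.

```python
import numpy as np

# Example 8-1 data
X1 = np.arange(1, 11, dtype=float)                      # price
X2 = np.array([298, 296, 294, 292, 290, 288, 286, 284, 282, 280], dtype=float)  # income
X3 = np.array([297.5, 294.9, 293.5, 292.8, 290.2,
               289.7, 285.8, 284.6, 281.1, 278.8])      # earnings

# Perfect collinearity: the design matrix [1, X1, X2] loses a rank,
# so X'X is singular and the OLS estimator does not exist.
design_12 = np.column_stack([np.ones(10), X1, X2])
rank_12 = int(np.linalg.matrix_rank(design_12))

# Near collinearity: [1, X1, X3] has full rank, but X1 and X3 are highly correlated.
design_13 = np.column_stack([np.ones(10), X1, X3])
rank_13 = int(np.linalg.matrix_rank(design_13))
r13 = float(np.corrcoef(X1, X3)[0, 1])
vif_13 = 1.0 / (1.0 - r13 ** 2)

print(f"rank of [1, X1, X2]: {rank_12} (3 columns)")
print(f"rank of [1, X1, X3]: {rank_13} (3 columns)")
print(f"corr(X1, X3) = {r13:.5f}, implied variance inflation = {vif_13:.1f}")
```

In this sample X2 equals 300 - 2*X1 exactly, which is the linear dependence behind the correlation of -1 in the correlation matrix.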
Model 2
2) $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_3 X_{3i} + \mu_i$
Model 2 can be estimated
Imperfect multicollinearity
Although price and earnings are not perfectly linearly related, the two variables are highly dependent on each other

Dependent Variable: Y (demand)
Method: Least Squares
Sample: 1 10

Variable         Coefficient    Std. Error    t-Statistic    Prob.
X1 (price)       -2.797475      0.812185      -3.444383      0.0108
X3 (earnings)    -0.319080      0.400306      -0.797089      0.4516
C                145.3650       120.0622      1.210747       0.2653

R-squared             0.977752     Mean dependent var       37.80000
Adjusted R-squared    0.971396     S.D. dependent var       6.613118
S.E. of regression    1.118463     Akaike info criterion    3.305113
Sum squared resid     8.756717     Schwarz criterion        3.395889
Log likelihood        -13.52557    F-statistic              153.8192
Durbin-Watson stat    2.560899     Prob(F-statistic)        0.000002
Comparison of the two regressions
Y on X1:
Variable         Coefficient    Std. Error    t-Statistic    Prob.
X1 (price)       -2.157576      0.120300      -17.93502      0.0000
C                49.66667       0.746439      66.53811       0.0000
R-squared 0.975733; Adjusted R-squared 0.972700; S.E. of regression 1.092675

Y on X1 and X3:
Variable         Coefficient    Std. Error    t-Statistic    Prob.
X1 (price)       -2.797475      0.812185      -3.444383      0.0108
X3 (earnings)    -0.319080      0.400306      -0.797089      0.4516
C                145.3650       120.0622      1.210747       0.2653
R-squared 0.977752; Adjusted R-squared 0.971396; S.E. of regression 1.118463

Standard errors become larger
A coefficient has the wrong sign
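Near multicollinearity: properties
The OLS estimators are BLUE
It is a question of degree, not of presence or absence
It arises under the assumption that the explanatory variables are nonstochastic
It is sample specific
It is not a feature of the population
Not needed: hypothesis tests for collinearity in the population
Near multicollinearity: practical causes
Common trends shared by economic variables
Time series samples: simple linear models
Cross-section data
Inclusion of lagged variables
  For example, consumption = f(current income, previous-period income)
Limitations of the sample data
2. Near multicollinearity: practical consequences
Variances and standard errors of the OLS estimators become larger:
  Wider confidence intervals, lower estimation precision, lower predictive power
Significance tests on the variables break down
  $R^2$ is high, but not all t values are significant
It is difficult to measure the contribution of each explanatory variable to the explained sum of squares (ESS) or to $R^2$
The economic meaning of the parameter estimates becomes unreasonable
  Regression coefficients may have the wrong sign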
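Near collinearity: variance of the OLS estimators
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \mu$
$r$: the correlation coefficient between $X_1$ and $X_2$
$$\operatorname{var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_{1i}^2}\cdot\frac{1}{1 - r^2}$$
$$\operatorname{var}(\hat\beta_1) = \begin{cases} \dfrac{\sigma^2}{\sum x_{1i}^2}, & r^2 = 0 \quad \text{(no linear correlation)} \\[2ex] \dfrac{\sigma^2}{\sum x_{1i}^2}\cdot\dfrac{1}{1 - r^2} > \dfrac{\sigma^2}{\sum x_{1i}^2}, & 0 < r^2 < 1 \quad \text{(partial collinearity)} \end{cases}$$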
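Variance inflation factor (VIF)
$$VIF = \frac{1}{1 - r^2}$$
r^2      VIF
0        1
0.5      2
0.8      5
0.9      10
0.95     20
0.96     25
0.97     33.33
0.98     50
0.99     100
0.999    1000
Multicollinearity increases the variance of the parameter estimates:
$$\operatorname{var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_{1i}^2}\cdot VIF$$
When near multicollinearity is present:
The variances and standard errors of the parameter estimates become larger
The t tests on the parameters are more likely to fail, so H0 is accepted more easily
Important X variables may be dropped from the model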
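The VIF table follows directly from the formula. The helper below is a minimal sketch (the function name `vif` is hypothetical) that reproduces the tabulated values and makes the rapid growth near $r^2 = 1$ easy to explore.

```python
def vif(r_squared):
    """Variance inflation factor 1 / (1 - r^2) for 0 <= r^2 < 1."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)

for r2 in (0.0, 0.5, 0.8, 0.9, 0.95, 0.96, 0.97, 0.98, 0.99, 0.999):
    print(f"r^2 = {r2:<5}  VIF = {vif(r2):.2f}")
```

Because the standard error scales with the square root of the variance, a VIF of 100 corresponds to a standard error ten times larger than with uncorrelated regressors.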
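Note
Multicollinearity is only a feature of the sample
Unless collinearity is perfect, it does not violate the CLRM assumptions; the OLS estimators are still BLUE
Multicollinearity is not necessarily a bad thing
What is the purpose: coefficient estimation or prediction? Improving the goodness of fit?
3. Multicollinearity: diagnosis
Measuring the degree of multicollinearity in the sample
Whether it exists
  Overall statistical assessment
  Simple correlation coefficients
Where it exists
  Stepwise regression
  Coefficient of determination method
Judging whether multicollinearity is present in the sample
Simple correlation coefficient method: models with two explanatory variables
  Compute the simple correlation coefficient r between X1 and X2
  If |r| is close to 1, multicollinearity is strong
Overall statistical assessment: models with several explanatory variables
  Under OLS, $R^2$ and the F value are significant, but not all t tests are significant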
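Diagnosing the extent of multicollinearity 1: coefficient of determination method
Auxiliary regression
$X_{ji} = \alpha_1 X_{1i} + \alpha_2 X_{2i} + \dots + \alpha_L X_{Li}$
If the coefficient of determination $R_{j\cdot}^2$ is large, $X_j$ is collinear with the other X's
$$F_j = \frac{R_{j\cdot}^2/(k - 2)}{(1 - R_{j\cdot}^2)/(n - k + 1)} \sim F(k - 2,\ n - k + 1)$$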
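The auxiliary-regression diagnostic can be sketched as a loop over the regressors. The code below is illustrative only: the function name `auxiliary_regressions` and the simulated data are assumptions, each auxiliary regression is fitted with a constant, and $k$ is taken to count the intercept of the original model so that the degrees of freedom match $F(k-2,\ n-k+1)$.

```python
import numpy as np

def auxiliary_regressions(X):
    """Regress each column of X (n x m, no constant) on the other columns
    plus a constant; return a list of (R_j^2, F_j), one pair per column."""
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    k = m + 1  # assumed convention: k counts the intercept of the original model
    results = []
    for j in range(m):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        r2 = 1.0 - (resid @ resid) / ((target - target.mean()) ** 2).sum()
        f_stat = (r2 / (k - 2)) / ((1.0 - r2) / (n - k + 1))
        results.append((float(r2), float(f_stat)))
    return results

# Hypothetical data: X3 is almost an exact linear combination of X1 and X2
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + 0.1 * rng.normal(size=200)
aux = auxiliary_regressions(np.column_stack([x1, x2, x3]))
for j, (r2, f_stat) in enumerate(aux, start=1):
    print(f"X{j}: R_j^2 = {r2:.4f}, F_j = {f_stat:.1f}")
```

In this constructed example all three auxiliary $R_j^2$ values are high, because the near-exact relation among the regressors lets any one of them be predicted from the other two.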
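An equivalent test
Exclude one explanatory variable $X_j$ from the model and estimate the model;
if the goodness of fit is very close to that obtained with $X_j$ included, $X_j$ is collinear with the other explanatory variables
Diagnosing the extent of multicollinearity 2: stepwise regression
Introduce the explanatory variables one at a time and run OLS;
if the goodness of fit changes significantly, the newly introduced variable is an independent explanatory variable;
otherwise, the newly introduced variable is collinear with the other variables
4. Dealing with collinearity
I. Drop unimportant explanatory variables from the model
II. Obtain additional data or a new sample
III. Rethink the model
IV. Transform the variables
V. Other remedies
Excluding what causes the collinear…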