
The Multiple Regression Model

Two Explanatory Variables

yt = β1 + β2 xt2 + β3 xt3 + εt

∂yt/∂xt2 = β2        ∂yt/∂xt3 = β3

The partial derivatives show how xt2 and xt3 affect yt separately.

But least squares estimation of β2 now depends upon both xt2 and xt3.

Correlated Variables

yt = output,  xt2 = capital,  xt3 = labor

Always 5 workers per machine.

If number of workers per machine is never varied, it becomes impossible to tell if the machines or the workers are responsible for changes in output.

yt = β1 + β2 xt2 + β3 xt3 + εt

The General Model

yt = β1 + β2 xt2 + β3 xt3 + . . . + βK xtK + εt

The parameter β1 is the intercept (constant) term.

The variable attached to β1 is xt1 = 1.

Usually, the number of explanatory variables is said to be K − 1 (ignoring xt1 = 1), while the

number of parameters is K. (Namely: β1, . . . , βK.)

Statistical Properties of εt

1. E(εt) = 0

2. var(εt) = σ²

3. cov(εt, εs) = 0 for t ≠ s

4. εt ~ N(0, σ²)

Statistical Properties of yt

1. E(yt) = β1 + β2 xt2 + . . . + βK xtK

2. var(yt) = var(εt) = σ²

3. cov(yt, ys) = cov(εt, εs) = 0 for t ≠ s

4. yt ~ N(β1 + β2 xt2 + . . . + βK xtK, σ²)

Assumptions

1. yt = β1 + β2 xt2 + . . . + βK xtK + εt

2. E(yt) = β1 + β2 xt2 + . . . + βK xtK

3. var(yt) = var(εt) = σ²

4. cov(yt, ys) = cov(εt, εs) = 0 for t ≠ s

5. The values of xtk are not random.

6. yt ~ N(β1 + β2 xt2 + . . . + βK xtK, σ²)

Least Squares Estimation

yt = β1 + β2 xt2 + β3 xt3 + εt

S(β1, β2, β3) = Σ (yt − β1 − β2 xt2 − β3 xt3)²

Define the deviations from the sample means:

yt* = yt − ȳ        xt2* = xt2 − x̄2        xt3* = xt3 − x̄3

(all sums run over t = 1, . . . , T)

b3 = [ Σ yt* xt3* · Σ xt2*² − Σ yt* xt2* · Σ xt2* xt3* ] / [ Σ xt2*² · Σ xt3*² − (Σ xt2* xt3*)² ]

b2 = [ Σ yt* xt2* · Σ xt3*² − Σ yt* xt3* · Σ xt2* xt3* ] / [ Σ xt2*² · Σ xt3*² − (Σ xt2* xt3*)² ]

Least Squares Estimators

b1 = ȳ − b2 x̄2 − b3 x̄3
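As a quick check of these formulas, here is a minimal Python sketch on simulated data (the data, seed, and variable names are my own, not the slides' example); the deviation-form estimates should match a direct matrix solution:

```python
import numpy as np

# Simulated data for y_t = beta1 + beta2*x_t2 + beta3*x_t3 + e_t (illustrative only)
rng = np.random.default_rng(0)
T = 52
x2 = rng.normal(10, 2, T)
x3 = rng.normal(5, 1, T)
y = 100 - 6.0 * x2 + 3.0 * x3 + rng.normal(0, 6, T)

# Deviations from the sample means
ys, x2s, x3s = y - y.mean(), x2 - x2.mean(), x3 - x3.mean()

den = (x2s**2).sum() * (x3s**2).sum() - (x2s * x3s).sum()**2
b2 = ((ys * x2s).sum() * (x3s**2).sum() - (ys * x3s).sum() * (x2s * x3s).sum()) / den
b3 = ((ys * x3s).sum() * (x2s**2).sum() - (ys * x2s).sum() * (x2s * x3s).sum()) / den
b1 = y.mean() - b2 * x2.mean() - b3 * x3.mean()

# Same estimates from the matrix form b = (X'X)^(-1) X'y
X = np.column_stack([np.ones(T), x2, x3])
b_matrix = np.linalg.lstsq(X, y, rcond=None)[0]
print(b1, b2, b3)
print(b_matrix)  # should agree with (b1, b2, b3)
```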

Dangers of Extrapolation

Statistical models generally are good only within the relevant range. This means that extending them to extreme data values outside the range of the original data often leads to poor and sometimes ridiculous results.

If height is normally distributed and the normal ranges from minus infinity to plus infinity, pity the man minus three feet tall.

Interpretation of Coefficients

bj represents an estimate of the mean change in y responding to a one-unit change in xj when all other independent variables are held constant. Hence, bj is called the partial coefficient.

Note that regression analysis cannot be interpreted as a procedure for establishing a cause-and-effect relationship between variables.

Universal Set

[Figure: Venn diagram with regions x2 \ x3, x3 \ x2, and their overlap B inside a universal set, illustrating the shared variation between x2 and x3.]

Error Variance Estimation

σ̂² = Σ ε̂t² / (T − K)

Unbiased estimator of the error variance: E(σ̂²) = σ²

Transform to a chi-square distribution: (T − K) σ̂² / σ²  ~  χ²(T−K)
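A minimal sketch of this estimator on simulated data of my own (not the slides' example):

```python
import numpy as np

# Simulated data: T observations, K = 3 estimated coefficients (illustrative only)
rng = np.random.default_rng(0)
T = 52
X = np.column_stack([np.ones(T), rng.normal(10, 2, T), rng.normal(5, 1, T)])
y = X @ np.array([100.0, -6.0, 3.0]) + rng.normal(0, 6, T)

b = np.linalg.lstsq(X, y, rcond=None)[0]
e_hat = y - X @ b                          # least squares residuals
K = X.shape[1]
sigma2_hat = (e_hat**2).sum() / (T - K)    # unbiased estimator of sigma^2
print(sigma2_hat)
```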

Gauss-Markov Theorem

Under the first five assumptions of the multiple regression model, the ordinary least squares estimators have the smallest variance of all linear and unbiased estimators. This means that the least squares estimators are the Best Linear Unbiased Estimators (BLUE).

Variances

yt = β1 + β2 xt2 + β3 xt3 + εt

var(b2) = σ² / [ (1 − r23²) Σ (xt2 − x̄2)² ]

var(b3) = σ² / [ (1 − r23²) Σ (xt3 − x̄3)² ]

where r23 = Σ (xt2 − x̄2)(xt3 − x̄3) / √[ Σ (xt2 − x̄2)² Σ (xt3 − x̄3)² ]

When r23 = 0 these reduce to the simple regression formulas.

Variance Decomposition

The variance of an estimator is smaller when:

1. The error variance, σ², is smaller: σ² → 0.

2. The sample size, T, is larger: Σ (xt2 − x̄2)², summed over t = 1, . . . , T, grows as T grows.

3. The variable values are more spread out: Σ (xt2 − x̄2)² is larger.

4. The correlation is close to zero: r23 → 0.

Covariances

yt = β1 + β2 xt2 + β3 xt3 + εt

cov(b2, b3) = − r23 σ² / [ (1 − r23²) √Σ (xt2 − x̄2)² · √Σ (xt3 − x̄3)² ]

where r23 = Σ (xt2 − x̄2)(xt3 − x̄3) / √[ Σ (xt2 − x̄2)² Σ (xt3 − x̄3)² ]
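The variance and covariance formulas above can be checked numerically against the matrix form cov(b) = σ²(X′X)⁻¹. A sketch with simulated, deliberately correlated regressors (my own illustration, not the slides' data):

```python
import numpy as np

# Simulated two-regressor model with correlated x2 and x3 (illustrative only)
rng = np.random.default_rng(1)
T = 52
x2 = rng.normal(10, 2, T)
x3 = 0.5 * x2 + rng.normal(0, 1, T)          # x3 moves with x2
y = 100 - 6.0 * x2 + 3.0 * x3 + rng.normal(0, 6, T)

X = np.column_stack([np.ones(T), x2, x3])
b = np.linalg.lstsq(X, y, rcond=None)[0]
sigma2_hat = ((y - X @ b)**2).sum() / (T - 3)

# Formula versions (sigma^2 replaced by its estimate)
d2, d3 = x2 - x2.mean(), x3 - x3.mean()
r23 = (d2 * d3).sum() / np.sqrt((d2**2).sum() * (d3**2).sum())
var_b2 = sigma2_hat / ((1 - r23**2) * (d2**2).sum())
var_b3 = sigma2_hat / ((1 - r23**2) * (d3**2).sum())
cov_b2b3 = -r23 * sigma2_hat / ((1 - r23**2) * np.sqrt((d2**2).sum() * (d3**2).sum()))

# Matrix version: cov(b) = sigma^2 (X'X)^(-1); the two agree
cov_b = sigma2_hat * np.linalg.inv(X.T @ X)
print(var_b2, var_b3, cov_b2b3)
print(cov_b[1, 1], cov_b[2, 2], cov_b[1, 2])
```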

Covariance Decomposition

The covariance between any two estimators is larger in absolute value when:

1. The error variance, σ², is larger.

2. The sample size, T, is smaller.

3. The values of the variables are less spread out.

4. The correlation, r23, is high.

Var-Cov Matrix

yt = β1 + β2 xt2 + β3 xt3 + εt

The least squares estimators b1, b2, and b3 have covariance matrix:

                  [ var(b1)      cov(b1,b2)   cov(b1,b3) ]
cov(b1,b2,b3)  =  [ cov(b1,b2)   var(b2)      cov(b2,b3) ]
                  [ cov(b1,b3)   cov(b2,b3)   var(b3)    ]

Normal

yt = β1 + β2 xt2 + β3 xt3 + . . . + βK xtK + εt

εt ~ N(0, σ²)

This implies and is implied by:

yt ~ N(β1 + β2 xt2 + β3 xt3 + . . . + βK xtK, σ²)

Since bk is a linear function of the yt:

bk ~ N(βk, var(bk))

z = (bk − βk) / √var(bk)  ~  N(0, 1)    for k = 1, 2, . . . , K

Student-t

t = (bk − βk) / √vâr(bk) = (bk − βk) / se(bk)

Since generally the population variance of bk, var(bk), is unknown, we estimate it with vâr(bk), which uses σ̂² instead of σ².

t has a Student-t distribution with df = (T − K).

Interval Estimation

P( −tc < (bk − βk)/se(bk) < tc ) = 1 − α

tc is the critical value for (T − K) degrees of freedom

such that P( t > tc ) = α/2.

P( bk − tc se(bk) < βk < bk + tc se(bk) ) = 1 − α

Interval endpoints: [ bk − tc se(bk) , bk + tc se(bk) ]
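A short sketch of computing these interval endpoints with scipy, using the price coefficient and standard error reported in Table 8.2 later in these slides (df = T − K = 49 is taken from that example):

```python
import numpy as np
from scipy import stats

# Interval estimate for beta_k: b_k +/- t_c * se(b_k)
bk, se_bk = -6.642, 3.191        # price coefficient and std error from Table 8.2
T, K, alpha = 52, 3, 0.05

tc = stats.t.ppf(1 - alpha / 2, df=T - K)     # critical value with T-K df
lower, upper = bk - tc * se_bk, bk + tc * se_bk
print(tc, (lower, upper))                     # 95% interval estimate for beta_k
```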

Student-t Test

yt = β1 + β2 Xt2 + β3 Xt3 + β4 Xt4 + εt

Student-t tests can be used to test any linear combination of the regression coefficients:

H0: β2 + β3 + β4 = 1        H0: β1 = 0

H0: 3β2 − 7β3 = 21          H0: β2 − β3 ≤ 5

Every such t-test has exactly T − K degrees of freedom, where K = # of coefficients estimated (including the intercept).

One Tail Test

yt = β1 + β2 Xt2 + β3 Xt3 + β4 Xt4 + εt

H0: β3 ≤ 0

H1: β3 > 0

t = b3 / se(b3)  ~  t(T−K)

[Right-tail rejection region: reject H0 if t > tc]

df = T − K = T − 4

Two Tail Test

yt = β1 + β2 Xt2 + β3 Xt3 + β4 Xt4 + εt

H0: β2 = 0

H1: β2 ≠ 0

t = b2 / se(b2)  ~  t(T−K)

[Two-tail rejection region: reject H0 if t < −tc or t > tc]

df = T − K = T − 4
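A sketch of the two-tail test using the price coefficient from Table 8.2 below (the degrees of freedom, T − K = 49, come from that example):

```python
from scipy import stats

# Two-tail test of H0: beta2 = 0 against H1: beta2 != 0
b2, se_b2, df = -6.642, 3.191, 49              # values as in Table 8.2

t_stat = b2 / se_b2
tc = stats.t.ppf(0.975, df)                    # critical value at alpha = 0.05
p_value = 2 * stats.t.sf(abs(t_stat), df)      # two-tailed p-value
print(t_stat, tc, p_value)                     # |t| > tc, so reject H0
```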

Goodness-of-Fit

Coefficient of Determination

R² = SSR / SST = Σ (ŷt − ȳ)² / Σ (yt − ȳ)²    (sums over t = 1, . . . , T)

0 ≤ R² ≤ 1

Adjusted R-Squared

Adjusted Coefficient of Determination

Original:    R² = SSR / SST = 1 − SSE / SST

Adjusted:    R̄² = 1 − [ SSE / (T − K) ] / [ SST / (T − 1) ]
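As a worked check, the sums of squares reported in Table 8.3 later in these slides give both versions of the statistic:

```python
# R-squared and adjusted R-squared from the Table 8.3 sums of squares
# (T = 52 observations, K = 3 estimated coefficients)
SST, SSE = 13581.35, 1805.168
SSR = SST - SSE
T, K = 52, 3

R2 = SSR / SST                                   # = 1 - SSE/SST
R2_adj = 1 - (SSE / (T - K)) / (SST / (T - 1))
print(R2, R2_adj)                                # about 0.867 and 0.862
```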

Computer Output

Table 8.2  Summary of Least Squares Results

Variable       Coefficient   Std Error   t-value   p-value
constant          104.79       6.48       16.17     0.000
price             -6.642       3.191      -2.081    0.042
advertising        2.984       0.167      17.868    0.000

t = b2 / se(b2) = −6.642 / 3.191 = −2.081

Reporting Your Results

Reporting standard errors:

ŷt = 104.79 − 6.642 Xt2 + 2.984 Xt3
       (6.48)    (3.191)    (0.167)      (s.e.)

Reporting t-statistics:

ŷt = 104.79 − 6.642 Xt2 + 2.984 Xt3
      (16.17)   (−2.081)   (17.868)      (t)

H0: β2 = 0        H1: β2 ≠ 0

yt = β1 + β2 Xt2 + β3 Xt3 + β4 Xt4 + εt

H0: yt = β1 + β3 Xt3 + β4 Xt4 + εt

H1: yt = β1 + β2 Xt2 + β3 Xt3 + β4 Xt4 + εt

H0: Restricted Model

H1: Unrestricted Model

Single Restriction F-Test

yt = β1 + β2 Xt2 + β3 Xt3 + β4 Xt4 + εt

dfn = J = 1        dfd = T − K = 49

F = [ (SSER − SSEU) / J ] / [ SSEU / (T − K) ]

  = [ (1964.758 − 1805.168) / 1 ] / [ 1805.168 / (52 − 3) ]

  = 4.33

By definition this is the t-statistic squared: t = −2.081, so F = t² = 4.33.

H0: β2 = 0        H1: β2 ≠ 0

Under H0, F ~ F(J, T−K).
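A short check of this calculation with scipy, using the SSE values given on the slide:

```python
from scipy import stats

# Single-restriction F test of H0: beta2 = 0
SSE_R, SSE_U = 1964.758, 1805.168
J, df_denom = 1, 49                    # J restrictions, T - K denominator df

F = ((SSE_R - SSE_U) / J) / (SSE_U / df_denom)
p_value = stats.f.sf(F, J, df_denom)
print(F, p_value)                      # F is about 4.33 = (2.081)^2
```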

yt = β1 + β2 Xt2 + β3 Xt3 + β4 Xt4 + εt

H0: yt = β1 + β3 Xt3 + εt

H1: yt = β1 + β2 Xt2 + β3 Xt3 + β4 Xt4 + εt

H0: Restricted Model

H1: Unrestricted Model

H0: β2 = 0, β4 = 0

H1: H0 not true

Multiple Restriction F-Test

yt = β1 + β2 Xt2 + β3 Xt3 + β4 Xt4 + εt

H0: β2 = 0, β4 = 0

H1: H0 not true

dfn = J = 2  (J: the number of restrictions in the hypothesis)        dfd = T − K = 49

F = [ (SSER − SSEU) / J ] / [ SSEU / (T − K) ]        Under H0, F ~ F(J, T−K).

First run the restricted regression by dropping Xt2 and Xt4 to get SSER. Next run the unrestricted regression to get SSEU.
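A sketch of this two-step procedure on simulated data (the data and variable names are my own illustration; substitute your own series in practice):

```python
import numpy as np
from scipy import stats

# Restricted-vs-unrestricted F test of H0: beta2 = 0, beta4 = 0
rng = np.random.default_rng(2)
T = 52
x2, x3, x4 = rng.normal(size=(3, T))
y = 1 + 0.0 * x2 + 2.0 * x3 + 0.0 * x4 + rng.normal(size=T)   # H0 is true here

def sse(X, y):
    """Sum of squared residuals from a least squares fit."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    return ((y - X @ b)**2).sum()

ones = np.ones(T)
SSE_U = sse(np.column_stack([ones, x2, x3, x4]), y)   # unrestricted: all regressors
SSE_R = sse(np.column_stack([ones, x3]), y)           # restricted: drop x2 and x4

J, K = 2, 4
F = ((SSE_R - SSE_U) / J) / (SSE_U / (T - K))
p_value = stats.f.sf(F, J, T - K)
print(F, p_value)
```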

F-Tests

F = [ (SSER − SSEU) / J ] / [ SSEU / (T − K) ]  ~  F(J, T−K) under H0

F-tests of this type are always right-tailed, even for left-sided or two-sided hypotheses, because any deviation from the null will make the F value bigger (move rightward).

[F-distribution plot: the rejection region lies to the right of the critical value Fc]

F-Test of Entire Equation

yt = β1 + β2 Xt2 + β3 Xt3 + εt

H0: β2 = β3 = 0

H1: H0 not true

dfn = J = 2        dfd = T − K = 49

F = [ (SSER − SSEU) / J ] / [ SSEU / (T − K) ]

  = [ (13581.35 − 1805.168) / 2 ] / [ 1805.168 / (52 − 3) ]

  = 159.828

We ignore β1. Why?

F(2, 49, 0.05) = 3.187        α = 0.05

F = 159.828 > 3.187, so reject H0!

ANOVA Table

Table 8.3  Analysis of Variance Table

Source        DF   Sum of Squares   Mean Square   F-Value
Regression     2       11776.18        5888.09    159.828
Error         49        1805.168         36.84
Total         51       13581.35

p-value: 0.0001

R² = SSR / SST = 11776.18 / 13581.35 = 0.867

σ̂² = MSE = 36.84
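A quick arithmetic check of the entries in Table 8.3:

```python
# Verify the ANOVA table entries from the total and error sums of squares
SST, SSE = 13581.35, 1805.168
df_reg, df_err = 2, 49

SSR = SST - SSE                  # regression sum of squares, about 11776.18
MSR, MSE = SSR / df_reg, SSE / df_err
F = MSR / MSE
print(SSR, MSR, MSE, F)          # F is about 159.8; MSE is the sigma^2 estimate
```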

Nonsample Information

ln(yt) = β1 + β2 ln(Xt2) + β3 ln(Xt3) + β4 ln(Xt4) + εt

A certain production process is known to be Cobb-Douglas with constant returns to scale:

β2 + β3 + β4 = 1,  where  β4 = (1 − β2 − β3)

ln(yt / Xt4) = β1 + β2 ln(Xt2 / Xt4) + β3 ln(Xt3 / Xt4) + εt

yt* = β1 + β2 Xt2* + β3 Xt3* + εt

Run least squares on the transformed model. Interpret the coefficients the same as in the original model.
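A sketch of running the restricted regression on simulated Cobb-Douglas data (the data and parameter values are my own illustration, not the slides' example):

```python
import numpy as np

# Impose constant returns to scale (beta2 + beta3 + beta4 = 1) by substitution
rng = np.random.default_rng(3)
T = 100
X2, X3, X4 = np.exp(rng.normal(size=(3, T)))          # strictly positive inputs
y = np.exp(1.0) * X2**0.3 * X3**0.5 * X4**0.2 * np.exp(rng.normal(0, 0.05, T))

# Transformed model: ln(y/X4) = beta1 + beta2 ln(X2/X4) + beta3 ln(X3/X4) + e
ystar = np.log(y / X4)
Z = np.column_stack([np.ones(T), np.log(X2 / X4), np.log(X3 / X4)])
b1, b2, b3 = np.linalg.lstsq(Z, ystar, rcond=None)[0]
b4 = 1 - b2 - b3                                      # recovered from the restriction
print(b2, b3, b4)                                     # near 0.3, 0.5, 0.2
```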

Collinear Variables

The term "independent variables" means an explanatory variable is independent of the error term, but not necessarily independent of other explanatory variables.

Since economists typically have no control over the implicit experimental design, explanatory variables tend to move together, which often makes sorting out their separate influences rather problematic.

Effects of Collinearity

A high degree of collinearity will produce:

1. no least squares output when collinearity is exact;

2. large standard errors and wide confidence intervals;

3. insignificant t-values even with a high R² and a significant F-value;

4. estimates sensitive to the deletion or addition of a few observations or insignificant variables;

5. OLS estimators that retain all their desired properties (BLUE and consistency), but with the problem that the inferential procedure may be uninformative.

Identifying Collinearity

Evidence of high collinearity include:

1. a high pairwise correlation between two explanatory variables (greater than .8 or .9).

2. a high R-squared (called Rj²) when regressing one explanatory variable (Xj) on the other explanatory variables. Variance inflation factor: VIF(bj) = 1 / (1 − Rj²), with values greater than 10 signalling a problem (see the sketch after this list).

3. a high R² and a statistically significant F-value when the t-values are statistically insignificant.
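A sketch of computing the pairwise correlation and the VIF by the auxiliary regression described in item 2 (the data and the helper function vif are my own illustration):

```python
import numpy as np

# Variance inflation factor via the auxiliary regression R_j^2
rng = np.random.default_rng(4)
T = 52
x2 = rng.normal(size=T)
x3 = 0.95 * x2 + 0.1 * rng.normal(size=T)     # deliberately collinear with x2
x4 = rng.normal(size=T)

def vif(target, others):
    """VIF(b_j) = 1 / (1 - R_j^2) from regressing x_j on the other regressors."""
    X = np.column_stack([np.ones(len(target))] + list(others))
    b = np.linalg.lstsq(X, target, rcond=None)[0]
    resid = target - X @ b
    r2_j = 1 - (resid**2).sum() / ((target - target.mean())**2).sum()
    return 1 / (1 - r2_j)

print(np.corrcoef(x2, x3)[0, 1])   # high pairwise correlation
print(vif(x2, [x3, x4]))           # VIF well above 10 signals collinearity
```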

Mitigating Collinearity

High collinearity is not a violation of any least squares assumption, but rather a lack of adequate information in the sample:

1. Collect more data with better information.

2. Impose economic restrictions as appropriate.

3. Impose statistical restrictions when justified.

4. Delete the variable which is highly collinear with other explanatory variables.

Prediction

yt = β1 + β2 Xt2 + β3 Xt3 + εt

Given a set of values for the explanatory variables, (1, X02, X03), the best linear unbiased predictor of y is given by:

ŷ0 = b1 + b2 X02 + b3 X03

This predictor is unbiased in the sense that the average value of the forecast error is zero.
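A minimal sketch of the predictor, using the Table 8.2 estimates and hypothetical values for X02 and X03 (the chosen values 2.0 and 10.0 are illustrative, not from the slides):

```python
import numpy as np

# Best linear unbiased prediction at given regressor values
b = np.array([104.79, -6.642, 2.984])     # intercept, price, advertising estimates (Table 8.2)
x0 = np.array([1.0, 2.0, 10.0])           # hypothetical (1, X02, X03)

y0_hat = x0 @ b                           # y0_hat = b1 + b2*X02 + b3*X03
print(y0_hat)
```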
