econometrics_ch4.ppt

66
405 ECONOMETRICS Chapter # 3: TWO-VARIABLE REGRESSION MODEL: THE PROBLEM OF ESTIMATION By Damodar N. Gujarati. Prof. M. El-Sakka, Dept of Economics, Kuwait University

Upload: kashif-khurshid

Post on 22-Dec-2015


Page 1: Econometrics_ch4.ppt

405 ECONOMETRICS
Chapter # 3: TWO-VARIABLE REGRESSION MODEL: THE PROBLEM OF ESTIMATION

By Damodar N. Gujarati

Prof. M. El-Sakka

Dept of Economics, Kuwait University

Page 2: Econometrics_ch4.ppt

THE METHOD OF ORDINARY LEAST SQUARES

• To understand this method, we first explain the least-squares principle.

• Recall the two-variable PRF:

$$Y_i = \beta_1 + \beta_2 X_i + u_i \qquad (2.4.2)$$

• The PRF is not directly observable. We estimate it from the SRF:

$$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i \qquad (2.6.2)$$

$$Y_i = \hat{Y}_i + \hat{u}_i \qquad (2.6.3)$$

• where $\hat{Y}_i$ is the estimated (conditional mean) value of $Y_i$.

• But how is the SRF itself determined? First, express (2.6.3) as

$$\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i \qquad (3.1.1)$$

• Now, given n pairs of observations on Y and X, we would like to determine the SRF in such a manner that it is as close as possible to the actual Y. To this end, we may adopt the following criterion:

• Choose the SRF in such a way that the sum of the residuals $\sum \hat{u}_i = \sum (Y_i - \hat{Y}_i)$ is as small as possible.

Page 3: Econometrics_ch4.ppt

• But this is not a very good criterion. If we adopt the criterion of minimizing $\sum \hat{u}_i$, Figure 3.1 shows that the residuals $\hat{u}_2$ and $\hat{u}_3$ as well as the residuals $\hat{u}_1$ and $\hat{u}_4$ receive the same weight in the sum $(\hat{u}_1 + \hat{u}_2 + \hat{u}_3 + \hat{u}_4)$. A consequence of this is that it is quite possible that the algebraic sum of the $\hat{u}_i$ is small (even zero) although the $\hat{u}_i$ are widely scattered about the SRF.

• To see this, let $\hat{u}_1$, $\hat{u}_2$, $\hat{u}_3$, and $\hat{u}_4$ in Figure 3.1 take the values of 10, −2, +2, and −10, respectively. The algebraic sum of these residuals is zero although $\hat{u}_1$ and $\hat{u}_4$ are scattered more widely around the SRF than $\hat{u}_2$ and $\hat{u}_3$.

• We can avoid this problem if we adopt the least-squares criterion, which states that the SRF can be fixed in such a way that

$$\sum \hat{u}_i^2 = \sum (Y_i - \hat{Y}_i)^2 = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2 \qquad (3.1.2)$$

• is as small as possible, where $\hat{u}_i^2$ are the squared residuals.
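The contrast between the two criteria can be checked with the four residual values quoted above. This is only a small sketch; the variable names are illustrative and not part of the slides.

```python
# Residuals from the example in the text: 10, -2, +2, -10.
residuals = [10, -2, 2, -10]

# Plain sum: positive and negative residuals cancel each other.
plain_sum = sum(residuals)

# Sum of squares: every residual contributes, large ones most of all.
squared_sum = sum(u ** 2 for u in residuals)

print(plain_sum)    # 0
print(squared_sum)  # 208
```

The plain sum hides the scatter completely, while the squared residuals 100, 4, 4, and 100 show that the first and fourth observations dominate the least-squares criterion.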

Page 4: Econometrics_ch4.ppt
Page 5: Econometrics_ch4.ppt

• By squaring $\hat{u}_i$, this method gives more weight to residuals such as $\hat{u}_1$ and $\hat{u}_4$ in Figure 3.1 than the residuals $\hat{u}_2$ and $\hat{u}_3$.

• A further justification for the least-squares method lies in the fact that the estimators obtained by it have some very desirable statistical properties, as we shall see shortly.

Page 6: Econometrics_ch4.ppt

• It is obvious from (3.1.2) that:

$$\sum \hat{u}_i^2 = f(\hat{\beta}_1, \hat{\beta}_2) \qquad (3.1.3)$$

• that is, the sum of the squared residuals is some function of the estimators $\hat{\beta}_1$ and $\hat{\beta}_2$. To see this, consider Table 3.1 and conduct two experiments.

Page 7: Econometrics_ch4.ppt

• Since the $\hat{\beta}$ values in the two experiments are different, we get different values for the estimated residuals.

• Now which set of $\hat{\beta}$ values should we choose? Obviously the $\hat{\beta}$'s of the first experiment are the "best" values, since they give the smaller sum of squared residuals. But we could make endless experiments and then choose the set of $\hat{\beta}$ values that gives us the least possible value of $\sum \hat{u}_i^2$.

• But since time, and patience, are generally in short supply, we need to consider some shortcuts to this trial-and-error process. Fortunately, the method of least squares provides us with unique estimates of $\beta_1$ and $\beta_2$ that give the smallest possible value of $\sum \hat{u}_i^2$.
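The trial-and-error idea can be sketched by treating the residual sum of squares as a function of the two candidate coefficients. Table 3.1 is not reproduced in this transcript, so the data and the two trial coefficient pairs below are assumptions, chosen so that the first pair of observations (Y = 4, X = 1) matches the one quoted later in the slides.

```python
# Assumed illustrative data (Table 3.1 is not reproduced in the transcript).
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]

def rss(b1, b2):
    """Sum of squared residuals for the candidate line Y = b1 + b2 * X."""
    return sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))

# Two hypothetical "experiments": two guesses for (b1, b2).
experiment_1 = rss(1.572, 1.357)
experiment_2 = rss(3.0, 1.0)

print(round(experiment_1, 3), experiment_2)
```

Whichever guess gives the smaller value is "better" in the least-squares sense; repeating this over many guesses is the tedious search that the closed-form solution avoids.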

Page 8: Econometrics_ch4.ppt

$$\sum \hat{u}_i^2 = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2 \qquad (3.1.2)$$

Page 9: Econometrics_ch4.ppt

• The process of differentiation yields the following equations for estimating $\beta_1$ and $\beta_2$:

$$\sum Y_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_i \qquad (3.1.4)$$

$$\sum Y_i X_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2 \qquad (3.1.5)$$

• where n is the sample size. These simultaneous equations are known as the normal equations. Solving the normal equations simultaneously, we obtain:

Page 10: Econometrics_ch4.ppt

$$\hat{\beta}_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{\sum x_i y_i}{\sum x_i^2} \qquad (3.1.6)$$

$$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} \qquad (3.1.7)$$

• where $\bar{X}$ and $\bar{Y}$ are the sample means of X and Y and where we define $x_i = (X_i - \bar{X})$ and $y_i = (Y_i - \bar{Y})$. Henceforth we adopt the convention of letting the lowercase letters denote deviations from mean values.
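In deviation form, the solution of the normal equations is the familiar pair $\hat{\beta}_2 = \sum x_i y_i / \sum x_i^2$ and $\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$. A minimal sketch of that computation follows; the function name `ols` and the data are illustrative assumptions, not taken from the slides.

```python
def ols(X, Y):
    """Return (b1, b2) for the two-variable regression Y = b1 + b2 * X."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    # Lowercase letters denote deviations from the sample means.
    x = [xi - x_bar for xi in X]
    y = [yi - y_bar for yi in Y]
    b2 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
    b1 = y_bar - b2 * x_bar
    return b1, b2

# Assumed illustrative data.
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]
b1, b2 = ols(X, Y)

# The estimates should satisfy both normal equations.
n = len(X)
gap_1 = sum(Y) - (n * b1 + b2 * sum(X))
gap_2 = (sum(xi * yi for xi, yi in zip(X, Y))
         - (b1 * sum(X) + b2 * sum(xi ** 2 for xi in X)))

print(b1, b2)
print(gap_1, gap_2)  # both should be zero up to rounding error
```

Checking the two gaps is a cheap way to confirm that a hand computation or a spreadsheet has not slipped.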

Page 11: Econometrics_ch4.ppt

• The last step in (3.1.7) can be obtained directly from (3.1.4) by simple algebraic manipulations. Incidentally, note that, by making use of simple algebraic identities, formula (3.1.6) for estimating $\beta_2$ can be alternatively expressed as:

• The estimators obtained previously are known as the least-squares estimators.

Page 12: Econometrics_ch4.ppt

• Note the following numerical properties of estimators obtained by the method of OLS:

• I. The OLS estimators are expressed solely in terms of the observable (i.e., sample) quantities (i.e., X and Y). Therefore, they can be easily computed.

• II. They are point estimators; that is, given the sample, each estimator will provide only a single (point, not interval) value of the relevant population parameter.

• III. Once the OLS estimates are obtained from the sample data, the sample regression line (Figure 3.1) can be easily obtained.

• The regression line thus obtained has the following properties:

– 1. It passes through the sample means of Y and X. This fact is obvious from (3.1.7), for the latter can be written as $\bar{Y} = \hat{\beta}_1 + \hat{\beta}_2 \bar{X}$, which is shown diagrammatically in Figure 3.2.

Page 13: Econometrics_ch4.ppt
Page 14: Econometrics_ch4.ppt

– 2. The mean value of the estimated $Y = \hat{Y}_i$ is equal to the mean value of the actual Y, for:

$$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i = (\bar{Y} - \hat{\beta}_2 \bar{X}) + \hat{\beta}_2 X_i = \bar{Y} + \hat{\beta}_2 (X_i - \bar{X}) \qquad (3.1.9)$$

• Summing both sides of this last equality over the sample values and dividing through by the sample size n gives

$$\bar{\hat{Y}} = \bar{Y} \qquad (3.1.10)$$

• where use is made of the fact that $\sum (X_i - \bar{X}) = 0$.

– 3. The mean value of the residuals $\hat{u}_i$ is zero. From Appendix 3A, Section 3A.1, the first equation is:

$$-2 \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i) = 0$$

• But since $\hat{u}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$, the preceding equation reduces to

$$-2 \sum \hat{u}_i = 0, \quad \text{whence } \bar{\hat{u}} = 0$$

Page 15: Econometrics_ch4.ppt

• As a result of the preceding property, the sample regression

$$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i \qquad (2.6.2)$$

• can be expressed in an alternative form where both Y and X are expressed as deviations from their mean values. To see this, sum (2.6.2) on both sides to give:

$$\sum Y_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_i + \sum \hat{u}_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_i \quad \text{since } \sum \hat{u}_i = 0 \qquad (3.1.11)$$

• Dividing Eq. (3.1.11) through by n, we obtain

$$\bar{Y} = \hat{\beta}_1 + \hat{\beta}_2 \bar{X} \qquad (3.1.12)$$

• which is the same as (3.1.7). Subtracting Eq. (3.1.12) from (2.6.2), we obtain

$$Y_i - \bar{Y} = \hat{\beta}_2 (X_i - \bar{X}) + \hat{u}_i$$

• or

$$y_i = \hat{\beta}_2 x_i + \hat{u}_i \qquad (3.1.13)$$

Page 16: Econometrics_ch4.ppt

• Equation (3.1.13) is known as the deviation form. Notice that the intercept term $\hat{\beta}_1$ is no longer present in it. But the intercept term can always be estimated by (3.1.7), that is, from the fact that the sample regression line passes through the sample means of Y and X.

• An advantage of the deviation form is that it often simplifies computing formulas. In passing, note that in the deviation form, the SRF can be written as:

$$\hat{y}_i = \hat{\beta}_2 x_i \qquad (3.1.14)$$

• whereas in the original units of measurement it was $\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$, as shown in (2.6.1).

Page 17: Econometrics_ch4.ppt

– 4. The residuals $\hat{u}_i$ are uncorrelated with the predicted $Y_i$. This statement can be verified as follows: using the deviation form, we can write:

$$\sum \hat{y}_i \hat{u}_i = \hat{\beta}_2 \sum x_i \hat{u}_i = \hat{\beta}_2 \sum x_i (y_i - \hat{\beta}_2 x_i) = \hat{\beta}_2 \sum x_i y_i - \hat{\beta}_2^2 \sum x_i^2 = \hat{\beta}_2^2 \sum x_i^2 - \hat{\beta}_2^2 \sum x_i^2 = 0$$

– where use is made of the fact that $\hat{\beta}_2 = \sum x_i y_i / \sum x_i^2$.

– 5. The residuals $\hat{u}_i$ are uncorrelated with $X_i$; that is, $\sum \hat{u}_i X_i = 0$. This fact follows from Eq. (2) in Appendix 3A, Section 3A.1.
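The five properties of the fitted line can be verified numerically on any sample. The sketch below uses assumed illustrative data and the deviation-form slope $\sum x_i y_i / \sum x_i^2$; each entry in `checks` is a quantity that should be zero apart from floating-point rounding.

```python
# Assumed illustrative data.
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n

# OLS estimates in deviation form.
b2 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
      / sum((xi - x_bar) ** 2 for xi in X))
b1 = y_bar - b2 * x_bar

fitted = [b1 + b2 * xi for xi in X]
resid = [yi - fi for yi, fi in zip(Y, fitted)]

checks = {
    "1. line passes through the sample means": abs(b1 + b2 * x_bar - y_bar),
    "2. mean fitted Y equals mean actual Y": abs(sum(fitted) / n - y_bar),
    "3. residuals sum to zero": abs(sum(resid)),
    "4. residuals uncorrelated with fitted Y": abs(
        sum(u * (f - y_bar) for u, f in zip(resid, fitted))),
    "5. residuals uncorrelated with X": abs(
        sum(u * xi for u, xi in zip(resid, X))),
}

for name, value in checks.items():
    print(f"{name}: {value:.2e}")
```

These are algebraic identities of the OLS fit, so they hold for every sample and do not depend on any assumption about the disturbances.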

Page 18: Econometrics_ch4.ppt

THE CLASSICAL LINEAR REGRESSION MODEL: THE ASSUMPTIONS UNDERLYING THE METHOD OF LEAST SQUARES

• In regression analysis our objective is not only to obtain $\hat{\beta}_1$ and $\hat{\beta}_2$ but also to draw inferences about the true $\beta_1$ and $\beta_2$. For example, we would like to know how close $\hat{\beta}_1$ and $\hat{\beta}_2$ are to their counterparts in the population or how close $\hat{Y}_i$ is to the true $E(Y \mid X_i)$.

• Look at the PRF: $Y_i = \beta_1 + \beta_2 X_i + u_i$. It shows that $Y_i$ depends on both $X_i$ and $u_i$. The assumptions made about the $X_i$ variable(s) and the error term are extremely critical to the valid interpretation of the regression estimates.

• The Gaussian, standard, or classical linear regression model (CLRM) makes 10 assumptions.

Page 19: Econometrics_ch4.ppt

• Keep in mind that the regressand Y and the regressor X themselves may be nonlinear.

Page 20: Econometrics_ch4.ppt

• Look at Table 2.1. Keeping the value of income X fixed, say, at $80, we draw at random a family and observe its weekly family consumption expenditure Y as, say, $60. Still keeping X at $80, we draw at random another family and observe its Y value as $75. In each of these drawings (i.e., repeated sampling), the value of X is fixed at $80. We can repeat this process for all the X values shown in Table 2.1.

• This means that our regression analysis is conditional regression analysis, that is, conditional on the given values of the regressor(s) X.

Page 21: Econometrics_ch4.ppt
Page 22: Econometrics_ch4.ppt

• As shown in Figure 3.3, each Y population corresponding to a given X is distributed around its mean value with some Y values above the mean and some below it. The mean value of these deviations corresponding to any given X should be zero.

• Note that the assumption $E(u_i \mid X_i) = 0$ implies that $E(Y_i \mid X_i) = \beta_1 + \beta_2 X_i$.

Page 23: Econometrics_ch4.ppt

$$E(u_i \mid X_i) = 0$$

Page 24: Econometrics_ch4.ppt

• Technically, (3.2.2) represents the assumption of homoscedasticity, or equal (homo) spread (scedasticity) or equal variance. Stated differently, (3.2.2) means that the Y populations corresponding to various X values have the same variance.

• Put simply, the variation around the regression line (which is the line of average relationship between Y and X) is the same across the X values; it neither increases nor decreases as X varies.

Page 25: Econometrics_ch4.ppt
Page 26: Econometrics_ch4.ppt
Page 27: Econometrics_ch4.ppt

• In Figure 3.5, by contrast, the conditional variance of the Y population varies with X. This situation is known as heteroscedasticity, or unequal spread, or variance. Symbolically, in this situation (3.2.2) can be written as

$$\operatorname{var}(u_i \mid X_i) = \sigma_i^2 \qquad (3.2.3)$$

• Figure 3.5 shows that $\operatorname{var}(u \mid X_1) < \operatorname{var}(u \mid X_2) < \dots < \operatorname{var}(u \mid X_i)$. Therefore, the likelihood is that the Y observations coming from the population with $X = X_1$ would be closer to the PRF than those coming from populations corresponding to $X = X_2$, $X = X_3$, and so on. In short, not all Y values corresponding to the various X's will be equally reliable, reliability being judged by how closely or distantly the Y values are distributed around their means, that is, the points on the PRF.
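The difference between equal and unequal spread can be illustrated with a small simulation. Everything here is an assumption made for illustration: the three X values, the normal distribution of the disturbances, and the particular rule that lets the standard deviation grow in proportion to X.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
x_values = [1, 2, 3]
draws = 5000             # disturbances drawn for each X value

def spread(sigma_of_x):
    """Sample variance of u at each X when sd(u | X) = sigma_of_x(X)."""
    return {
        x: statistics.pvariance(
            [rng.gauss(0.0, sigma_of_x(x)) for _ in range(draws)])
        for x in x_values
    }

homo = spread(lambda x: 2.0)        # same standard deviation at every X
hetero = spread(lambda x: 2.0 * x)  # standard deviation grows with X (assumed form)

print("homoscedastic:  ", homo)
print("heteroscedastic:", hetero)
```

In the first case the three sample variances should all be close to 4; in the second they should rise with X, which is the pattern var(u | X1) < var(u | X2) < var(u | X3) described above.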

Page 28: Econometrics_ch4.ppt

• The disturbances $u_i$ and $u_j$ are uncorrelated, i.e., there is no serial correlation. This means that, given $X_i$, the deviations of any two Y values from their mean value do not exhibit patterns. In Figure 3.6a, the u's are positively correlated, a positive u followed by a positive u or a negative u followed by a negative u. In Figure 3.6b, the u's are negatively correlated, a positive u followed by a negative u and vice versa. If the disturbances follow systematic patterns, as in Figure 3.6a and b, there is auto- or serial correlation. Figure 3.6c shows that there is no systematic pattern to the u's, thus indicating zero correlation.

Page 29: Econometrics_ch4.ppt
Page 30: Econometrics_ch4.ppt

• Suppose in our PRF ($Y_t = \beta_1 + \beta_2 X_t + u_t$) that $u_t$ and $u_{t-1}$ are positively correlated. Then $Y_t$ depends not only on $X_t$ but also on $u_{t-1}$, for $u_{t-1}$ to some extent determines $u_t$.
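One simple way to picture the three cases of serial correlation is to generate disturbances from a first-order scheme $u_t = \rho u_{t-1} + e_t$ and measure the correlation between successive values. The first-order scheme, the values of $\rho$, and the function names are assumptions made for this sketch; the slides do not specify how the disturbances are generated.

```python
import random

def ar1(rho, n, seed):
    """Generate u_t = rho * u_{t-1} + e_t with standard normal shocks e_t."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        u.append(rho * u[-1] + rng.gauss(0.0, 1.0))
    return u

def lag1_corr(u):
    """Sample correlation between u_t and u_{t-1}."""
    a, b = u[1:], u[:-1]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5

positive = lag1_corr(ar1(0.8, 5000, seed=1))    # like the positively correlated panel
negative = lag1_corr(ar1(-0.8, 5000, seed=2))   # like the negatively correlated panel
none = lag1_corr(ar1(0.0, 5000, seed=3))        # no systematic pattern

print(round(positive, 2), round(negative, 2), round(none, 2))
```

With a positive $\rho$, a positive disturbance tends to be followed by another positive one; with a negative $\rho$, the signs tend to alternate; with $\rho = 0$ the lag-one correlation should be near zero.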

Page 31: Econometrics_ch4.ppt

• The disturbance u and explanatory variable X are uncorrelated. The PRF assumes that X and u (which may represent the influence of all the omitted variables) have separate (and additive) influence on Y. But if X and u are correlated, it is not possible to assess their individual effects on Y. Thus, if X and u are positively correlated, X increases when u increases and it decreases when u decreases. Similarly, if X and u are negatively correlated, X increases when u decreases and it decreases when u increases. In either case, it is difficult to isolate the influence of X and u on Y.

Page 32: Econometrics_ch4.ppt

• In the hypothetical example of Table 3.1, imagine that we had only the first pair of observations on Y and X (4 and 1). From this single observation there is no way to estimate the two unknowns, $\beta_1$ and $\beta_2$. We need at least two pairs of observations to estimate the two unknowns.

Page 33: Econometrics_ch4.ppt

• This assumption too is not so innocuous as it looks. Look at Eq. (3.1.6). If all the X values are identical, then $X_i = \bar{X}$ and the denominator of that equation will be zero, making it impossible to estimate $\beta_2$ and therefore $\beta_1$. Looking at our family consumption expenditure example in Chapter 2, if there is very little variation in family income, we will not be able to explain much of the variation in the consumption expenditure.
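The zero-denominator problem is easy to reproduce. In the sketch below the slope is computed as $\sum x_i y_i / \sum x_i^2$, with lowercase letters denoting deviations from the means; the function name `slope` and the income and expenditure figures are hypothetical.

```python
def slope(X, Y):
    """OLS slope; returns None when X has no variation (zero denominator)."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    if sxx == 0:
        return None  # every X equals the mean, so the slope is undefined
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    return sxy / sxx

# Hypothetical samples: income identical for every family vs. varying income.
no_variation = slope([80, 80, 80, 80], [60, 75, 65, 70])
some_variation = slope([80, 100, 120, 140], [60, 75, 65, 70])

print(no_variation)    # None
print(some_variation)
```

When every family has the same income, the sample carries no information about how expenditure responds to income, and no estimate of the slope or the intercept can be formed.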

Page 34: Econometrics_ch4.ppt

• An econometric investigation begins with the specification of the econometric model underlying the phenomenon of interest. Some important questions that arise in the specification of the model include the following:

• (1) What variables should be included in the model?

• (2) What is the functional form of the model? Is it linear in the parameters, the variables, or both?

• (3) What are the probabilistic assumptions made about the $Y_i$, the $X_i$, and the $u_i$ entering the model?

Page 35: Econometrics_ch4.ppt

• Suppose we choose the following two models to depict the underlying relationship between the rate of change of money wages and the unemployment rate:

$$Y_i = \alpha_1 + \alpha_2 X_i + u_i \qquad (3.2.7)$$

$$Y_i = \beta_1 + \beta_2 \left(\frac{1}{X_i}\right) + u_i \qquad (3.2.8)$$

• where $Y_i$ = the rate of change of money wages, and $X_i$ = the unemployment rate. The regression model (3.2.7) is linear both in the parameters and the variables, whereas (3.2.8) is linear in the parameters (hence a linear regression model by our definition) but nonlinear in the variable X. Now consider Figure 3.7.

• If model (3.2.8) is the "correct" or the "true" model, fitting the model (3.2.7) to the scatterpoints shown in Figure 3.7 will give us wrong predictions.

• Unfortunately, in practice one rarely knows the correct variables to include in the model, or the correct functional form of the model, or the correct probabilistic assumptions about the variables entering the model, for the theory underlying the particular investigation may not be strong or robust enough to answer all these questions.
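The cost of choosing the wrong functional form can be sketched by generating data from the reciprocal model (3.2.8) and fitting both specifications. The parameter values, the X grid, and the absence of a disturbance term are assumptions made to keep the example deterministic.

```python
def fit(Z, Y):
    """OLS of Y on a single regressor Z; returns (intercept, slope, rss)."""
    n = len(Z)
    z_bar, y_bar = sum(Z) / n, sum(Y) / n
    szz = sum((z - z_bar) ** 2 for z in Z)
    szy = sum((z - z_bar) * (y - y_bar) for z, y in zip(Z, Y))
    slope = szy / szz
    intercept = y_bar - slope * z_bar
    rss = sum((y - intercept - slope * z) ** 2 for z, y in zip(Z, Y))
    return intercept, slope, rss

# Hypothetical unemployment rates, and wage changes generated exactly from
# the reciprocal form with assumed parameters beta1 = -1 and beta2 = 8.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [-1.0 + 8.0 / x for x in X]

linear = fit(X, Y)                          # model (3.2.7): Y on X
reciprocal = fit([1.0 / x for x in X], Y)   # model (3.2.8): Y on 1/X

print("linear:    ", linear)
print("reciprocal:", reciprocal)
```

The reciprocal specification recovers the assumed parameters with a residual sum of squares of essentially zero, while the straight line leaves systematic errors that no amount of additional data of the same kind would remove.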

Page 36: Econometrics_ch4.ppt
Page 37: Econometrics_ch4.ppt

• We will discuss this assumption in Chapter 7, where we discuss multiple regression models.

Page 38: Econometrics_ch4.ppt

PRECISION OR STANDARD ERRORS OF LEAST-SQUARES ESTIMATES

• The least-squares estimates are a function of the sample data. But since the data change from sample to sample, the estimates will change. Therefore, what is needed is some measure of "reliability" or precision of the estimators $\hat{\beta}_1$ and $\hat{\beta}_2$. In statistics the precision of an estimate is measured by its standard error (se), which can be obtained as follows:

Page 39: Econometrics_ch4.ppt

• $\sigma^2$ is the constant or homoscedastic variance of $u_i$ of Assumption 4.

• $\sigma^2$ itself is estimated by the following formula:

$$\hat{\sigma}^2 = \frac{\sum \hat{u}_i^2}{n - 2}$$

• where $\hat{\sigma}^2$ is the OLS estimator of the true but unknown $\sigma^2$, the expression $n - 2$ is known as the number of degrees of freedom (df), and $\sum \hat{u}_i^2$ is the residual sum of squares (RSS). Once $\sum \hat{u}_i^2$ is known, $\hat{\sigma}^2$ can be easily computed.

• Compared with Eq. (3.1.2), Eq. (3.3.6) is easy to use, for it does not require computing $\hat{u}_i$ for each observation.

Page 40: Econometrics_ch4.ppt

• Since

• an alternative expression for computing $\sum \hat{u}_i^2$ is

• In passing, note that the positive square root of $\hat{\sigma}^2$ is known as the standard error of estimate or the standard error of the regression (se). It is simply the standard deviation of the Y values about the estimated regression line and is often used as a summary measure of the "goodness of fit" of the estimated regression line.
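A commonly used shortcut for the residual sum of squares is $\sum \hat{u}_i^2 = \sum y_i^2 - \hat{\beta}_2^2 \sum x_i^2$, where lowercase letters are deviations from the means; it avoids computing each residual. The equations on these slides did not survive in the transcript, so treating this as the intended shortcut is an assumption. The sketch below compares it with the direct computation on assumed illustrative data and then forms $\hat{\sigma}^2$ and the standard error of the regression.

```python
# Assumed illustrative data.
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n

sxx = sum((xi - x_bar) ** 2 for xi in X)
syy = sum((yi - y_bar) ** 2 for yi in Y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))

b2 = sxy / sxx
b1 = y_bar - b2 * x_bar

# Direct route: compute every residual, then square and sum.
rss_direct = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(X, Y))

# Shortcut route: no individual residuals needed.
rss_shortcut = syy - b2 ** 2 * sxx

sigma2_hat = rss_direct / (n - 2)    # n - 2 degrees of freedom
se_regression = sigma2_hat ** 0.5    # standard error of the regression

print(rss_direct, rss_shortcut)
print(sigma2_hat, se_regression)
```

The two routes agree up to rounding error, which makes the shortcut a convenient check on hand calculations.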

Page 41: Econometrics_ch4.ppt

• Note the following features of the variances (and therefore the standard errors) of $\hat{\beta}_1$ and $\hat{\beta}_2$.

• 1. The variance of $\hat{\beta}_2$ is directly proportional to $\sigma^2$ but inversely proportional to $\sum x_i^2$. That is, given $\sigma^2$, the larger the variation in the X values, the smaller the variance of $\hat{\beta}_2$ and hence the greater the precision with which $\beta_2$ can be estimated.

• 2. The variance of $\hat{\beta}_1$ is directly proportional to $\sigma^2$ and $\sum X_i^2$ but inversely proportional to $\sum x_i^2$ and the sample size n.

Page 42: Econometrics_ch4.ppt

• 3. Since $\hat\beta_1$ and $\hat\beta_2$ are estimators, they will not only vary from sample to sample, but in a given sample they are likely to be dependent on each other, this dependence being measured by the covariance between them.

• Since $\text{var}(\hat\beta_2)$ is always positive, as is the variance of any variable, the nature of the covariance between $\hat\beta_1$ and $\hat\beta_2$ depends on the sign of $\bar X$. If $\bar X$ is positive, then as the formula $\text{cov}(\hat\beta_1, \hat\beta_2) = -\bar X \, \text{var}(\hat\beta_2)$ shows, the covariance will be negative. Thus, if the slope coefficient $\beta_2$ is overestimated (i.e., the slope is too steep), the intercept coefficient $\beta_1$ will be underestimated (i.e., the intercept will be too small).
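The three features above can be checked numerically. The following is a minimal sketch, assuming the standard two-variable formulas $\text{var}(\hat\beta_2) = \sigma^2 / \sum x_i^2$, $\text{var}(\hat\beta_1) = \sigma^2 \sum X_i^2 / (n \sum x_i^2)$ and $\text{cov}(\hat\beta_1, \hat\beta_2) = -\bar X \, \text{var}(\hat\beta_2)$. The function name, the ten X values and the value of $\sigma^2$ are illustrative assumptions, not taken from the slides.

```python
def ols_sampling_moments(X, sigma2):
    """Return (var_b1, var_b2, cov_b1_b2) for the two-variable model.

    X is the list of regressor values and sigma2 the error variance.
    """
    n = len(X)
    x_bar = sum(X) / n
    sum_x_dev_sq = sum((xi - x_bar) ** 2 for xi in X)  # sum of x_i^2 (deviations)
    sum_X_sq = sum(xi ** 2 for xi in X)                # sum of X_i^2 (raw values)
    var_b2 = sigma2 / sum_x_dev_sq
    var_b1 = sigma2 * sum_X_sq / (n * sum_x_dev_sq)
    cov_b1_b2 = -x_bar * var_b2
    return var_b1, var_b2, cov_b1_b2


# Assumed inputs: ten X values from 80 to 260 and sigma^2 = 42.1591
X = list(range(80, 261, 20))
var_b1, var_b2, cov_b1_b2 = ols_sampling_moments(X, 42.1591)
print(var_b1, var_b2, cov_b1_b2)

# Feature 1: doubling the spread of X around the same mean quarters var(b2)
X_wide = [170 + 2 * (xi - 170) for xi in X]
_, var_b2_wide, _ = ols_sampling_moments(X_wide, 42.1591)
print(var_b2_wide / var_b2)
```

Because the mean of X is positive here, the computed covariance is negative, which is the case described in the last bullet.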

Page 43: Econometrics_ch4.ppt

PROPERTIES OF LEAST-SQUARES ESTIMATORS: THE GAUSS–MARKOV THEOREM

• To understand this theorem, we need to consider the best linear unbiasedness property of an estimator. An estimator, say the OLS estimator $\hat\beta_2$, is said to be a best linear unbiased estimator (BLUE) of $\beta_2$ if the following hold:

• 1. It is linear, that is, a linear function of a random variable, such as the dependent variable Y in the regression model.

• 2. It is unbiased, that is, its average or expected value, $E(\hat\beta_2)$, is equal to the true value, $\beta_2$.

• 3. It has minimum variance in the class of all such linear unbiased estimators; an unbiased estimator with the least variance is known as an efficient estimator.

Page 44: Econometrics_ch4.ppt

• What all this means can be explained with the aid of Figure 3.8. In Figure 3.8(a) we have shown the sampling distribution of the OLS estimator $\hat\beta_2$, that is, the distribution of the values taken by $\hat\beta_2$ in repeated sampling experiments. For convenience we have assumed $\hat\beta_2$ to be distributed symmetrically. As the figure shows, the mean of the $\hat\beta_2$ values, $E(\hat\beta_2)$, is equal to the true $\beta_2$. In this situation we say that $\hat\beta_2$ is an unbiased estimator of $\beta_2$. In Figure 3.8(b) we have shown the sampling distribution of $\beta_2^*$, an alternative estimator of $\beta_2$ obtained by using another (i.e., other than OLS) method.

Page 45: Econometrics_ch4.ppt
Page 46: Econometrics_ch4.ppt

• For convenience, assume that $\beta_2^*$, like $\hat\beta_2$, is unbiased, that is, its average or expected value is equal to $\beta_2$. Assume further that both $\hat\beta_2$ and $\beta_2^*$ are linear estimators, that is, they are linear functions of Y. Which estimator, $\hat\beta_2$ or $\beta_2^*$, would you choose? To answer this question, superimpose the two figures, as in Figure 3.8(c). It is obvious that although both $\hat\beta_2$ and $\beta_2^*$ are unbiased, the distribution of $\beta_2^*$ is more diffused or widespread around the mean value than the distribution of $\hat\beta_2$. In other words, the variance of $\beta_2^*$ is larger than the variance of $\hat\beta_2$.

• Now given two estimators that are both linear and unbiased, one would choose the estimator with the smaller variance because it is more likely to be close to $\beta_2$ than the alternative estimator. In short, one would choose the BLUE estimator.
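The comparison between $\hat\beta_2$ and an alternative linear unbiased estimator can be imitated with a small repeated-sampling experiment. The sketch below is illustrative only: the true parameter values, the error standard deviation, the fixed X values and the choice of the alternative estimator (the slope of the line through the first and last observations, which is linear in Y and unbiased) are all assumptions made for the demonstration.

```python
import random
import statistics


def ols_slope(X, Y):
    """OLS slope: sum(x_i * y_i) / sum(x_i^2) in deviation form."""
    x_bar = sum(X) / len(X)
    y_bar = sum(Y) / len(Y)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    sxx = sum((x - x_bar) ** 2 for x in X)
    return sxy / sxx


def endpoint_slope(X, Y):
    """Alternative linear unbiased estimator: slope through first and last points."""
    return (Y[-1] - Y[0]) / (X[-1] - X[0])


random.seed(0)
beta1, beta2, sigma = 25.0, 0.5, 6.0  # hypothetical true values
X = list(range(80, 261, 20))          # regressor values held fixed across samples

ols_draws, alt_draws = [], []
for _ in range(5000):
    # Draw a fresh sample of Y from the assumed PRF with normal errors
    Y = [beta1 + beta2 * x + random.gauss(0, sigma) for x in X]
    ols_draws.append(ols_slope(X, Y))
    alt_draws.append(endpoint_slope(X, Y))

mean_ols, mean_alt = statistics.mean(ols_draws), statistics.mean(alt_draws)
var_ols, var_alt = statistics.variance(ols_draws), statistics.variance(alt_draws)
print("means:    ", mean_ols, mean_alt)
print("variances:", var_ols, var_alt)
```

Both sets of draws centre on the assumed true slope, but the OLS draws are less dispersed, which is the situation pictured when the two sampling distributions are superimposed.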

Page 47: Econometrics_ch4.ppt

THE COEFFICIENT OF DETERMINATION $r^2$: A MEASURE OF “GOODNESS OF FIT”

• We now consider the goodness of fit of the fitted regression line to a set of data; that is, we shall find out how “well” the sample regression line fits the data. The coefficient of determination $r^2$ (two-variable case) or $R^2$ (multiple regression) is a summary measure that tells how well the sample regression line fits the data.

• Consider a heuristic explanation of $r^2$ in terms of a graphical device, known as the Venn diagram, shown in Figure 3.9.

• In this figure the circle Y represents variation in the dependent variable Y and the circle X represents variation in the explanatory variable X. The overlap of the two circles indicates the extent to which the variation in Y is explained by the variation in X.

Page 48: Econometrics_ch4.ppt
Page 49: Econometrics_ch4.ppt

• To compute this $r^2$, we proceed as follows: Recall that

• $Y_i = \hat Y_i + \hat u_i$ (2.6.3)

• or in the deviation form

• $y_i = \hat y_i + \hat u_i$ (3.5.1)

• where use is made of (3.1.13) and (3.1.14). Squaring (3.5.1) on both sides and summing over the sample, we obtain

• $\sum y_i^2 = \sum \hat y_i^2 + \sum \hat u_i^2 + 2 \sum \hat y_i \hat u_i = \sum \hat y_i^2 + \sum \hat u_i^2 = \hat\beta_2^2 \sum x_i^2 + \sum \hat u_i^2$ (3.5.2)

• since $\sum \hat y_i \hat u_i = 0$ and $\hat y_i = \hat\beta_2 x_i$.

Page 50: Econometrics_ch4.ppt

• The various sums of squares appearing in (3.5.2) can be described as follows: $\sum y_i^2 = \sum (Y_i - \bar Y)^2$ = total variation of the actual Y values about their sample mean, which may be called the total sum of squares (TSS).

• $\sum \hat y_i^2 = \sum (\hat Y_i - \bar Y)^2 = \hat\beta_2^2 \sum x_i^2$ = variation of the estimated Y values about their mean ($\bar{\hat Y} = \bar Y$), which appropriately may be called the sum of squares due to regression (or explained by regression), or simply the explained sum of squares (ESS). $\sum \hat u_i^2$ = residual or unexplained variation of the Y values about the regression line, or simply the residual sum of squares (RSS). Thus, (3.5.2) is

• TSS = ESS + RSS (3.5.3)

• and shows that the total variation in the observed Y values about their mean value can be partitioned into two parts, one attributable to the regression line and the other to random forces, because not all actual Y observations lie on the fitted line. Geometrically, we have Figure 3.10.
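The partition in (3.5.3) can be verified on any small data set. Dividing (3.5.3) through by TSS gives 1 = ESS/TSS + RSS/TSS, and the ratio ESS/TSS = 1 − RSS/TSS is the proportion of the total variation explained by the regression. The sketch below uses five made-up observations purely for illustration; the function name and the data are assumptions.

```python
def sums_of_squares(X, Y):
    """Fit the two-variable OLS line and return (TSS, ESS, RSS)."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    sum_x2 = sum((x - x_bar) ** 2 for x in X)
    b2 = sum_xy / sum_x2                      # slope estimate
    b1 = y_bar - b2 * x_bar                   # intercept estimate
    Y_hat = [b1 + b2 * x for x in X]          # fitted values
    tss = sum((y - y_bar) ** 2 for y in Y)                # total variation
    ess = sum((yh - y_bar) ** 2 for yh in Y_hat)          # explained variation
    rss = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))   # residual variation
    return tss, ess, rss


# Made-up illustrative data
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
tss, ess, rss = sums_of_squares(X, Y)
explained_share = ess / tss
print(tss, ess, rss, explained_share)
```

For these numbers the fitted line is $\hat Y_i = 2.2 + 0.6 X_i$, and the explained and residual parts add up to the total variation apart from floating-point rounding.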

Page 51: Econometrics_ch4.ppt
Page 52: Econometrics_ch4.ppt
Page 53: Econometrics_ch4.ppt

• The quantity $r^2$ thus defined is known as the (sample) coefficient of determination and is the most commonly used measure of the goodness of fit of a regression line. Verbally, $r^2$ measures the proportion or percentage of the total variation in Y explained by the regression model.

• Two properties of $r^2$ may be noted:

• 1. It is a nonnegative quantity.

• 2. Its limits are $0 \le r^2 \le 1$. An $r^2$ of 1 means a perfect fit, that is, $\hat Y_i = Y_i$ for each i. On the other hand, an $r^2$ of zero means that there is no linear relationship between the regressand and the regressor (i.e., $\hat\beta_2 = 0$). In this case, as (3.1.9) shows, $\hat Y_i = \hat\beta_1 = \bar Y$, that is, the best prediction of any Y value is simply its mean value. In this situation therefore the regression line will be horizontal, that is, parallel to the X axis.

• Although $r^2$ can be computed directly from its definition given in (3.5.5), it can be obtained more quickly from the following formula:
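A sketch of such a shortcut, under the assumption that it takes the usual two-variable form $r^2 = \hat\beta_2^2 \sum x_i^2 / \sum y_i^2$, which follows from ESS $= \hat\beta_2^2 \sum x_i^2$ and TSS $= \sum y_i^2$, or equivalently $r^2 = (\sum x_i y_i)^2 / (\sum x_i^2 \sum y_i^2)$. The function name and the five data points are made up for illustration.

```python
def r2_shortcuts(X, Y):
    """Return r^2 computed two ways from deviation-form sums."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    sum_x2 = sum((x - x_bar) ** 2 for x in X)
    sum_y2 = sum((y - y_bar) ** 2 for y in Y)
    b2 = sum_xy / sum_x2
    via_slope = b2 ** 2 * sum_x2 / sum_y2          # b2^2 * sum(x^2) / sum(y^2)
    via_cross = sum_xy ** 2 / (sum_x2 * sum_y2)    # (sum xy)^2 / (sum x^2 * sum y^2)
    return via_slope, via_cross


# Made-up illustrative data
via_slope, via_cross = r2_shortcuts([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(via_slope, via_cross)
```

Neither version needs the fitted values or the residuals, which is why it is quicker than working from ESS and TSS directly.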

Page 54: Econometrics_ch4.ppt
Page 55: Econometrics_ch4.ppt
Page 56: Econometrics_ch4.ppt
Page 57: Econometrics_ch4.ppt

• Some of the properties of r are as follows (see Figure 3.11):

• 1. It can be positive or negative.

• 2. It lies between the limits of −1 and +1; that is, $-1 \le r \le 1$.

• 3. It is symmetrical in nature; that is, the coefficient of correlation between X and Y ($r_{XY}$) is the same as that between Y and X ($r_{YX}$).

• 4. It is independent of the origin and scale; that is, if we define $X_i^* = aX_i + c$ and $Y_i^* = bY_i + d$, where a > 0, b > 0, and c and d are constants, then r between X* and Y* is the same as that between the original variables X and Y.

• 5. If X and Y are statistically independent, the correlation coefficient between them is zero; but if r = 0, it does not mean that the two variables are independent.

• 6. It is a measure of linear association or linear dependence only; it has no meaning for describing nonlinear relations.

• 7. Although it is a measure of linear association between two variables, it does not necessarily imply any cause-and-effect relationship.
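Several of these properties are easy to confirm numerically. The sketch below assumes the usual sample formula $r = \sum x_i y_i / \sqrt{\sum x_i^2 \sum y_i^2}$ in deviation form; the helper name `corr`, the data and the constants a, b, c, d are made up for illustration.

```python
import math


def corr(X, Y):
    """Sample correlation coefficient computed from deviations about the means."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    sum_x2 = sum((x - x_bar) ** 2 for x in X)
    sum_y2 = sum((y - y_bar) ** 2 for y in Y)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


# Made-up illustrative data
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

r_xy = corr(X, Y)
r_yx = corr(Y, X)                    # property 3: symmetry

X_star = [3 * x + 10 for x in X]     # a = 3 > 0, c = 10
Y_star = [0.5 * y - 7 for y in Y]    # b = 0.5 > 0, d = -7
r_star = corr(X_star, Y_star)        # property 4: change of origin and scale

# Properties 5 and 6: V depends exactly on U, but not linearly, and r is zero
U = [-2, -1, 0, 1, 2]
V = [u ** 2 for u in U]
r_nonlinear = corr(U, V)

print(r_xy, r_yx, r_star, r_nonlinear)
```

The last case shows why r = 0 does not establish independence: V is an exact function of U, yet their linear correlation vanishes.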

Page 58: Econometrics_ch4.ppt
Page 59: Econometrics_ch4.ppt

• In the regression context, $r^2$ is a more meaningful measure than r, for the former tells us the proportion of variation in the dependent variable explained by the explanatory variable(s) and therefore provides an overall measure of the extent to which the variation in one variable determines the variation in the other. The latter does not have such value. Moreover, as we shall see, the interpretation of r (= R) in a multiple regression model is of dubious value. In passing, note that the $r^2$ defined previously can also be computed as the squared coefficient of correlation between actual $Y_i$ and the estimated $Y_i$, namely, $\hat Y_i$. That is, using (3.5.13), we can write

Page 60: Econometrics_ch4.ppt

• $r^2 = \dfrac{\left[\sum (Y_i - \bar Y)(\hat Y_i - \bar Y)\right]^2}{\sum (Y_i - \bar Y)^2 \sum (\hat Y_i - \bar Y)^2}$ (3.5.14)

• where $Y_i$ = actual Y, $\hat Y_i$ = estimated Y, and $\bar Y = \bar{\hat Y}$ = the mean of Y. For proof, see exercise 3.15. Expression (3.5.14) justifies the description of $r^2$ as a measure of goodness of fit, for it tells how close the estimated Y values are to their actual values.
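The equality between $r^2$ and the squared correlation of actual and fitted values can be checked directly. This is a minimal sketch on made-up data; the helper name `corr` and the five observations are assumptions, not part of the slides.

```python
import math


def corr(A, B):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(A)
    a_bar, b_bar = sum(A) / n, sum(B) / n
    sab = sum((a - a_bar) * (b - b_bar) for a, b in zip(A, B))
    saa = sum((a - a_bar) ** 2 for a in A)
    sbb = sum((b - b_bar) ** 2 for b in B)
    return sab / math.sqrt(saa * sbb)


# Made-up illustrative data
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum((x - x_bar) ** 2 for x in X)
b1 = y_bar - b2 * x_bar
Y_hat = [b1 + b2 * x for x in X]                 # fitted values

tss = sum((y - y_bar) ** 2 for y in Y)
ess = sum((yh - y_bar) ** 2 for yh in Y_hat)
r2_from_ss = ess / tss                           # ESS / TSS
r2_from_corr = corr(Y, Y_hat) ** 2               # squared corr(Y, Y_hat)
print(r2_from_ss, r2_from_corr)
```

In the two-variable model the fitted values are a linear function of X, so this squared correlation also coincides with the squared correlation between Y and X.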

Page 61: Econometrics_ch4.ppt

A NUMERICAL EXAMPLE

Page 62: Econometrics_ch4.ppt

• $\hat\beta_1 = 24.4545$, $\text{var}(\hat\beta_1) = 41.1370$ and $\text{se}(\hat\beta_1) = 6.4138$

• $\hat\beta_2 = 0.5091$, $\text{var}(\hat\beta_2) = 0.0013$ and $\text{se}(\hat\beta_2) = 0.0357$

• $\text{cov}(\hat\beta_1, \hat\beta_2) = -0.2172$, $\hat\sigma^2 = 42.1591$ (3.6.1)

• $r^2 = 0.9621$, $r = 0.9809$, df = 8

• The estimated regression line therefore is

• $\hat Y_i = 24.4545 + 0.5091 X_i$ (3.6.2)

• which is shown geometrically as Figure 3.12.
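The figures in (3.6.1) can be reproduced with a few lines of arithmetic. The sketch below assumes the ten weekly income (X) and consumption (Y) pairs listed in the code; the data table itself is not reproduced in these slides, so these values are an assumption, chosen because they are consistent with the stated X range of $80 to $260, with df = 8, and with the reported estimates.

```python
import math

# Assumed data: weekly income X and weekly consumption expenditure Y, ten families
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
sum_x2 = sum((x - x_bar) ** 2 for x in X)
sum_y2 = sum((y - y_bar) ** 2 for y in Y)

b2 = sum_xy / sum_x2                       # slope estimate
b1 = y_bar - b2 * x_bar                    # intercept estimate
rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
df = n - 2                                 # degrees of freedom
sigma2_hat = rss / df                      # estimate of the error variance

var_b2 = sigma2_hat / sum_x2
var_b1 = sigma2_hat * sum(x ** 2 for x in X) / (n * sum_x2)
se_b1, se_b2 = math.sqrt(var_b1), math.sqrt(var_b2)
cov_b1_b2 = -x_bar * var_b2

r2 = 1 - rss / sum_y2
r = math.copysign(math.sqrt(r2), b2)       # r takes the sign of the slope

print(f"b1 = {b1:.4f}, var = {var_b1:.4f}, se = {se_b1:.4f}")
print(f"b2 = {b2:.4f}, var = {var_b2:.4f}, se = {se_b2:.4f}")
print(f"cov = {cov_b1_b2:.4f}, sigma2_hat = {sigma2_hat:.4f}")
print(f"r2 = {r2:.4f}, r = {r:.4f}, df = {df}")
```

Here $\hat\sigma^2$ is the residual sum of squares divided by the n − 2 degrees of freedom, and it replaces the unknown $\sigma^2$ in the variance and covariance formulas.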

• Following Chapter 2, the SRF [Eq. (3.6.2)] and the associated regression line are interpreted as follows: Each point on the regression line gives an estimate of the expected or mean value of Y corresponding to the chosen X value; that is, $\hat Y_i$ is an estimate of $E(Y \mid X_i)$. The value of $\hat\beta_2 = 0.5091$, which measures the slope of the line, shows that, within the sample range of X between $80 and $260 per week, as X increases, say, by $1, the estimated increase in the mean or average weekly consumption expenditure amounts to about 51 cents. The value of $\hat\beta_1 = 24.4545$, which is the intercept of the line, indicates the average level of weekly consumption expenditure when weekly income is zero.

Page 63: Econometrics_ch4.ppt
Page 64: Econometrics_ch4.ppt
Page 65: Econometrics_ch4.ppt

• However, this is a mechanical interpretation of the intercept term. In regression analysis such literal interpretation of the intercept term may not always be meaningful, although in the present example it can be argued that a family without any income (because of unemployment, layoff, etc.) might maintain some minimum level of consumption expenditure either by borrowing or dissaving. But in general one has to use common sense in interpreting the intercept term, for very often the sample range of X values may not include zero as one of the observed values. Perhaps it is best to interpret the intercept term as the mean or average effect on Y of all the variables omitted from the regression model.

• The value of $r^2$ of 0.9621 means that about 96 percent of the variation in the weekly consumption expenditure is explained by income. Since $r^2$ can at most be 1, the observed $r^2$ suggests that the sample regression line fits the data very well. The coefficient of correlation of 0.9809 shows that the two variables, consumption expenditure and income, are highly positively correlated. The estimated standard errors of the regression coefficients will be interpreted in Chapter 5.

Page 66: Econometrics_ch4.ppt

• See numerical examples 3.1-3.3.