Multicollinearity & Heteroscedasticity



    1/23

Gujarati (2003): Chapters 10 and 11


    2/23

    Multicollinearity

One of the assumptions of the CNLRM requires that there be NO EXACT LINEAR RELATIONSHIP between any of the explanatory variables,

i.e. none of the X variables are perfectly correlated. If they are, we say there is perfect

multicollinearity.


    3/23

The nature of multicollinearity

If multicollinearity is perfect, the regression coefficients of the Xi variables, the β̂i's, are indeterminate and their standard errors, se(β̂i), are infinite.

In general: multicollinearity means there is some linear functional relationship among the independent variables, that is

λ1X1 + λ2X2 + λ3X3 + ... + λkXk = 0 (the λi not all zero),

such as λ1X1 + λ2X2 = 0, i.e. X1 = -(λ2/λ1)X2.
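A minimal numpy sketch of this point (illustrative made-up data, not from the slides): when an exact relationship such as X1 = -2·X2 holds, the matrix X'X in the OLS normal equations is singular, so the β̂i are indeterminate.

```python
import numpy as np

# Hypothetical data with an exact linear relationship between regressors
rng = np.random.default_rng(0)
x2 = rng.normal(size=50)
x1 = -2.0 * x2                     # perfect multicollinearity: X1 + 2*X2 = 0
X = np.column_stack([np.ones(50), x1, x2])

XtX = X.T @ X                      # matrix appearing in the OLS normal equations
print(np.linalg.matrix_rank(XtX))  # 2, not 3: X'X is singular
print(abs(np.linalg.det(XtX)))     # essentially zero
```

Because X'X cannot be inverted, no unique least-squares solution exists; any β̂2, β̂3 pair satisfying β̂2 - 2β̂3 = constant fits equally well.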


    4/23

    The nature of multicollinearity

If all the Xs are uncorrelated, there is ABSENCE OF MULTICOLLINEARITY.

Cases in between are described by various degrees of multicollinearity.

A high degree of multicollinearity occurs when an X is highly correlated with another X or with a linear combination of other Xs.


    5/23

    The nature of multicollinearity

So, we study:

- the nature of multicollinearity
- the consequences of multicollinearity
- detection and remedies


    6/23

    Multicollinearity

NOTE:
1) Multicollinearity is a question of degree, not of kind. It is not a matter of its absence or presence; its degree is what matters.

2) It is a feature of the sample, not of the population, since it arises under the assumption of nonstochastic Xs. We do not test for multicollinearity, but we can measure its degree in a particular sample.


    7/23

    Multicollinearity

If the X variables in a multiple regression model are highly correlated, we say that the degree of multicollinearity is HIGH.

Then the variances of the estimated coefficients are large, t-values are low, and so the estimates are highly unreliable.


    8/23

    Consequences of imperfect multicollinearity

1. Although the estimated coefficients are still BLUE, the OLS estimators have large variances and covariances, making precise estimation difficult.

2. The estimated confidence intervals tend to be much wider, leading us to accept the zero null hypothesis more readily.

3. The t-statistics of the coefficients tend to be statistically insignificant.

4. The R2 can nevertheless be very high.

5. The OLS estimators and their standard errors can be sensitive to small changes in the data.

All of these can be detected from the regression results.


    9/23

Detection of a high degree of multicollinearity: symptoms

Using the feature of imperfect (or high-degree) multicollinearity summarised under item 5, we can delete or add a few observations from the data set and examine the sensitivity of the coefficient estimates and their standard errors to the change, i.e. do the coefficient estimates and their standard errors change substantially?

Because the se(β̂)s are large, t-values are low, leading to imprecise estimates. This is why we may get a highly significant F-test (and a high R2) while the individual t-tests are insignificant.
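These symptoms are easy to reproduce in a short simulation (a sketch with made-up numbers, not from the slides): two almost perfectly collinear regressors yield a high R2, yet the coefficient standard errors are hugely inflated, so the individual t-ratios become uninformative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
x2 = rng.normal(size=n)
x3 = x2 + 0.01 * rng.normal(size=n)          # near-perfect collinearity
y = 1.0 + 2.0 * x2 + 3.0 * x3 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x2, x3])
b = np.linalg.lstsq(X, y, rcond=None)[0]     # OLS coefficients
u = y - X @ b                                # residuals
s2 = u @ u / (n - 3)                         # estimated error variance
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
r2 = 1 - (u @ u) / ((y - y.mean()) ** 2).sum()

print(r2)               # close to 1: the fit as a whole is excellent
print(se[1], se[2])     # very large standard errors on x2 and x3
print(b[1:] / se[1:])   # so the individual t-ratios are unreliable
```

The F-test on the joint fit remains highly significant because x2 and x3 together explain y well; only the attribution between them is unstable.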


    10/23

Difficulty in detecting a high degree of multicollinearity

1) However, even when we have significant t-ratios and a significant F-test, a high degree of multicollinearity is still possible.

2) High variances (and hence standard errors) and covariances of the estimated coefficients may also arise from small variation in the Xs or from a large error variance, not only from multicollinearity.


    11/23

Difficulty in detecting a high degree of multicollinearity

In the three-variable case

Y = β1 + β2X2 + β3X3 + u,

the formulas for the coefficient variances are

Var(β̂2) = σu² / (Σx2²(1 - r23²)),  Var(β̂3) = σu² / (Σx3²(1 - r23²)),

where the x's denote deviations from their means and r23 is the correlation coefficient between X2 and X3.


    12/23

Difficulty in detecting a high degree of multicollinearity

where r23² is obtained from the OLS regression of X2 on X3, as X2 = α1 + α3X3 + v.

It can be seen that a high variance of β̂2 depends on three factors:

1) a high error variance, σu²;
2) small dispersion in the Xs, i.e. small Σxi²;
3) high correlation between X2 and X3, i.e. r23² close to 1.
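The variance formula can be verified numerically (a sketch with simulated regressors and an assumed σu², not from the slides): the expression σu²/(Σx2²(1 - r23²)) reproduces the corresponding diagonal element of σu²(X'X)⁻¹, and the inflation term 1/(1 - r23²) is what is usually called the variance-inflation factor (VIF).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x2 = rng.normal(size=n)
x3 = 0.9 * x2 + 0.4 * rng.normal(size=n)   # correlated with x2
X = np.column_stack([np.ones(n), x2, x3])
sigma2 = 4.0                               # assumed error variance sigma_u^2

# Exact OLS variance-covariance matrix: sigma^2 * (X'X)^-1
V = sigma2 * np.linalg.inv(X.T @ X)

# Textbook formula in deviation form
d2 = x2 - x2.mean()
r23 = np.corrcoef(x2, x3)[0, 1]
var_b2 = sigma2 / ((d2 @ d2) * (1.0 - r23 ** 2))

print(V[1, 1], var_b2)         # the two expressions agree
print(1.0 / (1.0 - r23 ** 2))  # the variance-inflation factor
```

As r23² approaches 1 the denominator collapses and Var(β̂2) explodes, which is exactly the mechanism behind factor 3 above.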


    13/23

    Remedial Measures

    1. Utilise a priori information

    2. Combining cross-sectional and time-series data

3. Dropping a variable (or variables) and re-specifying the regression

    4. Transformation of variables:

    (i) First-difference form

    (ii) Ratio transformation

    5. Additional or new data

    6. Reducing collinearity in polynomial regression

    7. Do nothing (if the objective is only for prediction)

Example (utilising a priori information): in the model

Y = β1 + β2X2 + β3X3 + u,

suppose prior information tells us that β3 = 0.1β2. Substituting,

Y = β1 + β2X2 + 0.1β2X3 + u = β1 + β2(X2 + 0.1X3) + u = β1 + β2Z + u,

where Z = X2 + 0.1X3: the collinear pair (X2, X3) is replaced by the single regressor Z.

Example (ratio transformation): dividing the model through by X3,

Y/X3 = β1(1/X3) + β2(X2/X3) + β3 + u', where u' = u/X3,

which often reduces the collinearity between the levels of X2 and X3.

Example (first-difference form): Yt - Yt-1 = β2(X2t - X2,t-1) + β3(X3t - X3,t-1) + (ut - ut-1).
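A quick numeric sketch of the first-difference remedy (made-up trending series, not from the slides): variables that share a common time trend are almost perfectly correlated in levels, but far less so after first-differencing.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(100.0)
x2 = t + rng.normal(size=100)        # both series trend upward over time
x3 = 2.0 * t + rng.normal(size=100)

r_levels = np.corrcoef(x2, x3)[0, 1]
r_diff = np.corrcoef(np.diff(x2), np.diff(x3))[0, 1]
print(r_levels)   # nearly 1: severe collinearity in levels
print(r_diff)     # far smaller once the common trend is removed
```

The cost of differencing, not shown here, is that it changes the model's interpretation and can induce serial correlation in the transformed errors.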


    14/23

Example: in a regression including both GDP and GNP, each coefficient is individually insignificant even though the regression as a whole is significant, since GDP and GNP are highly related.

Other examples of highly related pairs: CPI and WPI; the CD rate and the T-bill rate; M2 and M3.


    15/23

    Heteroscedasticity

One of the assumptions of the CNLRM was that the errors are homoscedastic (equally spread):

Var(ui) = E(ui²) = σ².


    16/23

Homoscedasticity case

[Figure: the probability density function f(Yi) at different levels of family income, Xi (e.g. X = 80, 90, 100), is identical: Var(ui) = E(ui²) = σ².]


    17/23

[Figure: scatter of yi against xi. Homoscedastic pattern of errors: the scattered points spread out quite equally.]


    18/23

Heteroscedasticity case

[Figure: the variance of Yi increases as family income, Xi, increases; the density f(Yi) spreads out at higher income levels (X11, X12, X13): Var(ui) = E(ui²) = σi².]


    19/23

[Figure: scatter of yt against xt. Heteroscedastic pattern of errors: the scattered points spread out quite unequally.]


    20/23

Definition of heteroscedasticity

Two-variable regression: Y = β1 + β2X + u.

β̂2 = Σxy/Σx² = Σki Yi = Σki(β1 + β2Xi + ui), where ki = xi/Σxi², so that Σki = 0 and ΣkiXi = 1.

Hence β̂2 = β2 + Σki ui and E(β̂2) = β2: unbiased.

If σ1² = σ2² = σ3² = ... = σ², i.e. homoscedasticity:

Var(β̂2) = σ² / Σxi².

If σ1² ≠ σ2² ≠ σ3² ≠ ..., i.e. heteroscedasticity, then Var(ui) = E(ui²) = σi² and

Var(β̂2) = Σxi²σi² / (Σxi²)².
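The heteroscedastic variance formula Var(β̂2) = Σxi²σi²/(Σxi²)² can be checked by Monte Carlo (a sketch assuming σi² proportional to Xi², an assumption of this example rather than of the slides): across many simulated samples the variance of β̂2 matches the formula, while its mean stays at the true β2.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 50, 20000
X = rng.uniform(1.0, 5.0, size=n)
x = X - X.mean()                     # deviations, as in the slide's x_i
sig2 = X ** 2                        # assumed heteroscedastic variances sigma_i^2

b2 = np.empty(reps)
for r in range(reps):
    u = rng.normal(scale=np.sqrt(sig2))     # E(u_i^2) = sigma_i^2
    y = 1.0 + 2.0 * X + u
    b2[r] = (x * y).sum() / (x * x).sum()   # beta2-hat = sum(x*y)/sum(x^2)

var_formula = (x ** 2 * sig2).sum() / ((x ** 2).sum() ** 2)
print(b2.mean())                  # stays near the true beta2 = 2: unbiased
print(b2.var(), var_formula)      # empirical and formula variances agree
```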


    21/23

Consequences of heteroscedasticity (two-variable case, Y = β1 + β2X + u):

1. The OLS estimators are still linear and unbiased.

2. The Var(β̂i) are no longer minimum: not the best, not efficient, hence not BLUE.

3. Var(β̂2) = Σxi²σi²/(Σxi²)² instead of Var(β̂2) = σ²/Σxi².

4. σ̂² = Σûi²/(n - k) is biased: E(σ̂²) ≠ σ², so the SEE based on RSS = Σûi² cannot be trusted to be minimal.

5. The t and F statistics are unreliable.
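Consequences 3 to 5 can be illustrated by simulation (a sketch assuming σi² = Xi⁴, an assumption of this example, not of the slides): the usual homoscedastic formula σ̂²/Σxi² systematically misstates, here understates, the estimator's true sampling variance, which is why the t and F statistics built on it are unreliable.

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 200, 4000
X = rng.uniform(1.0, 5.0, size=n)
x = X - X.mean()
sig2 = X ** 4                              # variance grows strongly with X

# True sampling variance of beta2-hat under heteroscedasticity
true_var = (x ** 2 * sig2).sum() / ((x ** 2).sum() ** 2)

naive = np.empty(reps)
for r in range(reps):
    u = rng.normal(scale=np.sqrt(sig2))
    y = 1.0 + 2.0 * X + u
    b2 = (x * y).sum() / (x * x).sum()
    b1 = y.mean() - b2 * X.mean()
    resid = y - b1 - b2 * X
    s2 = (resid ** 2).sum() / (n - 2)
    naive[r] = s2 / (x * x).sum()          # usual (wrong) formula s^2 / sum(x^2)

print(true_var, naive.mean())   # the naive formula understates the true variance
```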


    22/23

    Detection of heteroscedasticity

1. Graphical method:

Plot the estimated residuals (ûi), or their squares (ûi²), against the predicted dependent variable (Ŷi) or against any independent variable (Xi).

Observe the graph for a systematic pattern, e.g. û² rising with Ŷ: if such a pattern exists, heteroscedasticity is present.


    23/23

    Detection of heteroscedasticity: Graphical method

[Figure: six panels plotting û² against Y or Ŷ. A patternless horizontal band indicates no heteroscedasticity; the other panels, showing systematic patterns (fanning out, fanning in, or curvature), all indicate that heteroscedasticity exists.]
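The visual check can be mimicked numerically (a rough sketch, not one of the formal tests such as Breusch–Pagan): correlate the squared residuals with X; heteroscedastic data show a clear relationship, homoscedastic data do not.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
x = rng.uniform(1.0, 5.0, size=n)

def resid_sq_corr(y):
    """Fit y = b1 + b2*x by OLS, then correlate squared residuals with x."""
    b2 = np.cov(x, y, ddof=0)[0, 1] / np.var(x)
    b1 = y.mean() - b2 * x.mean()
    u2 = (y - b1 - b2 * x) ** 2
    return np.corrcoef(x, u2)[0, 1]

y_hom = 1.0 + 2.0 * x + rng.normal(size=n)        # equal error spread
y_het = 1.0 + 2.0 * x + x * rng.normal(size=n)    # spread grows with x

print(resid_sq_corr(y_hom))   # near 0: no systematic pattern
print(resid_sq_corr(y_het))   # clearly positive: u^2 rises with x
```

This is only a screening device: a near-zero correlation does not rule out nonlinear patterns such as the curved panels in the figure above.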