ch 8 residual analysistopost

7/23/2019 Ch 8 Residual Analysistopost http://slidepdf.com/reader/full/ch-8-residual-analysistopost 1/20  Residual Analysis Chapter 8

Upload: ajay-kumar

Post on 19-Feb-2018


Page 1: Ch 8 Residual Analysistopost


Residual Analysis
Chapter 8

Page 2: Ch 8 Residual Analysistopost


Model Assumptions
Independence (the response variables yi are independent): this is a design issue
Normality (the response variables are normally distributed)
Homoscedasticity (the response variables have the same variance)

Page 3: Ch 8 Residual Analysistopost


Best way to check assumptions: check the assumptions on the random errors
They are independent
They are normally distributed
They have a constant variance σ² for all settings of the independent variables (homoscedasticity)
They have a zero mean
If these assumptions are satisfied, we may use the normal density as the working approximation for the random component. So, the residuals are distributed as:
εi ~ N(0, σ²)

Page 4: Ch 8 Residual Analysistopost


Plotting Residuals
To check for homoscedasticity (constant variance):
Produce a scatterplot of the standardized residuals against the fitted values.
Produce a scatterplot of the standardized residuals against each of the independent variables.
If assumptions are satisfied, residuals should vary randomly around zero and the spread of the residuals should be about the same throughout the plot (no systematic patterns).
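A minimal sketch of the quantities behind such a plot, assuming NumPy and synthetic data invented for illustration. The helper name fit_ols is hypothetical, and dividing each residual by s is a simplified standardization (Minitab's standardized residuals also adjust for leverage). Plotting fitted against std_resid with any plotting tool gives the scatterplot; the spread ratio is only a crude numeric stand-in for looking at the plot.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit; X must already include a column of ones."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    n, p = X.shape
    s = np.sqrt(resid @ resid / (n - p))   # root MSE
    return fitted, resid, s

rng = np.random.default_rng(0)             # synthetic data, illustration only
x = np.linspace(1, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=x.size)
X = np.column_stack([np.ones_like(x), x])

fitted, resid, s = fit_ols(X, y)
std_resid = resid / s                      # simplified: residual divided by s

# Crude check of "same spread throughout the plot": compare the spread of the
# standardized residuals over the lower and upper halves of the fitted values.
order = np.argsort(fitted)
low, high = std_resid[order[:25]], std_resid[order[25:]]
spread_ratio = high.std() / low.std()
print(round(float(spread_ratio), 2))       # near 1 when the variance is constant
```

A ratio far from 1 would correspond to the fan shape described on the next slide.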

Page 5: Ch 8 Residual Analysistopost


Homoscedasticity is probably violated if...
The residuals seem to increase or decrease in average magnitude with the fitted values; this indicates that the variance of the residuals is not constant.
The points in the plot lie on a curve around zero, rather than fluctuating randomly.
A few points in the plot lie a long way from the rest of the points.

Page 6: Ch 8 Residual Analysistopost


Heteroscedasticity
Not fatal to an analysis; the analysis is weakened, not invalidated.
Detected with scatterplots and rectified through transformation.
http://www.pfc.cfs.nrcan.gc.ca/profiles/wulder/mvstats/transform_e.html
http://www.ruf.rice.edu/lane/stat_sim/transformations/index.html
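As a rough illustration of rectifying non-constant variance through transformation, the sketch below assumes NumPy, made-up data with multiplicative errors, and a log transform of y. The log is one common choice and is an assumption here, not something the slides prescribe. The helper spread_ratio is hypothetical.

```python
import numpy as np

def spread_ratio(x, y):
    """Fit y = b0 + b1*x, then return (residual SD over the upper half of the
    fitted values) / (residual SD over the lower half)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    order = np.argsort(fitted)
    half = len(y) // 2
    return float(resid[order[half:]].std() / resid[order[:half]].std())

rng = np.random.default_rng(1)             # synthetic data, illustration only
x = np.linspace(1, 10, 200)
y = np.exp(0.5 + 0.3 * x + rng.normal(0, 0.3, size=x.size))  # spread grows with y

raw_ratio = spread_ratio(x, y)             # residual spread fans out
log_ratio = spread_ratio(x, np.log(y))     # spread is roughly even after the log
print(round(raw_ratio, 2), round(log_ratio, 2))
```

The price is the one noted on the next slide: the model now describes log(y), which is harder to interpret when the original scale is meaningful.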

Page 7: Ch 8 Residual Analysistopost


Cautions with transformations
Difficulty of interpretation of transformed variables.
The scale of the data influences the utility of transformations.
  If a scale is arbitrary, a transformation can be more effective
  If a scale is meaningful, the difficulty of interpretation increases

Page 8: Ch 8 Residual Analysistopost


Normality
The random errors can be regarded as a random sample from a N(0, σ²) distribution, so we can check this assumption by checking whether the residuals might have come from a normal distribution.
We should look at the standardized residuals.
Options for looking at distribution:
  Histogram, stem and leaf plot, normal plot of residuals
http://statmaster.sdu.dk/courses/st111/module04/index.html

Page 9: Ch 8 Residual Analysistopost


Histogram or Stem and Leaf Plot
Does the distribution of residuals approximate a normal distribution?
Regression is robust with respect to nonnormal errors (inferences typically still valid unless the errors come from a highly skewed distribution)

Page 10: Ch 8 Residual Analysistopost


Normal Plot of Residuals
A normal probability plot is found by plotting the ordered residuals of the observed sample against the corresponding quantiles of a standard normal distribution N(0, 1).
If the plot shows a straight line, it is reasonable to assume that the observed sample comes from a normal distribution.
If the points deviate a lot from a straight line, there is evidence against the assumption that the random errors are an independent sample from a normal distribution.
http://www.skymark.com/resources/tools/normal_test_plot.asp
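A small sketch of how the points of a normal plot can be computed, assuming NumPy plus the standard-library NormalDist, with synthetic residuals. The function name normal_plot_points is hypothetical, and the plotting positions (i - 0.375)/(n + 0.25) are one common convention, chosen here as an assumption. The correlation between the two coordinates serves as a rough measure of how straight the plot is.

```python
import numpy as np
from statistics import NormalDist

def normal_plot_points(resid):
    """Return (N(0,1) quantiles, sorted residuals) for a normal plot."""
    r = np.sort(np.asarray(resid, dtype=float))
    n = r.size
    probs = (np.arange(1, n + 1) - 0.375) / (n + 0.25)   # plotting positions
    q = np.array([NormalDist().inv_cdf(p) for p in probs])
    return q, r

rng = np.random.default_rng(2)             # synthetic residuals, illustration only
samples = {
    "normal": rng.normal(0, 1, 100),
    "skewed": rng.exponential(1.0, 100) - 1.0,
}

straightness = {}
for name, resid in samples.items():
    q, r = normal_plot_points(resid)
    straightness[name] = float(np.corrcoef(q, r)[0, 1])
    print(name, round(straightness[name], 3))
```

Residuals from a normal distribution should give a correlation very close to 1; the skewed sample should bend away from the line and score lower.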

Page 11: Ch 8 Residual Analysistopost


σ²: Variance of the random error ε
If this value equals zero
  All random errors equal 0
  Prediction equation (ŷ) will be equal to mean value E(y)
If this value is large
  Large (absolute values) of ε
  Larger deviations between ŷ and the mean value E(y).
The larger the value of σ², the greater the error in estimating the model parameters and the error in predicting a value of y for specific values of x.

Page 12: Ch 8 Residual Analysistopost


Variance is estimated with s² (MSE)
The units of the estimated variance are squared units of the dependent variable y.
For a more meaningful measure of variability, we use s or Root MSE.
The interval ±2s provides a rough estimate of the accuracy with which the model will predict future values of y for given values of x.
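A worked sketch of s², s and the ±2s margin, assuming NumPy and five made-up data points (not from the slides). Here p counts the constant as well as the slope, so the divisor is n - p.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # made-up data, illustration only
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
n, p = X.shape                  # p includes the constant term
sse = float(resid @ resid)      # sum of squared residuals
mse = sse / (n - p)             # s^2, in squared units of y
s = float(np.sqrt(mse))         # Root MSE, in the units of y

print(f"s^2 = {mse:.3f}, s = {s:.3f}, rough margin = +/-{2 * s:.3f}")
# prints: s^2 = 0.800, s = 0.894, rough margin = +/-1.789
```

For these points the fitted line is ŷ = 2.2 + 0.6x and SSE = 2.4, so s² = 2.4/3 = 0.8.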

Page 13: Ch 8 Residual Analysistopost


Why do residual analysis?
Following any modeling procedure, it is a good idea to assess the validity of your model.
Residuals and diagnostic statistics allow you to identify patterns that are either poorly fit by the model, have a strong influence upon the estimated parameters, or have a high leverage.
It is helpful to interpret these diagnostics jointly to understand any potential problems with the model.

Page 14: Ch 8 Residual Analysistopost


Outliers
What:
  An observation with a residual that is larger than 3s, or a standardized residual larger than 3 (in absolute value)
Why:
  A data entry or recording error, skewness of the distribution, chance, unassignable causes
Then what?
  Eliminate? Correct? Analyze them? How much influence do they have?
How do I know?
  Minitab Storage: Diagnostics (Leverages, Cook's Distance, DFFITS)

Page 15: Ch 8 Residual Analysistopost


Leverages (p. 40)
Identifies observations with unusual or outlying x-values.
Leverages fall between 0 and 1. A value greater than 2(p/n) is large enough to suggest you should examine the corresponding observation.
  p: number of predictors (including constant)
  n: number of observations
Minitab identifies observations with leverage over 3(p/n) with an X in the table of unusual observations
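A minimal sketch of the leverage calculation, assuming NumPy and made-up x-values in which the last observation sits far from the rest. Leverages are taken as the diagonal of the hat matrix H = X(X'X)⁻¹X'; the helper name leverages is hypothetical.

```python
import numpy as np

def leverages(X):
    """Diagonal of the hat matrix; X must include the column of ones."""
    Q, _ = np.linalg.qr(X)            # thin QR, so H = Q Q'
    return np.sum(Q**2, axis=1)

# Made-up x-values, illustration only: the last one is an outlying x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0])
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape                        # p includes the constant

h = leverages(X)
flagged = np.where(h > 2 * p / n)[0]  # rule of thumb: leverage above 2(p/n)
print(np.round(h, 3))
print("flagged rows:", flagged)       # only the last row (index 9) exceeds 0.4
```

The leverages always sum to p, which is why p/n is the average value that the rule of thumb scales.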

Page 16: Ch 8 Residual Analysistopost


Cook's Distance (D) (p. 405)
An overall measure of the combined impact of each observation on the fitted values.
Calculated using leverage values and standardized residuals.
Considers whether an observation is unusual with respect to both x- and y-values.
Observations with large D values may be outliers.
Compare D to the F-distribution with (p, n-p) degrees of freedom. Determine the corresponding percentile.
  Less than 20%: little influence
  Greater than 50%: major influence
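A sketch of Cook's Distance built from leverages and standardized residuals, assuming NumPy and made-up data in which the last point has both an outlying x and a y far below the trend of the others. The function name cooks_distance is hypothetical. The F-percentile lookup is left out because it needs an F-distribution routine from a statistics library.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's D for each observation; X must include the column of ones."""
    n, p = X.shape
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)                      # leverages
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    s2 = e @ e / (n - p)                          # MSE
    std_resid = e / np.sqrt(s2 * (1 - h))         # leverage-adjusted residuals
    return std_resid**2 * h / (p * (1 - h))

# Made-up data, illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8, 19.1, 30.0])
X = np.column_stack([np.ones_like(x), x])

D = cooks_distance(X, y)
print(np.round(D, 3))
print("most influential row:", int(np.argmax(D)))
```

Each D value equals the squared distance between the fitted values with and without that observation, scaled by p times s², so it can also be checked by refitting with one row deleted.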

Page 17: Ch 8 Residual Analysistopost


DFFITS (p. 40)
The difference between the predicted value when all observations are included and when the ith observation is deleted.
Combines leverage and Studentized residual (deleted t residuals) into one overall measure of how unusual an observation is.
Represents roughly the number of standard deviations that the fitted value changes when each case is removed from the data set.
Observations with DFFITS values greater than 2 times the square root of (p/n) are considered large and should be examined.
  p is the number of predictors (including the constant)
  n is the number of observations
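A sketch of DFFITS as the deleted t residual scaled by leverage, assuming NumPy and made-up data. The function name dffits is hypothetical. The deleted variance is obtained from the full fit with the standard identity SSE(i) = SSE - e²/(1 - h), so no refitting is needed.

```python
import numpy as np

def dffits(X, y):
    """DFFITS for each observation; X must include the column of ones."""
    n, p = X.shape
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)                      # leverages
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    sse = e @ e
    # Residual variance with observation i deleted, without refitting n times.
    s2_del = (sse - e**2 / (1 - h)) / (n - p - 1)
    t_del = e / np.sqrt(s2_del * (1 - h))         # deleted t residuals
    return t_del * np.sqrt(h / (1 - h))

# Made-up data, illustration only: the last row is unusual in both x and y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8, 19.1, 30.0])
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

values = dffits(X, y)
cutoff = 2 * np.sqrt(p / n)                       # rule of thumb from the text
flagged = np.where(np.abs(values) > cutoff)[0]
print(np.round(values, 3))
print("cutoff:", round(float(cutoff), 3), "flagged rows:", flagged)
```

The comparison uses the absolute value, since DFFITS is negative when deleting the observation raises its fitted value.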

Page 18: Ch 8 Residual Analysistopost


Summary of what to look for
Start with the plot and brush outliers
Look for values that stand out in diagnostic measures
Rules of thumb
  Leverages (HI): values greater than 2(p/n)
  Cook's Distance: values greater than 50% of comparable F(p, n-p) distribution
  DFFITS: values greater than 2√(p/n)

Page 19: Ch 8 Residual Analysistopost


Let's do an example

Returning to the Naval Base data set

Page 20: Ch 8 Residual Analysistopost


Finally, are the residuals correlated?
Durbin-Watson d statistic (p. 415)
Range: 0 ≤ d ≤ 4
Uncorrelated: d is close to 2
Positively correlated: d is closer to zero
Negatively correlated: d is closer to 4.
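A short sketch of the d statistic, assuming NumPy and residuals listed in time order. The function name durbin_watson is hypothetical and the two residual sequences are made up to show the extremes. The statistic is the sum of squared successive differences divided by the sum of squared residuals.

```python
import numpy as np

def durbin_watson(resid):
    """d = sum((e_t - e_{t-1})^2) / sum(e_t^2), residuals in time order."""
    e = np.asarray(resid, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

# Residuals that flip sign every step: negatively correlated, d near 4.
alternating = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
# Residuals that stay on one side for long runs: positively correlated, d near 0.
runs = [1, 1, 1, 1, -1, -1, -1, -1]

print(durbin_watson(alternating))   # 3.6
print(durbin_watson(runs))          # 0.5
```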