Inference for β0 and β1: confidence intervals and hypothesis tests
Inference for β0 and β1
Confidence intervals and hypothesis tests
Example 1: Relation between leg strength and punting distance?
PUNTER   LEG     DIST
   1     170   162.50
   2     140   144.00
   3     180   147.50
   4     160   163.50
   5     170   192.00
   6     150   171.75
   7     170   162.00
   8     110   104.83
   9     120   105.67
  10     130   117.58
  11     120   140.25
  12     140   150.16
  13     160   165.16
•13 punters in American football
•DIST = average length (feet) of 10 punts
•LEG = strength of leg (pounds lifted)
Example 2: Relation between state latitude and skin cancer mortality?
 #   State        LAT   MORT
 1   Alabama     33.0    219
 2   Arizona     34.5    160
 3   Arkansas    35.0    170
 4   California  37.5    182
 5   Colorado    39.0    149
 ...
49   Wyoming     43.0    134
•Mortality rate of white males due to malignant skin melanoma from 1950-1959.
•LAT = degrees (north) latitude of center of state
•MORT = mortality rate due to malignant skin melanoma per 10 million people
Example 3: Relation between use of amphetamines and food consumption?
X = Amphetamine dose (mg/kg)
    0     2.5    5.0
112.6    73.3   38.5
102.1    84.8   81.3
 90.2    67.3   57.1
 81.5    55.3   62.3
105.6    80.7   51.5
 93.0    90.0   48.3
106.6    75.5   42.7
108.3    77.1   57.9
•24 rats randomly allocated to dose of amphetamine (saline (0), 2.5, and 5.0 mg/kg)
•Y = amount of food (grams of food consumed per kilogram of body weight) in following 3-hour period
Point estimates b0 and b1
$$b_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

$$b_0 = \bar{y} - b_1 \bar{x}$$
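As an illustration of the point estimates, here is a minimal Python sketch that computes b1 as the sum of (xi - x̄)(yi - ȳ) divided by the sum of (xi - x̄)², and b0 as ȳ - b1·x̄, for the punter data of Example 1. The function name `least_squares` is just an illustrative choice, not part of the original slides.

```python
# Punter data from Example 1 (LEG = pounds lifted, DIST = average punt length in feet).
leg = [170, 140, 180, 160, 170, 150, 170, 110, 120, 130, 120, 140, 160]
dist = [162.50, 144.00, 147.50, 163.50, 192.00, 171.75, 162.00,
        104.83, 105.67, 117.58, 140.25, 150.16, 165.16]


def least_squares(x, y):
    """Return (b0, b1) for the simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope estimate.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(leg, dist)
print(f"b0 = {b0:.4f}, b1 = {b1:.6f}")
```

The printed values should agree with the fitted line reported by Minitab for Example 1 (punt = 14.9065 + 0.902664 leg).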
The b0 and b1 values vary. They depend on the particular (xi, yi) sample obtained.
Assumptions about error terms εi
• E(εi) = 0
• εi and εj are uncorrelated (for i ≠ j)
• Var(εi) = σ²
• (New!!) εi are normally distributed
NOTE: All results thus far (such as least squares estimates, the Gauss-Markov Theorem, and mean square error) depend only on the first three assumptions. Today's results depend on normality of the error terms.
Sampling distribution of b1
Provided the error terms εi are normally distributed, b1 is normally distributed with mean β1 and variance

$$\frac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
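The sampling distribution of b1 can be illustrated by simulation. The sketch below holds the x values fixed (the punters' LEG values), generates many samples with normal errors, and compares the mean and variance of the resulting b1 values with β1 and σ²/Σ(xi - x̄)². The "true" values `beta0`, `beta1`, and `sigma` are made-up numbers chosen only for this illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Fixed predictor values (the punters' LEG values from Example 1).
x = [170, 140, 180, 160, 170, 150, 170, 110, 120, 130, 120, 140, 160]
# Hypothetical "true" parameters, assumed only for this simulation.
beta0, beta1, sigma = 15.0, 0.9, 16.0

n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)


def slope(y):
    """Least squares slope b1 for responses y at the fixed x values."""
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx


# Draw many samples with normally distributed errors and record b1 each time.
reps = 5000
slopes = []
for _ in range(reps):
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    slopes.append(slope(y))

mean_b1 = sum(slopes) / reps
var_b1 = sum((b - mean_b1) ** 2 for b in slopes) / (reps - 1)
theory_var = sigma ** 2 / sxx

print(f"mean of b1:     {mean_b1:.4f}  (beta1 = {beta1})")
print(f"variance of b1: {var_b1:.5f}  (sigma^2 / Sxx = {theory_var:.5f})")
```

With a few thousand replications, the simulated mean and variance should land close to the theoretical values, though not exactly on them.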
Recall: Confidence interval for μ using x̄

$$\bar{x} \pm t_{\alpha/2,\, n-1} \, \frac{s}{\sqrt{n}}$$
Sample estimate ± margin of error
Confidence interval for β1 using b1

$$b_1 \pm t_{\alpha/2,\, n-2} \, \sqrt{\frac{MSE}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$$
Sample estimate ± margin of error
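A 95% confidence interval for β1 in Example 1 can be sketched as follows. The code computes MSE = SSE/(n - 2) and the standard error sqrt(MSE / Σ(xi - x̄)²) from the punter data. The critical value 2.201 is the usual t-table entry for a two-sided 95% interval with 11 degrees of freedom; it is hard-coded here as an assumption rather than computed.

```python
import math

# Punter data from Example 1.
leg = [170, 140, 180, 160, 170, 150, 170, 110, 120, 130, 120, 140, 160]
dist = [162.50, 144.00, 147.50, 163.50, 192.00, 171.75, 162.00,
        104.83, 105.67, 117.58, 140.25, 150.16, 165.16]

n = len(leg)
x_bar = sum(leg) / n
y_bar = sum(dist) / n
sxx = sum((xi - x_bar) ** 2 for xi in leg)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(leg, dist))

# Point estimates.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Mean square error: residual sum of squares over n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(leg, dist))
mse = sse / (n - 2)

# Standard error of b1 and the interval b1 +/- t * SE(b1).
se_b1 = math.sqrt(mse / sxx)
t_crit = 2.201  # t(0.025, 11) taken from a t table (assumed, not computed)
lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1

print(f"SE(b1) = {se_b1:.4f}")
print(f"95% CI for beta1: ({lower:.3f}, {upper:.3f})")
```

The standard error should match the "SE Coef" value Minitab reports for leg (0.2101). The interval excludes 0, which is consistent with the small P-value for the slope.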
Recall: Hypothesis testing for μ using x̄
The null (H0: μ = μ0) versus the alternative (HA: μ ≠ μ0)

Test statistic:

$$t^* = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

P-value = How likely is it that we'd get a test statistic t* as extreme as we did if the null hypothesis is true?
The P-value is determined by comparing the test statistic t* to a t-distribution with n-1 degrees of freedom.
Hypothesis testing for β1 using b1
The null (H0: β1 = β) versus the alternative (HA: β1 ≠ β)

Test statistic:

$$t^* = \frac{b_1 - \beta}{\sqrt{MSE \Big/ \sum_{i=1}^{n}(x_i - \bar{x})^2}}$$

P-value = How likely is it that we'd get a test statistic t* as extreme as we did if the null hypothesis is true?
The P-value is determined by comparing the test statistic t* to a t-distribution with n-2 degrees of freedom.
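To make the test concrete, the sketch below takes the slope estimate and its standard error from the Minitab output for Example 1 (0.9027 and 0.2101, n = 13) and tests H0: β1 = 0. The two-sided P-value is approximated by integrating the t density with Simpson's rule; the helper names `t_pdf` and `two_sided_p` are illustrative, and a statistics library would normally be used instead.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(t_star, df, steps=2000):
    """Two-sided P-value: 1 minus the t-density area between -|t*| and |t*|.

    The area from 0 to |t*| is approximated with Simpson's rule
    (steps must be even) and doubled by symmetry.
    """
    a = abs(t_star)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area_0_to_a = total * h / 3
    return 1 - 2 * area_0_to_a


# Values read from the Minitab output for Example 1 (leg row); n = 13 punters.
b1, se_b1, n = 0.9027, 0.2101, 13
beta_null = 0.0  # hypothesized slope under H0

t_star = (b1 - beta_null) / se_b1
p_value = two_sided_p(t_star, n - 2)

print(f"t* = {t_star:.2f} on {n - 2} degrees of freedom")
print(f"two-sided P-value = {p_value:.4f}")
```

The test statistic should round to the T value Minitab reports for leg (4.30), with a P-value close to the reported 0.001.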
Sampling distribution of b0
Provided the error terms εi are normally distributed, b0 is normally distributed with mean β0 and variance

$$\sigma^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \right]$$
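Confidence interval for β0 using b0

$$b_0 \pm t_{\alpha/2,\, n-2} \, \sqrt{MSE \left[ \frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \right]}$$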
Sample estimate ± margin of error
Hypothesis testing for β0 using b0
The null (H0: β0 = β) versus the alternative (HA: β0 ≠ β)

Test statistic:

$$t^* = \frac{b_0 - \beta}{\sqrt{MSE \left[ \frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \right]}}$$

P-value = How likely is it that we'd get a test statistic t* as extreme as we did if the null hypothesis is true?
The P-value is determined by comparing the test statistic t* to a t-distribution with n-2 degrees of freedom.
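The interval and test for β0 can be sketched together for the punter data of Example 1. The standard error of b0 is sqrt(MSE · [1/n + x̄² / Σ(xi - x̄)²]). As before, the critical value 2.201 for 11 degrees of freedom is an assumed t-table entry, and the null value 0 is chosen to match what Minitab tests by default.

```python
import math

# Punter data from Example 1.
leg = [170, 140, 180, 160, 170, 150, 170, 110, 120, 130, 120, 140, 160]
dist = [162.50, 144.00, 147.50, 163.50, 192.00, 171.75, 162.00,
        104.83, 105.67, 117.58, 140.25, 150.16, 165.16]

n = len(leg)
x_bar = sum(leg) / n
y_bar = sum(dist) / n
sxx = sum((xi - x_bar) ** 2 for xi in leg)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(leg, dist))

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# MSE on n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(leg, dist))
mse = sse / (n - 2)

# Standard error of the intercept.
se_b0 = math.sqrt(mse * (1 / n + x_bar ** 2 / sxx))

# 95% confidence interval for beta0.
t_crit = 2.201  # t(0.025, 11) taken from a t table (assumed, not computed)
lower = b0 - t_crit * se_b0
upper = b0 + t_crit * se_b0

# Test statistic for H0: beta0 = 0.
t_star = (b0 - 0.0) / se_b0

print(f"SE(b0) = {se_b0:.2f}")
print(f"95% CI for beta0: ({lower:.2f}, {upper:.2f})")
print(f"t* = {t_star:.2f}")
```

The standard error and test statistic should match the Constant row of the Minitab output (31.37 and 0.48). The wide interval, which contains 0, is consistent with the large P-value of 0.644.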
[Regression plot: punt versus leg, with fitted line]
punt = 14.9065 + 0.902664 leg
S = 16.5841   R-Sq = 62.7%   R-Sq(adj) = 59.3%
Example 1: Inference
The regression equation is
punt = 14.9 + 0.903 leg

Predictor     Coef   SE Coef      T      P
Constant     14.91     31.37   0.48  0.644
leg         0.9027    0.2101   4.30  0.001

S = 16.58   R-Sq = 62.7%   R-Sq(adj) = 59.3%

Unusual Observations
Obs   leg     punt      Fit   SE Fit   Residual   St Resid
  3   180   147.50   177.39     8.20     -29.89     -2.07R

R denotes an observation with a large standardized residual.
[Regression plot: Mortality versus Latitude, with fitted line]
Mortality = 389.189 - 5.97764 Latitude
S = 19.1150   R-Sq = 68.0%   R-Sq(adj) = 67.3%
Example 2: Inference
The regression equation is
Mortality = 389 - 5.98 Latitude

Predictor      Coef   SE Coef       T      P
Constant     389.19     23.81   16.34  0.000
Latitude    -5.9776    0.5984   -9.99  0.000

S = 19.12   R-Sq = 68.0%   R-Sq(adj) = 67.3%

Unusual Observations
Obs   Latitude   Mortality      Fit   SE Fit   Residual   St Resid
  7       39.0      200.00   156.06     2.75      43.94      2.32R
  9       28.0      197.00   221.82     7.42     -24.82     -1.41 X
 30       35.0      141.00   179.97     3.85     -38.97     -2.08R

R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
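Because only a few rows of the Example 2 data are listed, the inference for the latitude slope can be sketched from the Minitab summary values alone: T is Coef divided by SE Coef, and a 95% interval is Coef ± t × SE Coef with n - 2 = 47 degrees of freedom. The critical value 2.012 is an approximate t-table value and is an assumption of this sketch.

```python
# Values read from the Minitab output for Example 2 (Latitude row); n = 49 states.
b1, se_b1, n = -5.9776, 0.5984, 49
df = n - 2

# Test statistic for H0: beta1 = 0 is the estimate over its standard error.
t_star = b1 / se_b1

# 95% confidence interval for beta1.
t_crit = 2.012  # approximate t(0.025, 47) from a t table (assumed value)
margin = t_crit * se_b1
lower, upper = b1 - margin, b1 + margin

print(f"t* = {t_star:.2f} on {df} degrees of freedom")
print(f"95% CI for beta1: ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the interval lies entirely below 0, in line with the reported P-value of 0.000: mortality is estimated to fall by roughly 4.8 to 7.2 deaths per 10 million for each additional degree of latitude.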
[Regression plot: consumption versus dose, with fitted line]
consumption = 99.3313 - 9.0075 dose
S = 11.4015   R-Sq = 73.9%   R-Sq(adj) = 72.8%
Example 3: Inference
The regression equation is
consumption = 99.3 - 9.01 dose

Predictor     Coef   SE Coef       T      P
Constant    99.331     3.680   26.99  0.000
dose        -9.008     1.140   -7.90  0.000

S = 11.40   R-Sq = 73.9%   R-Sq(adj) = 72.8%

Unusual Observations
Obs   dose   consumpt     Fit   SE Fit   Residual   St Resid
 18   5.00      81.30   54.29     3.68      27.01      2.50R

R denotes an observation with a large standardized residual.
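As a cross-check on this output, the whole coefficient table can be recomputed from the Example 3 data. The sketch below assumes the 24 consumption values are read with one column per dose (0, 2.5, and 5.0 mg/kg, eight rats each); the variable names are illustrative.

```python
import math

# Example 3 data: food consumption (g per kg body weight) at each dose (mg/kg).
consumption_by_dose = {
    0.0: [112.6, 102.1, 90.2, 81.5, 105.6, 93.0, 106.6, 108.3],
    2.5: [73.3, 84.8, 67.3, 55.3, 80.7, 90.0, 75.5, 77.1],
    5.0: [38.5, 81.3, 57.1, 62.3, 51.5, 48.3, 42.7, 57.9],
}
# Stack the columns into paired (dose, consumption) observations.
x = [dose for dose, ys in consumption_by_dose.items() for _ in ys]
y = [value for ys in consumption_by_dose.values() for value in ys]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Point estimates.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# MSE and S (the square root of MSE, as Minitab reports it).
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)
s = math.sqrt(mse)

# Standard errors and t statistics for H0: coefficient = 0.
se_b1 = math.sqrt(mse / sxx)
se_b0 = math.sqrt(mse * (1 / n + x_bar ** 2 / sxx))
t_b1 = b1 / se_b1
t_b0 = b0 / se_b0

print(f"Constant  {b0:8.3f}  {se_b0:6.3f}  {t_b0:6.2f}")
print(f"dose      {b1:8.3f}  {se_b1:6.3f}  {t_b1:6.2f}")
print(f"S = {s:.2f}")
```

The three printed columns correspond to Coef, SE Coef, and T in the Minitab table; both t statistics are compared to a t distribution with n - 2 = 22 degrees of freedom.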
Inference for β0 and β1 in Minitab
• Select Stat.
• Select Regression.
• Select Regression …
• Specify Response (y) and Predictor (x).
• Click on OK.