Inference for β0 and β1: confidence intervals and hypothesis tests

Inference for β0 and β1

Confidence intervals and hypothesis tests

Example 1: Relation between leg strength and punting distance?

PUNTER  LEG  DIST
1       170  162.50
2       140  144.00
3       180  147.50
4       160  163.50
5       170  192.00
6       150  171.75
7       170  162.00
8       110  104.83
9       120  105.67
10      130  117.58
11      120  140.25
12      140  150.16
13      160  165.16

• 13 punters in American football

• DIST = average length (feet) of 10 punts

• LEG = strength of leg (pounds lifted)

Example 2: Relation between state latitude and skin cancer mortality?

#   State       LAT   MORT
1   Alabama     33.0  219
2   Arizona     34.5  160
3   Arkansas    35.0  170
4   California  37.5  182
5   Colorado    39.0  149
...
49  Wyoming     43.0  134

• Mortality rate of white males due to malignant skin melanoma from 1950-1959.

• LAT = degrees (north) latitude of center of state

• MORT = mortality rate due to malignant skin melanoma per 10 million people

Example 3: Relation between use of amphetamines and food consumption?

X = Amphetamine dose (mg/kg)

0      2.5   5.0
112.6  73.3  38.5
102.1  84.8  81.3
90.2   67.3  57.1
81.5   55.3  62.3
105.6  80.7  51.5
93.0   90.0  48.3
106.6  75.5  42.7
108.3  77.1  57.9

• 24 rats randomly allocated to dose of amphetamine (saline (0), 2.5, and 5.0 mg/kg)

• Y = amount of food (grams of food consumed per kilogram of body weight) in following 3-hour period

Point estimates b0 and b1

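$$ b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad b_0 = \bar{y} - b_1 \bar{x} $$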

The b0 and b1 values vary from sample to sample. They depend on the particular (xi, yi) sample obtained.
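As a concrete illustration of the point estimates, the following minimal sketch applies the least squares formulas to the Example 1 punting data. The function name `least_squares` and the variable names are illustrative choices, not part of the original slides. The result should agree, up to rounding, with the fitted line the deck reports for this example (punt = 14.9065 + 0.902664 leg).

```python
# Example 1 data: leg strength (pounds lifted) and average punt distance (feet).
leg = [170, 140, 180, 160, 170, 150, 170, 110, 120, 130, 120, 140, 160]
dist = [162.50, 144.00, 147.50, 163.50, 192.00, 171.75, 162.00,
        104.83, 105.67, 117.58, 140.25, 150.16, 165.16]


def least_squares(x, y):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(leg, dist)
print(f"b0 = {b0:.4f}, b1 = {b1:.6f}")
```

A different sample of 13 punters would give different values of b0 and b1, which is why their sampling distributions matter.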

Assumptions about error terms εi

• E(εi) = 0

• εi and εj are uncorrelated

• Var(εi) = σ²

• (New!!) εi are normally distributed

NOTE: All results thus far (such as least squares estimates, the Gauss-Markov Theorem, and mean square error) depend only on the first three assumptions. Today’s results depend on normality of the error terms.

Sampling distribution of b1

Provided the error terms εi are normally distributed, b1 is normally distributed with mean β1 and variance

$$ \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} $$
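One way to see this sampling distribution is by simulation. The sketch below holds the x values fixed (the leg strengths from Example 1), repeatedly generates responses with normal errors, and compares the mean and variance of the simulated slopes with the theoretical values. The "true" parameter values `beta0`, `beta1`, and `sigma` are hypothetical and were chosen only for this illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Fixed x values: the leg strengths from Example 1.
x = [170, 140, 180, 160, 170, 150, 170, 110, 120, 130, 120, 140, 160]
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Hypothetical "true" parameters, assumed only for this illustration.
beta0, beta1, sigma = 15.0, 0.9, 16.0


def simulate_b1():
    """Draw one sample with normal errors and return its least squares slope."""
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx


slopes = [simulate_b1() for _ in range(5000)]
mean_b1 = statistics.mean(slopes)
var_b1 = statistics.variance(slopes)
theory_var = sigma ** 2 / sxx  # sigma^2 / sum((x_i - x_bar)^2)

print(f"mean of b1:     {mean_b1:.4f}  (theory: {beta1})")
print(f"variance of b1: {var_b1:.5f}  (theory: {theory_var:.5f})")
```

The simulated mean and variance should land close to the theoretical values, with the small remaining gap due to simulation noise.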

Recall: Confidence interval for μ using x̄

$$ \bar{x} \pm t_{(\alpha/2,\, n-1)} \frac{s}{\sqrt{n}} $$

Sample estimate ± margin of error

Confidence interval for β1 using b1

$$ b_1 \pm t_{(\alpha/2,\, n-2)} \sqrt{\frac{MSE}{\sum_{i=1}^{n} (x_i - \bar{x})^2}} $$
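As a rough worked example, the interval can be computed from the Minitab output for Example 1, where "SE Coef" for leg is the square-root term in the formula. The t multiplier below is an assumption taken from a standard t table (95% confidence, 11 degrees of freedom); it does not appear in the slides.

```python
# Values read from the Minitab output for Example 1 (punting data).
b1 = 0.9027      # Coef for leg
se_b1 = 0.2101   # SE Coef for leg, i.e. sqrt(MSE / sum((x_i - x_bar)^2))
n = 13           # number of punters
df = n - 2       # degrees of freedom for the t multiplier

# Assumption: t(0.025, 11) from a standard t table, for 95% confidence.
t_crit = 2.201

margin = t_crit * se_b1
lower, upper = b1 - margin, b1 + margin
print(f"95% CI for beta1 (df = {df}): ({lower:.3f}, {upper:.3f})")
# prints: 95% CI for beta1 (df = 11): (0.440, 1.365)
```

The interval excludes 0, which is consistent with the small P-value reported for leg in the same output.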


Recall: Hypothesis testing for μ using x̄

The null (H0: μ = μ0) versus the alternative (HA: μ ≠ μ0)

Test statistic

$$ t^* = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$

P-value = How likely is it that we’d get a test statistic t* as extreme as we did if the null hypothesis is true?

The P-value is determined by comparing the test statistic t* to a t-distribution with n-1 degrees of freedom.

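Hypothesis testing for β1 using b1

The null (H0: β1 = β) versus the alternative (HA: β1 ≠ β)

Test statistic

$$ t^* = \frac{b_1 - \beta}{\sqrt{MSE \Big/ \sum_{i=1}^{n} (x_i - \bar{x})^2}} $$

The P-value is determined by comparing the test statistic t* to a t-distribution with n-2 degrees of freedom.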
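The sketch below works through this test for the Example 3 amphetamine data, with a hypothesized slope of 0. It computes b1, MSE, and t* directly from the 24 observations. The critical value 2.074 is an assumption taken from a standard t table (two-sided 5% level, 22 degrees of freedom), and the variable names are illustrative.

```python
import math

# Example 3 data: food consumption (g/kg) for 8 rats at each dose (mg/kg).
consumption = {
    0.0: [112.6, 102.1, 90.2, 81.5, 105.6, 93.0, 106.6, 108.3],
    2.5: [73.3, 84.8, 67.3, 55.3, 80.7, 90.0, 75.5, 77.1],
    5.0: [38.5, 81.3, 57.1, 62.3, 51.5, 48.3, 42.7, 57.9],
}
x = [dose for dose, ys in consumption.items() for _ in ys]
y = [yi for ys in consumption.values() for yi in ys]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# MSE = SSE / (n - 2), where SSE is the sum of squared residuals.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)

beta_null = 0.0  # hypothesized slope under H0
t_star = (b1 - beta_null) / math.sqrt(mse / sxx)

t_crit = 2.074  # assumption: t(0.025, 22) from a standard t table
print(f"b1 = {b1:.4f}, t* = {t_star:.2f}")
print(f"reject H0 at the 5% level: {abs(t_star) > t_crit}")
```

The computed t* should match the T value of -7.90 that Minitab reports for dose in Example 3.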



Sampling distribution of b0

Provided the error terms εi are normally distributed, b0 is normally distributed with mean β0 and variance

$$ \sigma^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right] $$

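Confidence interval for β0 using b0

$$ b_0 \pm t_{(\alpha/2,\, n-2)} \sqrt{MSE \left[ \frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right]} $$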
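To show where each term of this interval comes from, here is a minimal sketch that computes the standard error of b0 from the Example 1 punting data and forms a 95% interval. The t multiplier 2.201 is an assumption taken from a standard t table (11 degrees of freedom). The standard error should agree with the "SE Coef" of 31.37 that Minitab reports for the constant.

```python
import math

# Example 1 data: leg strength (pounds lifted) and average punt distance (feet).
leg = [170, 140, 180, 160, 170, 150, 170, 110, 120, 130, 120, 140, 160]
dist = [162.50, 144.00, 147.50, 163.50, 192.00, 171.75, 162.00,
        104.83, 105.67, 117.58, 140.25, 150.16, 165.16]

n = len(leg)
x_bar, y_bar = sum(leg) / n, sum(dist) / n
sxx = sum((xi - x_bar) ** 2 for xi in leg)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(leg, dist)) / sxx
b0 = y_bar - b1 * x_bar

# MSE = SSE / (n - 2), where SSE is the sum of squared residuals.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(leg, dist))
mse = sse / (n - 2)

# Standard error of b0: sqrt(MSE * [1/n + x_bar^2 / sum((x_i - x_bar)^2)]).
se_b0 = math.sqrt(mse * (1 / n + x_bar ** 2 / sxx))

t_crit = 2.201  # assumption: t(0.025, 11) from a standard t table
lower, upper = b0 - t_crit * se_b0, b0 + t_crit * se_b0
print(f"b0 = {b0:.2f}, SE(b0) = {se_b0:.2f}")
print(f"95% CI for beta0: ({lower:.2f}, {upper:.2f})")
```

The interval is wide and contains 0, largely because the mean leg strength is far from 0, which makes the second term inside the brackets large.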


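Hypothesis testing for β0 using b0

The null (H0: β0 = β) versus the alternative (HA: β0 ≠ β)

Test statistic

$$ t^* = \frac{b_0 - \beta}{\sqrt{MSE \left[ \frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right]}} $$

The P-value is determined by comparing the test statistic t* to a t-distribution with n-2 degrees of freedom.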
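The P-value step can be sketched without a statistics package by integrating the t density numerically. The code below does this with Simpson's rule and applies it to the intercept in Example 1, using the values from the Minitab output and a hypothesized intercept of 0. The helper names `t_pdf` and `two_sided_p` are illustrative, and the numerical integration is a stand-in for whatever method Minitab uses internally.

```python
import math


def t_pdf(u, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + u * u / df) ** (-(df + 1) / 2)


def two_sided_p(t_star, df, steps=2000):
    """Two-sided P-value: 1 minus the area between -|t*| and |t*|.

    The area from 0 to |t*| is found with Simpson's rule (steps must be even)
    and doubled, using the symmetry of the t density.
    """
    a = abs(t_star)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area_0_to_a = total * h / 3
    return 1 - 2 * area_0_to_a


# Values read from the Minitab output for Example 1 (punting data).
b0, se_b0, n = 14.91, 31.37, 13
beta_null = 0.0  # hypothesized intercept under H0

t_star = (b0 - beta_null) / se_b0
p_value = two_sided_p(t_star, df=n - 2)
print(f"t* = {t_star:.2f}, P-value = {p_value:.3f}")
```

The result should be close to the T = 0.48 and P = 0.644 shown in the output, so there is no evidence against a zero intercept for these data.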



[Figure: plot of punt (vertical axis) versus leg (horizontal axis)]

[Figure: Regression Plot of punt versus leg. Fitted line: punt = 14.9065 + 0.902664 leg. S = 16.5841, R-Sq = 62.7%, R-Sq(adj) = 59.3%]

Example 1: Inference

The regression equation is
punt = 14.9 + 0.903 leg

Predictor  Coef    SE Coef  T     P
Constant   14.91   31.37    0.48  0.644
leg        0.9027  0.2101   4.30  0.001

S = 16.58   R-Sq = 62.7%   R-Sq(adj) = 59.3%

Unusual Observations
Obs  leg  punt    Fit     SE Fit  Residual  St Resid
3    180  147.50  177.39  8.20    -29.89    -2.07R

R denotes an observation with a large standardized residual
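The Fit and Residual columns for the unusual observation can be checked by hand from the fitted equation. This small sketch uses the unrounded coefficients that the deck reports for the regression plot (14.9065 and 0.902664) and the data for punter 3.

```python
# Fitted line reported for Example 1: punt = 14.9065 + 0.902664 leg.
b0, b1 = 14.9065, 0.902664

# Observation 3 from the data table: leg = 180, punt = 147.50.
leg_obs, punt_obs = 180, 147.50

fit = b0 + b1 * leg_obs      # predicted punt distance at leg = 180
residual = punt_obs - fit    # observed minus predicted
print(f"Fit = {fit:.2f}, Residual = {residual:.2f}")
# prints: Fit = 177.39, Residual = -29.89
```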

[Figure: plot of Mortality (vertical axis) versus Latitude (horizontal axis)]

[Figure: Regression Plot of Mortality versus Latitude. Fitted line: Mortality = 389.189 - 5.97764 Latitude. S = 19.1150, R-Sq = 68.0%, R-Sq(adj) = 67.3%]

Example 2: Inference

The regression equation is
Mortality = 389 - 5.98 Latitude

Predictor  Coef     SE Coef  T      P
Constant   389.19   23.81    16.34  0.000
Latitude   -5.9776  0.5984   -9.99  0.000

S = 19.12   R-Sq = 68.0%   R-Sq(adj) = 67.3%

Unusual Observations
Obs  Latitude  Mortality  Fit     SE Fit  Residual  St Resid
7    39.0      200.00     156.06  2.75    43.94     2.32R
9    28.0      197.00     221.82  7.42    -24.82    -1.41 X
30   35.0      141.00     179.97  3.85    -38.97    -2.08R

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.
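When reading output like this, it helps to remember that each T value is the test statistic for a null value of 0, that is, Coef divided by SE Coef. The sketch below recomputes T for both rows of the Example 2 table. Because the inputs are the rounded values printed by Minitab, the recomputed T for the constant can differ from the reported one in the last digit.

```python
# Rows of the Example 2 Minitab table: (predictor, Coef, SE Coef, reported T).
rows = [
    ("Constant", 389.19, 23.81, 16.34),
    ("Latitude", -5.9776, 0.5984, -9.99),
]

t_values = {}
for name, coef, se, reported_t in rows:
    # Test statistic for H0: coefficient = 0 is (Coef - 0) / SE Coef.
    t_values[name] = coef / se
    print(f"{name}: T = {t_values[name]:.2f} (Minitab reports {reported_t})")
```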

[Figure: plot of consumption (vertical axis) versus dose (horizontal axis)]

[Figure: Regression Plot of consumption versus dose. Fitted line: consumption = 99.3313 - 9.0075 dose. S = 11.4015, R-Sq = 73.9%, R-Sq(adj) = 72.8%]

Example 3: Inference

The regression equation is
consumption = 99.3 - 9.01 dose

Predictor  Coef    SE Coef  T      P
Constant   99.331  3.680    26.99  0.000
dose       -9.008  1.140    -7.90  0.000

S = 11.40   R-Sq = 73.9%   R-Sq(adj) = 72.8%

Unusual Observations
Obs  dose  consumpt  Fit    SE Fit  Residual  St Resid
18   5.00  81.30     54.29  3.68    27.01     2.50R

R denotes an observation with a large standardized residual.

Inference for β0 and β1 in Minitab

• Select Stat.

• Select Regression.

• Select Regression …

• Specify Response (y) and Predictor (x).

• Click on OK.
