lecture anova
-
7/24/2019 Lecture ANOVA
DESIGN OF EXPERIMENTS
ANALYSIS OF VARIANCE
-
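Normal Distribution
Gaussian Equation: $f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right)$
$P(a \le y \le b) = \int_a^b f(y)\,dy$
$\int_{-\infty}^{\infty} f(y)\,dy = 1$
Measurement: $y_i = \mu + e_i$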
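The probability integral above has no elementary closed form, but it can be evaluated through the error function. The following is a minimal sketch using only the Python standard library; the function names and the example values (mean 10, standard deviation 2) are illustrative assumptions, not part of the lecture.

```python
import math

def normal_cdf(y, mu=0.0, sigma=1.0):
    """P(Y <= y) for a normal distribution with mean mu and std. dev. sigma."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

def normal_prob(a, b, mu=0.0, sigma=1.0):
    """P(a <= Y <= b), i.e. the integral of f(y) from a to b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# Hypothetical example: probability of falling within one sigma of the mean.
p = normal_prob(8.0, 12.0, mu=10.0, sigma=2.0)
print(round(p, 4))
```

The printed value is the familiar "about 68% within one standard deviation" figure.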
-
Estimation of Parameters
Measurement: $y_i = \mu + e_i$
The average (mean): $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ estimates the population mean $\mu$
The Sample Variance (Standard Deviation)$^2$: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$ estimates the variance $\sigma^2$
-
The Standard Normal Distribution
One way to simplify calculating probabilities is to use a Normal Deviate (z)
Subtract $\mu$ from all y observations and the new observations will have a mean of 0
Divide these new observations by $\sigma$ and the new standard deviation becomes 1.
$z = \frac{y - \mu}{\sigma}$
Result: f(z) = a standard normal distribution
$f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)$
-
Areas under the standard normal distribution curve
Area $= \int_{-\infty}^{Y} f(z)\,dz$, with $\mu_z = 0$ and $\sigma_z = 1$
Y      Area    Y      Area    Y      Area    Y      Area    Y     Area    Y     Area    Y     Area    Y     Area
-4.00  0.0000  -3.00  0.0013  -2.00  0.0228  -1.00  0.1587  0.00  0.5000  1.00  0.8413  2.00  0.9772  3.00  0.9987
-3.99  0.0000  -2.99  0.0014  -1.99  0.0233  -0.99  0.1611  0.01  0.5040  1.01  0.8438  2.01  0.9778  3.01  0.9987
-3.98  0.0000  -2.98  0.0014  -1.98  0.0239  -0.98  0.1635  0.02  0.5080  1.02  0.8461  2.02  0.9783  3.02  0.9987
-3.97  0.0000  -2.97  0.0015  -1.97  0.0244  -0.97  0.1660  0.03  0.5120  1.03  0.8485  2.03  0.9788  3.03  0.9988
-3.96  0.0000  -2.96  0.0015  -1.96  0.0250  -0.96  0.1685  0.04  0.5160  1.04  0.8508  2.04  0.9793  3.04  0.9988
-3.95  0.0000  -2.95  0.0016  -1.95  0.0256  -0.95  0.1711  0.05  0.5199  1.05  0.8531  2.05  0.9798  3.05  0.9989
-3.94  0.0000  -2.94  0.0016  -1.94  0.0262  -0.94  0.1736  0.06  0.5239  1.06  0.8554  2.06  0.9803  3.06  0.9989
-3.93  0.0000  -2.93  0.0017  -1.93  0.0268  -0.93  0.1762  0.07  0.5279  1.07  0.8577  2.07  0.9808  3.07  0.9989
-3.92  0.0000  -2.92  0.0018  -1.92  0.0274  -0.92  0.1788  0.08  0.5319  1.08  0.8599  2.08  0.9812  3.08  0.9990
-3.91  0.0000  -2.91  0.0018  -1.91  0.0281  -0.91  0.1814  0.09  0.5359  1.09  0.8621  2.09  0.9817  3.09  0.9990
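Standardizing an observation and looking up the area can also be done numerically. The sketch below assumes a hypothetical measurement (y = 12 from a population with mean 10 and standard deviation 2); the function names are illustrative and only the Python standard library is used.

```python
import math

def z_score(y, mu, sigma):
    """Normal deviate: subtract the mean, divide by the standard deviation."""
    return (y - mu) / sigma

def std_normal_pdf(z):
    """Density f(z) of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_area(z):
    """Area under f(z) from minus infinity up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical observation, chosen so that z lands on a tabulated value.
z = z_score(12.0, mu=10.0, sigma=2.0)
area = std_normal_area(z)
print(z, round(area, 4))
```

For this input z is 1.0, and the computed area can be compared with the entry for Y = 1.00 in the table of areas.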
-
Portland Cement Formulation
Model: $y_{ij} = \mu_i + \varepsilon_{ij}$ (i = 1, 2; j = 1, ..., 10)
Observation (Sample), j    Formula 1, $y_{1,j}$    Formula 2, $y_{2,j}$
1                          16.85                   17.50
2                          16.40                   17.63
3                          17.21                   18.25
4                          16.35                   18.00
5                          16.52                   17.86
6                          17.04                   17.75
7                          16.96                   18.22
8                          17.15                   17.90
9                          16.59                   17.96
10                         16.57                   18.15
Average:                   16.76                   17.92
Summary Statistics
Formulation 1: $\bar{y}_1 = 16.76$, $S_1^2 = 0.100$, $S_1 = 0.316$, $n_1 = 10$
Formulation 2: $\bar{y}_2 = 17.92$, $S_2^2 = 0.061$, $S_2 = 0.247$, $n_2 = 10$
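The summary statistics can be reproduced directly from the tabulated observations. This is a small sketch using the estimators for the mean and sample variance given earlier; the variable and function names are illustrative choices.

```python
import statistics

formula_1 = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
formula_2 = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]

def summarize(sample):
    """Return n, mean, sample variance (n - 1 divisor) and standard deviation."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((y - mean) ** 2 for y in sample) / (n - 1)
    return {"n": n, "mean": mean, "var": var, "std": var ** 0.5}

stats_1 = summarize(formula_1)
stats_2 = summarize(formula_2)

for name, s in (("Formulation 1", stats_1), ("Formulation 2", stats_2)):
    print(name, s["n"], round(s["mean"], 2), round(s["var"], 3), round(s["std"], 3))

# Cross-check against the standard library implementation.
assert abs(stats_1["var"] - statistics.variance(formula_1)) < 1e-12
```

Small differences in the last digit relative to the slide values can arise from rounding.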
-
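Cement Formulation Data
[Figure: distributions of the two formulations. x-axis: Observation (14 to 21); y-axis: Frequency. Formulation 1: $\bar{y}_1 = 16.76$, $S_1^2 = 0.100$, $S_1 = 0.316$, $n_1 = 10$. Formulation 2: $\bar{y}_2 = 17.92$, $S_2^2 = 0.061$, $S_2 = 0.247$, $n_2 = 10$.]
Do these formulations differ?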
-
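The Two-Sample t-Test
$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$
$t_0$ follows a t-distribution with $n_1 + n_2 - 2$ degrees of freedom
[Figure: t-distribution with k degrees of freedom]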
-
The Two-Sample t-Test
Cement Formulation: $t_0 = -9.13$
If $\mu_1 = \mu_2$, a value of $|t_0| > 2.101$ is considered to be a rare event (2.5% in each tail of the t distribution with 18 DOF).
The P-value ($3.68 \times 10^{-8}$) is the risk of wrongly rejecting the null hypothesis of equal means
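The pooled two-sample t statistic can be computed from the cement observations as a check. This is a sketch using only the standard library; the function name is an illustrative choice, and the critical value 2.101 is taken from the slide rather than computed. With unrounded data the statistic comes out near -9.11; the slide's -9.13 is consistent with using the rounded summary statistics instead.

```python
import math

formula_1 = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
formula_2 = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]

def pooled_t(sample_1, sample_2):
    """Two-sample t statistic with pooled variance; returns (t0, degrees of freedom)."""
    n1, n2 = len(sample_1), len(sample_2)
    m1, m2 = sum(sample_1) / n1, sum(sample_2) / n2
    s1_sq = sum((y - m1) ** 2 for y in sample_1) / (n1 - 1)
    s2_sq = sum((y - m2) ** 2 for y in sample_2) / (n2 - 1)
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    t0 = (m1 - m2) / (math.sqrt(sp_sq) * math.sqrt(1.0 / n1 + 1.0 / n2))
    return t0, n1 + n2 - 2

t0, dof = pooled_t(formula_1, formula_2)
T_CRIT = 2.101  # two-sided 5% critical value for 18 DOF, as quoted on the slide
print(round(t0, 2), dof, abs(t0) > T_CRIT)
```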
-
What If There Are More Than Two Factor Levels?
The t-test does not directly apply
There are lots of practical situations where there are either more than two levels of interest, or there are several factors of simultaneous interest
The analysis of variance (ANOVA) is the appropriate analysis engine for these types of experiments
Used extensively today for industrial experiments
-
Cotton Fiber Example
New synthetic fiber to make cloth for shirts
Response variable: tensile strength
Cotton content varies between 15 and 35%
Each experiment is replicated 5 times
What is the best weight % of cotton to use?
-
Does changing the cotton weight percent change the mean tensile strength?
Is there an optimum level for cotton content?
-
In general, there will be a levels of the factor, or a treatments, and n replicates of the experiment, run in random order, i.e. a completely randomized design
N = an total runs
Objective is to test for differences between the a means
-
Models for the Data
Consider the normal (means) model:
$y_{ij} = \mu_i + \varepsilon_{ij}$ ($\mu_i$: mean for the i:th treatment)
If we define: $\mu_i = \mu + \tau_i$
We will get an effects model: $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$
where $\tau_i$ is the i:th treatment effect
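The relation between the means model and the effects model can be illustrated with the two cement formulation averages. The sketch below assumes the common convention that the overall mean is the average of the treatment means, so that the effects sum to zero; that constraint is an assumption here and is not stated on the slide.

```python
# Treatment averages taken from the cement formulation slide.
treatment_means = {"Formula 1": 16.76, "Formula 2": 17.92}

# Assumed convention: overall mean = average of the treatment means.
mu = sum(treatment_means.values()) / len(treatment_means)

# Treatment effects tau_i = mu_i - mu.
tau = {name: m - mu for name, m in treatment_means.items()}

for name, effect in tau.items():
    print(name, round(mu, 2), round(effect, 2), round(mu + effect, 2))
```

Each treatment mean is recovered as the overall mean plus its effect, and under the assumed convention the effects cancel.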
-
Analysis of Variance (ANOVA)
The basic single-factor ANOVA model is
$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$, with $i = 1, 2, \ldots, a$ (Treatments) and $j = 1, 2, \ldots, n$ (Replications)
The name analysis of variance is derived from a partitioning of the total variability in the response variable into its component parts
-
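ANOVA
$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$, with $i = 1, 2, \ldots, a$ and $j = 1, 2, \ldots, n$
Definitions:
Observation Mean (within treatments): $\bar{y}_{i.} = \frac{1}{n} \sum_{j=1}^{n} y_{ij}$
Overall Mean (all measurements): $\bar{y}_{..} = \frac{1}{N} \sum_{i=1}^{a} \sum_{j=1}^{n} y_{ij}$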
-
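The Analysis of Variance
Total Corrected Sum of Squares (a measure of the total variability):
$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$
The basic ANOVA partitioning is:
$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a} \sum_{j=1}^{n} [(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})]^2$
where $\bar{y}_{i.}$ is the Treatment Average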
-
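The Analysis of Variance
$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} [(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})]^2$
$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2 + 2 \sum_{i=1}^{a} \sum_{j=1}^{n} (\bar{y}_{i.} - \bar{y}_{..})(y_{ij} - \bar{y}_{i.})$
First term: $\sum_{i=1}^{a} \sum_{j=1}^{n} (\bar{y}_{i.} - \bar{y}_{..})^2 = n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2$
The cross-product term vanishes because $\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.}) = 0$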
-
The Analysis of Variance
$SS_T = n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2 = SS_{Treatments} + SS_{Errors}$
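The partitioning can be checked numerically. The sketch below treats the two cement formulations as a single-factor experiment with a = 2 treatments and n = 10 replicates (an illustrative choice of data, since any balanced data set would do) and confirms that the cross-product term vanishes.

```python
formula_1 = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
formula_2 = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]
groups = [formula_1, formula_2]

a = len(groups)        # number of treatments
n = len(groups[0])     # replicates per treatment (balanced design)
N = a * n              # total runs

grand_mean = sum(sum(g) for g in groups) / N
group_means = [sum(g) / n for g in groups]

ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)
ss_treatments = n * sum((m - grand_mean) ** 2 for m in group_means)
ss_error = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)

# Cross-product term from the expansion; should be zero up to rounding.
cross = 2 * sum((m - grand_mean) * (y - m)
                for g, m in zip(groups, group_means) for y in g)

print(round(ss_total, 4), round(ss_treatments, 4), round(ss_error, 4))
```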
-
The Analysis of Variance
$SS_T = SS_{Treatments} + SS_E$
A large value of $SS_{Treatments}$ reflects large differences in treatment means
A small value of $SS_{Treatments}$ likely indicates no differences in treatment means
Formal statistical hypotheses are:
$H_0: \mu_1 = \mu_2 = \cdots = \mu_a$
$H_1$: At least one mean is different
-
The Analysis of Variance
While sums of squares cannot be directly compared to test the hypothesis of equal means, mean squares can.
A mean square (MS) is a sum of squares divided by its degrees of freedom:
Recall the equation for the sample variance: $S^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$
$MS_{Treatments} = \frac{SS_{Treatments}}{a - 1} = \frac{n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2}{a - 1}$
$MS_{Error} = \frac{SS_{Error}}{N - a} = \frac{\sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2}{N - a}$
-
The Analysis of Variance
If the treatment means are equal, the treatment and error mean squares will be (theoretically) equal.
$\mu_1 = \mu_2 = \cdots = \mu_a \Rightarrow MS_{Treatments} \approx MS_{Error}$
If treatment means differ, the treatment mean square will be larger than the error mean square.
$\mu_i \ne \mu_k \Rightarrow MS_{Treatments} > MS_{Error}$
If the error mean squares are larger than the treatment mean squares, there is a PROBLEM
-
The Analysis of Variance
It turns out that the ratio of $MS_{Treatments}$ and $MS_{Error}$ follows the F distribution with (a-1) and (N-a) degrees of freedom
So the test statistic for the hypothesis of no difference in means is:
$F_0 = \frac{MS_{Treatments}}{MS_{Error}}$
Reject the null hypothesis (equal treatment means) if $F_0 > F_{\alpha,\, a-1,\, a(n-1)}$
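The mean squares and the test statistic can be gathered into one small routine. This is a sketch for balanced designs only; the function name and the returned dictionary layout are illustrative assumptions. It is applied here to the two cement formulations, where a = 2, so the F statistic equals the square of the pooled two-sample t statistic.

```python
def one_way_anova(groups):
    """Balanced single-factor ANOVA: sums of squares, df, mean squares and F0."""
    a = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes the same number of replicates per treatment")
    N = a * n
    grand_mean = sum(sum(g) for g in groups) / N
    means = [sum(g) / n for g in groups]
    ss_treat = n * sum((m - grand_mean) ** 2 for m in means)
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_treat = ss_treat / (a - 1)
    ms_error = ss_error / (N - a)
    return {
        "SS": (ss_treat, ss_error),
        "df": (a - 1, N - a),
        "MS": (ms_treat, ms_error),
        "F0": ms_treat / ms_error,
    }

formula_1 = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
formula_2 = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]

result = one_way_anova([formula_1, formula_2])
print(result["df"], round(result["F0"], 2))
```

The critical value of the F distribution still has to come from a table or a statistics package; it is deliberately not computed here.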
-
Cotton Fiber Example
ANOVA
Source of Variation    SS        df    MS        F           P-value     F crit
Between Groups         475.76    4     118.94    14.75682    9.13E-06    2.866081
Within Groups          161.2     20    8.06
Total                  636.96    24
$F_0 = \frac{MS_{Treatments}}{MS_{Error}} = 14.76$
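The entries of the ANOVA table follow from the sums of squares and degrees of freedom alone. The sketch below recomputes the mean squares and F from the tabulated values; the critical value is copied from the table's "F crit" column rather than computed, and the variable names are illustrative.

```python
# Values copied from the cotton fiber ANOVA table.
ss_between, df_between = 475.76, 4
ss_within, df_within = 161.2, 20

ms_between = ss_between / df_between   # treatment mean square
ms_within = ss_within / df_within      # error mean square
f0 = ms_between / ms_within

ss_total = ss_between + ss_within
df_total = df_between + df_within

F_CRIT = 2.866081  # 5% critical value, taken from the table
reject_h0 = f0 > F_CRIT

print(round(ms_between, 2), round(ms_within, 2), round(f0, 2), reject_h0)
```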
-
The F Distribution
With a-1 (5-1=4) and N-a (25-5=20) degrees of freedom ($F_{4,20}$)
$F_{0.05,4,20} = 2.87$ (5% probability)
$F_{0.01,4,20} = 4.43$ (1% probability)
$F_0 = 14.76 \gg F_{0.01,4,20}$
-
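MORE FACTORS?
The Statistical Effects Model
$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk}$, with $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$; $k = 1, 2, \ldots, n$
$\tau_i$ = effect of treatment A
$\beta_j$ = effect of treatment B
$(\tau\beta)_{ij}$ = synergistic effect of treatments A and B (interaction term)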
-
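Extension of ANOVA to Factorials
$\sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{...})^2 = bn \sum_{i=1}^{a} (\bar{y}_{i..} - \bar{y}_{...})^2 + an \sum_{j=1}^{b} (\bar{y}_{.j.} - \bar{y}_{...})^2 + n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...})^2 + \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij.})^2$
Degrees of Freedom:
$SS_T$ = abn-1
$SS_A$ = a-1
$SS_B$ = b-1
$SS_{AB}$ = (a-1)(b-1)
$SS_E$ = ab(n-1)
Can calculate mean squares as before to determine if the variables have effect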
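The two-factor identity and its degrees of freedom can be verified on a small example. The observations below are made up purely for illustration (a = 2, b = 2, n = 2) and do not come from the lecture; the variable names are likewise assumptions.

```python
# Hypothetical observations y[i][j][k]: factor A level i, factor B level j, replicate k.
y = [
    [[10.0, 12.0], [14.0, 16.0]],
    [[11.0, 13.0], [20.0, 22.0]],
]
a, b, n = len(y), len(y[0]), len(y[0][0])

def mean(values):
    values = list(values)
    return sum(values) / len(values)

grand = mean(v for row in y for cell in row for v in cell)
mean_a = [mean(v for cell in y[i] for v in cell) for i in range(a)]
mean_b = [mean(v for i in range(a) for v in y[i][j]) for j in range(b)]
mean_ab = [[mean(y[i][j]) for j in range(b)] for i in range(a)]

ss_total = sum((v - grand) ** 2 for row in y for cell in row for v in cell)
ss_a = b * n * sum((m - grand) ** 2 for m in mean_a)
ss_b = a * n * sum((m - grand) ** 2 for m in mean_b)
ss_ab = n * sum((mean_ab[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
                for i in range(a) for j in range(b))
ss_e = sum((v - mean_ab[i][j]) ** 2
           for i in range(a) for j in range(b) for v in y[i][j])

df = {"T": a * b * n - 1, "A": a - 1, "B": b - 1,
      "AB": (a - 1) * (b - 1), "E": a * b * (n - 1)}

print(ss_total, ss_a, ss_b, ss_ab, ss_e)
```

Dividing each sum of squares by its entry in `df` gives the mean squares, which are then compared with the error mean square as in the single-factor case.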
-
ANOVA Table Fixed Effects Case
Available computer software can easily perform these calculations.
-
Factorials with More Than
Two Factors
Basic procedure is similar to the two-factor case; all a, b, c, ..., k treatment combinations are run in random order
ANOVA identity is also similar:
$SS_T = SS_A + SS_B + \cdots + SS_{AB} + SS_{AC} + \cdots + SS_{ABC} + \cdots + SS_{AB \cdots K} + SS_E$
Computers are great, are they not?
-
Minitab Statistical Software
Download 30-day trial version from web site:
http://www.minitab.com/
1) Click on Products tab
2) Minitab 17: Click: Learn More
3) Click: Try Minitab 17, then 30-day Free trial