stat 512 – lecture 16 two quantitative variables (ch. 9)
Post on 21-Dec-2015
Stat 512 – Lecture 16
Two Quantitative Variables (Ch. 9)
Last Time
With one quantitative response and one qualitative explanatory variable, we can use one-way ANOVA to compare the population/true treatment means
This procedure is easily extendable to any number of qualitative explanatory variables
EV 1: Adoptive SES, H0: $\mu_{\text{Adopt High}} = \mu_{\text{Adopt Low}}$
EV 2: Biological SES, H0: $\mu_{\text{Bio High}} = \mu_{\text{Bio Low}}$
Two-way ANOVA, General Linear Model
Two-way ANOVA
The “effect” of the SES of the adoptive parents is statistically significant (p-value = .010) even after adjusting for the stronger “effect” of the SES of the biological parents (p-value = .001)
Last Time
When we have multiple explanatory variables (“factors”), we can also consider the interaction between these variables
Last Time
When we have “paired” or “dependent” samples, the blocking variable can be incorporated into the model as well
Example: Chip melting times
H0: $\mu_B = \mu_{MC} = \mu_{SS}$
Ha: at least one differs
Compare $\bar{x}_B$ vs. $\bar{x}_{MC}$ vs. $\bar{x}_{SS}$
“Completely randomized design”: 37 subjects, randomly assigned to butterscotch, milk chocolate, or semi-sweet (n1 = 11, n2 = 11, n3 = 17); compare melting times
“Randomized block design”: 13 subjects, each tries all three chips in a random order (b-mc-ss, b-ss-mc, mc-b-ss, ss-b-mc, ss-mc-b, mc-ss-b); compare melting times
Last Time
“Repeated measures” analyses are just like taking the differences first in “paired samples.”
If you want to compare results within blocks or within subjects (instead of across and ignoring the pairing), include that variable in the ANOVA
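As a rough illustration of why the pairing matters, the sketch below compares a paired analysis (take the differences first) with an analysis that ignores the pairing. The data are hypothetical, not from the lecture, and only the test statistics are computed, not p-values.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements on the same 5 subjects under two conditions.
before = [10.0, 12.0, 9.0, 14.0, 11.0]
after = [11.0, 13.5, 9.5, 15.0, 12.5]
n = len(before)

# Paired analysis: take the within-subject differences first,
# then do a one-sample t computation on those differences.
diffs = [a - b for a, b in zip(after, before)]
paired_t = mean(diffs) / (stdev(diffs) / sqrt(n))

# Ignoring the pairing: two-sample t statistic (unpooled standard error).
two_sample_se = sqrt(stdev(after) ** 2 / n + stdev(before) ** 2 / n)
two_sample_t = (mean(after) - mean(before)) / two_sample_se

print(round(paired_t, 2), round(two_sample_t, 2))
```

In this made-up data the subject-to-subject variation is large compared with the within-subject change, so the paired statistic comes out much larger than the one that ignores the pairing.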
Practice Problem
With random assignment to distinct groups (milked by machine or human), will consider the samples independent
With any correspondence or relationship between units, will consider the samples dependent
Litter mates
“Split plots”
Both calculators on sample problem
Example 12.6: Positive and Negative Influences on Children (p. 463)
“Children are exposed to many influences in their daily lives. What kind of influence does each of the following have on children? 1. Movies, 2. Programs on network television, 3. Rock music”
-2 = very negative, -1 = negative, 0 = neutral, 1 = positive, 2 = very positive
Research question: Are the population mean responses identical for the three influences?
H0: $\mu_M = \mu_{TV} = \mu_R$
Ha: at least one differs
Example 12.6: Positive and Negative Influences on Children
Influence
Subject Movies TV Rock
1 -1 0 1
2 1 0 0
3 0 1 -2
4 2 0 1
5 0 -1 -1
6 -2 -2 -2
7 -1 -1 0
8 0 1 -1
9 -1 -1 -1
10 1 0 1
11 1 1 -1
12 -1 -1 -2
Example 12.6: Positive and Negative Influences on Children
While different people do seem to tend to give significantly different ratings (p-value = .003), once we adjust for that, we do not have super convincing evidence of an “influence effect” (p-value = .101).
Example 1: Airline Costs
Best prediction of cost? Sample mean, 295.2
Another explanatory variable: distance
Describing the association between two quantitative variables
Moderate, positive, linear relationship
Describing the association between two quantitative variables Which is stronger?
r = .444 r = -.265
Describing the association between two quantitative variables
Moderate, positive, linear relationship r = .439
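A minimal sketch of how a correlation coefficient r can be computed from paired data, and of the point that strength is judged by the absolute value of r. The distance and cost numbers are hypothetical, chosen only for illustration; the two r values being compared are the ones from the slide.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical distances (miles) and costs (dollars), not the airline data.
distance = [100, 200, 300, 400, 500]
cost = [180, 150, 260, 210, 300]
r = correlation(distance, cost)
print(round(r, 3))

# "Which is stronger?": compare absolute values; the sign only gives direction.
stronger = max([0.444, -0.265], key=abs)
print(stronger)
```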
Modeling the relationship
How to decide on the best line?
Residual = observed - predicted
Example 2: Height vs. Foot Length
Least Squares Regression applet
The “least squares line” finds the equation for the line that minimizes the sum of the squared residuals
Trying to minimize “prediction errors”
Using squared residuals means there will be a unique equation that does this
$\widehat{\text{height}} = 38.302 + 1.033 \times \text{foot size}$
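The idea of minimizing the sum of squared residuals can be sketched as follows. The foot lengths and heights are hypothetical (not the class data), and the closed-form slope and intercept formulas are the standard ones for simple linear regression, not something taken from the applet.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of (observed - predicted)^2 for the line intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical foot lengths (cm) and heights (inches).
foot = [22, 24, 25, 27, 30]
height = [61, 64, 63, 67, 70]

intercept, slope = least_squares(foot, height)
best_sse = sum_squared_residuals(foot, height, intercept, slope)

# Any other line, e.g. one with a slightly different slope, does worse.
other_sse = sum_squared_residuals(foot, height, intercept, slope + 0.05)
print(round(intercept, 3), round(slope, 3), best_sse < other_sse)
```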
Example 2: Height vs. Foot Length
Interpretation of slope: For each additional cm of foot length, we predict an additional 1.033 inches of height
Interpretation of intercept: If someone has a 0 cm foot, we predict a height of 38.302 inches! Not always meaningful in every context!
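To make the slope and intercept interpretations concrete, here is a small sketch that plugs values into the fitted line from the slide (intercept 38.302, slope 1.033). The function name and the foot lengths 25 cm and 26 cm are assumptions for illustration.

```python
INTERCEPT = 38.302  # predicted height (inches) at a foot length of 0 cm
SLOPE = 1.033       # predicted change in height (inches) per extra cm of foot

def predicted_height(foot_cm):
    """Predicted height in inches from foot length in cm."""
    return INTERCEPT + SLOPE * foot_cm

# Slope: a one-cm difference in foot length changes the prediction by SLOPE.
diff = predicted_height(26) - predicted_height(25)
print(round(diff, 3))

# Intercept: the prediction at 0 cm, which is outside any realistic foot length.
at_zero = predicted_height(0)
print(at_zero)
```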
“Resistance”
Remove or change point and see if line changes dramatically
The least squares line is not resistant to extreme observations
Especially those that are extreme in the explanatory variable (often a stronger determinant than the size of the residual)
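A rough sketch of the resistance idea with hypothetical data: five points lying exactly on y = 2x, plus one added point. The added point sits 35 units below that line in both cases; only its x position changes.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data lying exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
_, slope_original = fit_line(xs, ys)

# Add one point that is extreme in x (x = 20), 35 units below y = 2x.
_, slope_extreme_x = fit_line(xs + [20], ys + [5])

# Add instead a point 35 units below y = 2x at the mean of x (x = 3).
_, slope_central = fit_line(xs + [3], ys + [-29])

print(slope_original, round(slope_extreme_x, 3), round(slope_central, 3))
```

In this made-up example the point that is extreme in x flattens the slope almost to zero, while the equally far-off point in the middle of the x values leaves the slope unchanged and only shifts the intercept.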
$R^2$
If we predict everyone to have the same height, lots of “unexplained” variation (SSE = 475.75)
If we take the explanatory variable into account, much less “unexplained” variation (SSE = 235)
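The two sums of squared errors above determine the proportion of variation explained. A minimal sketch of that arithmetic, using the standard definition R² = 1 - SSE(regression) / SSE(mean only) and the two values from the slide (which may be rounded):

```python
sse_mean_only = 475.75   # unexplained variation when predicting the mean height
sse_regression = 235.0   # unexplained variation using the regression line

# Proportion of the variation in height explained by the regression.
r_squared = 1 - sse_regression / sse_mean_only
print(round(r_squared, 3))
```

So roughly half of the variability in height is explained by the regression on foot length, given these two values.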
Example 1: Airline costs
Each flight has a ‘set up’ cost of $151, and each additional mile of travel is associated with a predicted increase in cost of about 7 cents.
19.3% of the variability in airfare is explained by this regression on distance (still lots of unexplained variability)
Might investigate further why the cost for ACK was so much higher than expected
$\widehat{\text{cost}} = 151 + 0.0734 \times \text{mileage}$
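A small sketch using the fitted airline line (intercept 151, slope 0.0734 dollars per mile) and the reported 19.3% of variability explained. The 1000-mile flight and its observed fare of 300 dollars are hypothetical, used only to show a prediction and a residual.

```python
from math import sqrt

def predicted_cost(miles):
    """Predicted airfare in dollars from distance in miles."""
    return 151 + 0.0734 * miles

# Hypothetical flight: 1000 miles with an observed fare of 300 dollars.
prediction = predicted_cost(1000)
residual = 300 - prediction  # residual = observed - predicted
print(round(prediction, 2), round(residual, 2))

# With one explanatory variable, r is the square root of R^2,
# taking the sign of the slope (positive here).
r_squared = 0.193
r = sqrt(r_squared)
print(round(r, 3))
```

The square root of 0.193 is about 0.439, which agrees with the correlation reported for the moderate, positive, linear relationship.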
For Thursday
PP 14 in Blackboard by 3 pm
Finishing up HW 7
Continue reading in Ch. 9
By next Tuesday – another project report
Narrowed in on 2 “research questions” and which statistical methods you think will answer them…