stat 512 – lecture 16 two quantitative variables (ch. 9)

Stat 512 – Lecture 16

Two Quantitative Variables (Ch. 9)

Last Time

With one quantitative response variable and one qualitative explanatory variable, we can use one-way ANOVA to compare the population/true treatment means

This procedure is easily extendable to any number of qualitative explanatory variables

EV 1: Adoptive SES, $H_0: \mu_{\text{Adopt High}} = \mu_{\text{Adopt Low}}$

EV 2: Biological SES, $H_0: \mu_{\text{Bio High}} = \mu_{\text{Bio Low}}$

Two-way ANOVA, General Linear Model

Two-way ANOVA

The “effect” of the SES of the adoptive parents is statistically significant (p-value = .010) even after adjusting for the stronger “effect” of the SES of the biological parents (p-value = .001)

Last Time

When we have multiple explanatory variables (“factors”), we can also consider the interaction between these variables

Last Time

When we have “paired” or “dependent” samples, the blocking variable can be incorporated into the model as well

Example: Chip melting times

$H_0: \mu_B = \mu_{MC} = \mu_{SS}$

$H_a$: at least one differs

Compare $\bar{x}_B$ vs. $\bar{x}_{MC}$ vs. $\bar{x}_{SS}$

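[Diagram] “Completely randomized design”: 37 subjects are randomly assigned to one of three chip types (butterscotch, milk chocolate, semi-sweet), with group sizes n1 = 11, n2 = 11, n3 = 17; compare melting times across groups

[Diagram] “Randomized block design”: 13 subjects, each assigned at random to one of the chip orderings b-mc-ss, b-ss-mc, mc-b-ss, mc-ss-b, ss-b-mc, ss-mc-b; compare melting times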

Last Time

“Repeated measures” analyses are just like taking the differences first in “paired samples.”

If you want to compare results within blocks or within subjects (instead of across and ignoring the pairing), include that variable in the ANOVA

Practice Problem

With random assignment to distinct groups (milked by machine or human), we consider the samples independent

With any correspondence or relationship between units, we consider the samples dependent

Litter mates

“Split plots”

Both calculators on sample problem

Example 12.6: Positive and Negative Influences on Children (p. 463)

“Children are exposed to many influences in their daily lives. What kind of influence does each of the following have on children? 1. Movies, 2. Programs on network television, 3. Rock music”

-2 = very negative, -1 = negative, 0 = neutral, 1 = positive, 2 = very positive

Research question: Are the population mean responses identical for the three influences?

$H_0: \mu_{\text{Movies}} = \mu_{\text{TV}} = \mu_{\text{Rock}}$

$H_a$: at least one differs

Example 12.6: Positive and Negative Influences on Children

           Influence
Subject   Movies   TV   Rock
   1        -1      0     1
   2         1      0     0
   3         0      1    -2
   4         2      0     1
   5         0     -1    -1
   6        -2     -2    -2
   7        -1     -1     0
   8         0      1    -1
   9        -1     -1    -1
  10         1      0     1
  11         1      1    -1
  12        -1     -1    -2

Example 12.6: Positive and Negative Influences on Children

While different people do seem to tend to give significantly different ratings (p-value = .003), once we adjust for that, we do not have super convincing evidence of an “influence effect” (p-value = .101).

Example 1: Airline Costs

Best prediction of cost? Sample mean, 295.2

Another explanatory variable: Distance

Describing the association between two quantitative variables

Moderate, positive, linear relationship

Describing the association between two quantitative variables

Which is stronger?

r = .444 vs. r = -.265

Describing the association between two quantitative variables

Moderate, positive, linear relationship r = .439
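As a minimal sketch of how a correlation coefficient like those above is computed, the snippet below evaluates Pearson's r from its definition and compares strength by absolute value. The data values are hypothetical illustrations, not the airline data from lecture, and the helper names `pearson_r` and `stronger` are made up for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def stronger(r1, r2):
    """Strength of a linear association depends on |r|, not on its sign."""
    return r1 if abs(r1) >= abs(r2) else r2

# Hypothetical data (assumption: NOT the lecture's airline data)
distance = [100, 300, 500, 700, 900]
cost = [180, 150, 260, 210, 300]

r = pearson_r(distance, cost)
print(f"r = {r:.3f}")
print("stronger of .444 and -.265:", stronger(0.444, -0.265))
```

The sign of r gives the direction of the association; the magnitude gives the strength, which is why r = .444 is the stronger of the two values compared earlier.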

Modeling the relationship

How do we decide on the best line?

Residual = observed - predicted

Example 2: Height vs. Foot Length

Least Squares Regression applet

The “least squares line” is the line that minimizes the sum of the squared residuals

Trying to minimize “prediction errors”

Using squared residuals means there will be a unique equation that does this

$\widehat{\text{height}} = 38.302 + 1.033 \times \text{foot size}$
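The least squares criterion can be sketched in a few lines. The snippet below uses the standard closed-form solution (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x) and checks that nudging the line away from that solution increases the sum of squared residuals. The foot length and height values are hypothetical, not the class data, so the fitted coefficients will not match the equation above; `least_squares` and `sse` are illustrative names.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sse(xs, ys, intercept, slope):
    """Sum of squared residuals, where residual = observed - predicted."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data (assumption: NOT the lecture's data set)
foot_cm = [22, 24, 26, 28, 30]
height_in = [61, 64, 64, 68, 70]

intercept, slope = least_squares(foot_cm, height_in)
best = sse(foot_cm, height_in, intercept, slope)
nudged = sse(foot_cm, height_in, intercept, slope + 0.05)

print(f"fitted line: height = {intercept:.3f} + {slope:.3f} * foot")
print(f"SSE at fitted line: {best:.3f}; SSE with nudged slope: {nudged:.3f}")
```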

Example 2: Height vs. Foot Length

Interpretation of slope: For each additional cm in foot length, we predict an additional 1.033 inches in height

Interpretation of intercept: If someone has a 0 cm foot, we predict a height of 38.302 inches! The intercept is not always meaningful in context

“Resistance”

Remove or change point and see if line changes dramatically

The least squares line is not resistant to extreme observations

Especially those that are extreme in the explanatory variable (often a stronger determinant than the size of the residual)
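A rough illustration of this lack of resistance, under stated assumptions: the data are hypothetical, and `least_squares` is an illustrative helper using the usual closed-form fit. One added point that is extreme in the explanatory variable flips the sign of the slope, while an added point with a large residual at the mean of x leaves the slope unchanged.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data (assumption: NOT the lecture's data set)
foot_cm = [22, 24, 26, 28, 30]
height_in = [61, 64, 64, 68, 70]

_, slope_original = least_squares(foot_cm, height_in)

# Add one point that is extreme in x (a 45 cm foot) and off the trend
_, slope_extreme_x = least_squares(foot_cm + [45], height_in + [60])

# Add one point with a large residual but x at the mean (26 cm)
_, slope_extreme_y = least_squares(foot_cm + [26], height_in + [75])

print(f"original slope:           {slope_original:.3f}")
print(f"with extreme-x point:     {slope_extreme_x:.3f}")
print(f"with large-residual point: {slope_extreme_y:.3f}")
```

The point at the mean of x shifts the intercept but has no leverage on the slope, which is the sense in which extremity in the explanatory variable matters more than the size of the residual.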

R²

If we predict everyone to have the same height, there is lots of “unexplained” variation (SSE = 475.75)

If we take the explanatory variable into account, there is much less “unexplained” variation (SSE = 235)
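The two SSE values above are enough to work out R² as the proportional reduction in unexplained variation. A minimal sketch, using only the numbers on the slide (the variable names are illustrative):

```python
sse_mean_only = 475.75   # unexplained variation when predicting the same height for everyone
sse_regression = 235.0   # unexplained variation after using the explanatory variable

# R^2 = proportion of the original variation explained by the regression
r_squared = 1 - sse_regression / sse_mean_only

print(f"R^2 = {r_squared:.3f}")  # about 0.506
```

So roughly half of the variability in height is explained by the regression on foot length.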

Example 1: Airline costs

Each flight has a “set-up” cost of $151, and each additional mile of travel is associated with a predicted increase in cost of about 7 cents.

19.3% of the variability in airfare is explained by this regression on distance (still lots of unexplained variability)

Might investigate further why the cost for ACK was so much higher than expected

$\widehat{\text{cost}} = 151 + .0734 \times \text{mileage}$
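As a quick check on these numbers, the sketch below applies the fitted airfare equation and squares the correlation to recover the percentage of variability explained. It assumes the r = .439 reported earlier refers to this airfare data; the 1000-mile flight is a hypothetical example, and `predict_cost` is an illustrative name.

```python
SETUP_COST = 151.0     # intercept: predicted cost (dollars) at 0 miles
COST_PER_MILE = 0.0734  # slope: predicted extra dollars per additional mile

def predict_cost(miles):
    """Predicted airfare in dollars from distance in miles, using the slide's line."""
    return SETUP_COST + COST_PER_MILE * miles

# Hypothetical 1000-mile flight (assumption: not a flight from the data set)
predicted = predict_cost(1000)

# For simple linear regression, R^2 equals the squared correlation
r = 0.439  # assumed to be the airfare/distance correlation from the earlier slide
percent_explained = round(r ** 2 * 100, 1)

print(f"predicted cost for 1000 miles: ${predicted:.2f}")
print(f"percent of variability explained: {percent_explained}%")
```

Comparing a flight's observed fare with `predict_cost` for its distance gives its residual, which is how an unusually expensive route would be flagged for a closer look.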

For Thursday

PP 14 in Blackboard by 3 pm

Finish up HW 7

Continue reading in Ch. 9

By next Tuesday: another project report. Narrow in on 2 “research questions” and which statistical methods you think will answer them…
