
SAMPLING DISTRIBUTIONS & HYPOTHESIS TESTING

Day 4, Summer 2015

7/29/2015

Fang Chen

English Department, East China Normal University (ECNU)

FINDING THE PERCENTILE RANK OF A RAW SCORE

Step 1: Convert the raw score to a z-score using

$z = \frac{X - \mu}{\sigma}$

Step 2: Look up the z-score in the z-table to find the percentile rank.

Example: A population has a mean of 400 and an SD of 100. What are the percentile ranks corresponding to the following raw scores? What do they mean?
1) A score of 500
2) A score of 300
3) A score of 275
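The two steps can be sketched in Python as a check on the table lookup. This is only an illustration: it assumes the scores are normally distributed, it uses the standard library's `NormalDist` in place of the printed z-table, and the helper name `percentile_rank` is made up for this example.

```python
from statistics import NormalDist

# Population parameters from the slide example (mean 400, SD 100).
population = NormalDist(mu=400, sigma=100)


def percentile_rank(raw_score, dist):
    """Proportion of scores at or below raw_score (hypothetical helper)."""
    z = (raw_score - dist.mean) / dist.stdev  # Step 1: z = (X - mu) / sigma
    return NormalDist().cdf(z)                # Step 2: area at or below z


for score in (500, 300, 275):
    z = (score - population.mean) / population.stdev
    rank = percentile_rank(score, population)
    print(f"X = {score}: z = {z:+.2f}, percentile rank = {rank:.4f}")
```

Under these assumptions, a score of 500 sits at roughly the 84th percentile, 300 at roughly the 16th, and 275 at roughly the 11th.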

LET ME JUST BE REDUNDANT…

Percentile rank refers to the percentage of scores at or below the score of interest.

There are no negative z values in the table.

If the z value you calculated is positive, look for the number under the "larger portion" column.

If the z value is negative, look up its absolute value and use the number under the "smaller portion" column.

FINDING THE RAW SCORE FROM A PERCENTILE RANK

Step 1: Using the z-table, find the corresponding z-score.

Step 2: Transform the z-score back to a raw score using

$X = z \sigma + \mu$

Example: We know a distribution has a mean of 400 and an SD of 100. What raw score corresponds to the
1) 95th percentile?
2) 50th percentile?
3) 33rd percentile?
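The reverse lookup can be sketched the same way. Here `NormalDist().inv_cdf` stands in for reading the z-table backwards, the scores are assumed to be normally distributed, and `raw_score_at` is a hypothetical helper name.

```python
from statistics import NormalDist

MU, SIGMA = 400, 100  # distribution from the slide example


def raw_score_at(percentile, mu=MU, sigma=SIGMA):
    """Raw score at a percentile between 0 and 100 (hypothetical helper)."""
    z = NormalDist().inv_cdf(percentile / 100)  # Step 1: z-score for the percentile
    return z * sigma + mu                       # Step 2: X = z * sigma + mu


for pct in (95, 50, 33):
    print(f"percentile {pct}: raw score of about {raw_score_at(pct):.1f}")
```

The 50th percentile comes back as the mean (400), which is a quick sanity check on the formula. Values read from a rounded z-table may differ slightly in the last digit.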

WHAT ELSE?

With a population mean of 400 and a population SD of 100, we can also answer more complex questions like:
1) What percent of scores are between 300 and 540?
2) What percent of scores are between 475 and 605?

Step 1: Transform the raw scores into z-scores. 300: z = -1; 540: z = 1.4; 475: z = 0.75; 605: z = 2.05.

Step 2: Find the proportion (the mean-to-z area) corresponding to each z-score.

Step 3: Combine the two areas, by addition when the scores fall on opposite sides of the mean and by subtraction when they fall on the same side.

For question 1, with a z-score of -1, the mean-to-z area (between z = -1 and the mean) is 0.3413.

For a z-score of 1.4, the mean-to-z area is 0.4192.

Because the two scores fall on opposite sides of the mean, we add the mean-to-z areas to get the percentage of scores falling in the range:
p(-1 < z < 1.4) = p(-1 < z < 0) + p(0 < z < 1.4) = 0.3413 + 0.4192 = 0.7605

For question 2, both scores fall above the mean, so we subtract one mean-to-z area from the other:
p(0.75 < z < 2.05) = p(0 < z < 2.05) - p(0 < z < 0.75) = 0.4798 - 0.2734 = 0.2064
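As a cross-check on the add-or-subtract rule, the area between two raw scores can also be computed as a difference of two cumulative areas, which handles both cases at once. This is a minimal sketch assuming normally distributed scores; `proportion_between` is a hypothetical helper name.

```python
from statistics import NormalDist

# Population from the slide example (mean 400, SD 100).
population = NormalDist(mu=400, sigma=100)


def proportion_between(low, high, dist):
    """Proportion of scores between two raw scores (hypothetical helper)."""
    # Area at or below `high` minus area at or below `low`.
    return dist.cdf(high) - dist.cdf(low)


print(f"Between 300 and 540: {proportion_between(300, 540, population):.4f}")
print(f"Between 475 and 605: {proportion_between(475, 605, population):.4f}")
```

This gives about 76% for the first range and about 21% for the second. Hand calculations from a rounded z-table can differ in the last decimal place.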

HOW ELSE COULD WE USE THIS?

Given our conversation about probability in the last class:

We might want to describe how unusual a particular score is in the population.

This is used for hypothesis testing.

Activity.

SUMMARY

The probability density function (PDF) was introduced to get at probability for continuous variables.

How to transform any score within a distribution into a z-score (that is, how to standardize raw scores).

How to find the percentile of a z-score: the proportion of scores that fall at or below the z-score of interest.

How to find the raw score that corresponds to a certain percentile.

How to find the percentage of scores that fall between any two raw scores.

HYPOTHESIS TESTING


Z SCORE AND HYPOTHESIS TESTING

We have spent a lot of time on z-score transformations and finding percentiles. We did not do this just for fun: the same procedure is used for testing a hypothesis.

Example

We know that the mean rate of finger tapping of normal healthy adults is 100 taps in 20 seconds, with a standard deviation of 20, and that tapping speeds are normally distributed in the population. Assume further that we know that the tapping rate is slower among people with certain neurological problems. Finally, suppose that an individual has just been sent to us who taps at a rate of 70 taps in 20 seconds. Is his score sufficiently below the mean for us to assume that he did not come from a population of neurologically healthy people?

We test this by doing the same thing as in the last class, but now we follow a formal hypothesis testing procedure.

LET’S START FROM THE BEGINNING

We want to test the research hypothesis that this person does not come from the neurologically healthy population.

Step 1: We set up our null hypothesis (虚无假设 / 零假设) that this person comes from the healthy population, which has a mean tapping rate of 100 taps per 20 seconds and an SD of 20. Numerically:

H0: µ0 = 100

Step 2: We set up our alternative hypothesis (备择假设) that this person comes from a different population, one with neurological problems, whose mean is lower than 100. Numerically:

H1: µ1 < 100

STEPS CONTINUED

Step 3: We identify the population µ and σ, that is, mean = 100 and SD = 20.

Step 4: Calculate the probability of getting a value at or below 70 from the healthy population.

$z = \frac{X - \mu}{\sigma} = \frac{70 - 100}{20} = -1.5$

Percentile rank = 0.0668
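Steps 3 and 4 can be reproduced in a few lines. This is a sketch that uses the standard library's `NormalDist` in place of the z-table; the variable names are chosen only for this illustration.

```python
from statistics import NormalDist

MU, SIGMA = 100, 20  # healthy population: taps per 20 seconds
observed = 70        # the individual's tapping rate

# Step 4: z-score of the observed value under the null hypothesis.
z = (observed - MU) / SIGMA

# Probability of a score at or below 70 if the person is from the healthy population.
p_lower = NormalDist().cdf(z)

print(f"z = {z:.2f}, p = {p_lower:.4f}")
```

The probability it prints matches the percentile rank of 0.0668 given on the slide.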

STEPS CONTINUED

Step 5: Decide on your criterion. Conventionally, we use p = 0.01, p = 0.05, or p = 0.1. These are not percentile ranks but areas. They are called the rejection level (临界区域) or significance level (显著水平) of the test. The first two are more conservative. We will use p = 0.05.

0.0668 > 0.05. Conclusion: we fail to reject the null hypothesis; that is, 70 is not an extreme value for a person who comes from a healthy population.

STEPS CONTINUED

Step 6: Interpret the results: We have no reason to believe that this person does not come from a healthy population.

Caution: We did not say we proved this person is healthy, only that we have insufficient reason to conclude that he is not.

A WORD ON “PROOF”

We never say we ACCEPT the null hypothesis. We say we fail to reject the null hypothesis.

The logic of hypothesis testing comes from Karl Popper’s principle of falsification (证伪原则).

In essence, he says we can’t prove anything to be true; the best we can do is show something to be so unusual that it is very unlikely to have happened by chance. It is much easier to prove something false than to prove it true.

E.g.

LOGIC FOR DECISION AND INTERPRETATION

A statistical procedure tests the null hypothesis (H0).

This means we can do one of two things based on the results of our test:
Reject the null hypothesis
Fail to reject the null hypothesis

We are attempting to gather evidence that will allow us to falsify (reject) our null hypothesis, that is, to say that the person does not come from a healthy population: his tapping rate is REALLY slow because he is unhealthy, not because of chance variation among healthy people.

This is another way to say our evidence supports our alternative hypothesis, that the person comes from a population with neurological problems.

In the tapping example, we fail to falsify our null hypothesis; that is, we don’t have statistically significant evidence to support our alternative hypothesis.

SOUNDS TOO WORDY?

Take-home point:

If we reject the null hypothesis, we claim support for our alternative hypothesis.

If we fail to reject the null hypothesis, we DID NOT PROVE the null hypothesis; we just failed to refute it.

GUIDELINES ON REJECTION CRITERION

Rejection level / significance level: the criterion p value

Rejection region (临界区域): the area represented by the criterion p value above

Critical z value: the z-score corresponding to the criterion p value

In education, p = 0.01, p = 0.05, and p = 0.1 are all used, although the middle one is most common.

Here is the z-score we observed: z = -1.5.

Our critical value is -1.65: 5% of scores fall at or below a z-score of -1.65, which corresponds to a raw score of 67. The criterion p value is 0.05, corresponding to 5% of the area in the left tail.
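The critical value and its raw-score equivalent can be recovered from the criterion directly. This is a sketch under the same normality assumption; the inverse CDF returns about -1.645, which printed tables usually round to -1.65, and `decide_lower_tail` is a hypothetical helper for the one-tailed (lower tail) decision.

```python
from statistics import NormalDist

MU, SIGMA = 100, 20  # healthy population from the tapping example
ALPHA = 0.05         # criterion (significance level)

# z-score that cuts off the lowest 5% of the standard normal distribution.
z_crit = NormalDist().inv_cdf(ALPHA)

# The same cutoff expressed as a raw tapping score: X = z * sigma + mu.
raw_cutoff = MU + z_crit * SIGMA


def decide_lower_tail(observed, mu=MU, sigma=SIGMA, z_critical=z_crit):
    """Reject H0 only if the observed z falls at or below the critical z."""
    z = (observed - mu) / sigma
    return "reject H0" if z <= z_critical else "fail to reject H0"


print(f"critical z = {z_crit:.3f}, raw cutoff = {raw_cutoff:.1f}")
print(f"observed score of 70: {decide_lower_tail(70)}")
```

Comparing z-scores against the critical z and comparing p against 0.05 are two views of the same decision, so the score of 70 (z = -1.5) again fails to reach the rejection region.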

REAL BUSINESS

Sampling distributions and hypothesis testing build the foundation for us to learn about:
How to frame a research question
What kinds of things to think about
How to evaluate the answer

We have just had a small taste of research. After this chapter, we will learn the specific logical and computational approaches used to carry out REAL research: statistical techniques and their applications.

Before we can do that, we are going to make one more transition: to the sampling distribution.

CASE I AND II


Population: Behavior Problem Scores

µ = 50, SD = 10

15 68 48 58 50 53 42 50 56 47 57 57 43 60 50 36 48 45 41 66 43 53 39 33 49 41 56 57 47 45 47 55 49 47 40 54 40 41 48 45 68 47 53 34 56 44 67 43 31 58 50 66 46 55 55 47 56 56 39 64 57 62 43 47 31 33 48 39 63 40 68 56 56 41 44 54 51 45 65 69 48 44 54 51 40 42 75 33 55 52 47 47 64 55 44 60 49 56 45 66

A single score: X = 66

A sample: 63 53 57 53 31 69 68 48 45 55, with $\bar{X} = 54$

CASE 1: TESTING HYPOTHESES FOR ONE OBSERVED SCORE WHEN SIGMA IS KNOWN

When we have a single score, we can compare it to the entire population by calculating its z-score, then using the table to determine the likelihood of just randomly obtaining a score above or below the one you’ve got:

$z = \frac{X - \mu}{\sigma}$

CASE 2: TESTING HYPOTHESES BASED ON A SAMPLE WHEN SIGMA IS KNOWN

We use the same approach to test our sample mean. The central limit theorem tells us how to adjust our formula for the z-score:

$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sqrt{\sigma^2 / n}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

$\bar{X}$: this is the mean of the (one) sample we have.

$\sigma_{\bar{X}}$: this is the standard deviation of the sampling distribution.

$\sigma^2 / n$: the CLT tells us the variance of the sampling distribution. We just take the square root of this!
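The adjusted formula can be sketched as a small function. The slides do not carry out this particular calculation; for illustration it plugs in the sample of 10 scores with a mean of 54 shown next to the population (µ = 50, SD = 10). The function name `z_for_sample_mean` is hypothetical.

```python
import math


def z_for_sample_mean(sample_mean, mu, sigma, n):
    """z-score of a sample mean when the population sigma is known."""
    standard_error = sigma / math.sqrt(n)  # SD of the sampling distribution
    return (sample_mean - mu) / standard_error


# Illustration: the 10-score sample with mean 54 from the population figure.
z = z_for_sample_mean(sample_mean=54, mu=50, sigma=10, n=10)
print(f"z = {z:.2f}")
```

Dividing by the standard error instead of the population SD is the only change from the single-score case, which is why the same z-table can be reused.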

SAMPLING DISTRIBUTION


TERMINOLOGY ABOUT SAMPLING DISTRIBUTION

Sample statistics vs. population parameters

Sampling distribution (样本分布): the distribution of a sample statistic across samples drawn from the same population

We already have a population.

We focus on the sample mean for now (other statistics include the median, mode, variance, etc.); thus we look at the distribution of the sample meanS (the sampling distribution of the mean).

So when we talk about a sampling distribution, we are NOT talking about ONE sample BUT about a set of samples from the same population.

SAMPLE SIZE N = 10

Population: Behavior Problem Scores (the same 100 scores listed under CASE I AND II)

µ = 50, SD = 10

Sample 1: 63 53 57 53 31 69 68 48 45 55, $\bar{X}_1 = 54$

Sample 2: 47 36 39 33 60 48 54 54 66 54, $\bar{X}_2 = 49$

Sample 3: 47 56 41 50 57 56 55 58 44 47, $\bar{X}_3 = 51$

Sample 4: 44 36 57 56 48 45 50 45 42 48, $\bar{X}_4 = 47$

Sample 5: 49 33 49 45 51 39 40 36 43 56, $\bar{X}_5 = 44$

Sampling distribution: {54, 49, 51, 47, 44}

$\bar{X}_{.} = \frac{54 + 49 + 51 + 47 + 44}{5} = 49$

$SE = \sqrt{\frac{\sum (\bar{X}_i - \bar{X}_{.})^2}{n - 1}} = \sqrt{\frac{(54 - 49)^2 + \dots + (44 - 49)^2}{4}} = 3.81$
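The five-sample calculation can be replayed in Python. This sketch rounds each sample mean to a whole number, as the slide does, and uses `statistics.stdev`, which divides by n - 1 like the SE formula on the slide.

```python
from statistics import mean, stdev

# The five samples of size 10 shown on the slide.
samples = [
    [63, 53, 57, 53, 31, 69, 68, 48, 45, 55],
    [47, 36, 39, 33, 60, 48, 54, 54, 66, 54],
    [47, 56, 41, 50, 57, 56, 55, 58, 44, 47],
    [44, 36, 57, 56, 48, 45, 50, 45, 42, 48],
    [49, 33, 49, 45, 51, 39, 40, 36, 43, 56],
]

# The slide reports each sample mean rounded to a whole number.
sample_means = [round(mean(s)) for s in samples]

grand_mean = mean(sample_means)    # mean of the sample means
se_estimate = stdev(sample_means)  # SD of the sample means (n - 1 in the denominator)

print(sample_means, grand_mean, round(se_estimate, 2))
```

With only five samples this is a rough estimate of the standard error; it is not expected to equal the theoretical value exactly.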

SAMPLING DISTRIBUTION OF THE MEAN

What does a distribution of sample means look like?

Most sample means are near the population mean.

Values that are not near the population mean are rare.

With a fair number of samples (15 or more), the distribution of means looks normal.

We can use this to judge how likely it is that a sample mean is extreme because of random sampling alone (sampling error).

TERMINOLOGY ABOUT SAMPLING ERROR

Sampling error: It is not a “mistake”. It is a statistical term for the variability of the sample means due to the process of sampling. It is the variance of the sample means.

Standard error: Again, not a “mistake”. It is the standard deviation of the sample means, abbreviated as SE.

WHAT DOES STANDARD ERROR TELL US?

Standard error ($\sigma_{\bar{X}}$): the standard deviation of the sampling distribution

A large standard error indicates we have a lot of sampling error; that is, our sample statistic values can be very different from sample to sample just because of random sampling.

The size of the sample will directly affect this. Why?

The variance in the population will affect this. Why?

CENTRAL LIMIT THEOREM 中心极限理论

Given a population with mean µ and variance σ², the sampling distribution of the mean (the distribution of sample means) will have a mean equal to µ and a variance equal to σ²/N. The distribution will approach the normal distribution as N, the sample size, increases.

$\mu_{\bar{X}} = \mu$

$\sigma^2_{\bar{X}} = \sigma^2 / N$

$\sigma_{\bar{X}} = \sigma / \sqrt{N}$
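A small simulation can make the theorem's two claims about the mean and the variance concrete. This is a sketch, not the applet used in class: it assumes a normal population with µ = 50 and σ = 10, draws samples of size N = 5, and the number of samples and the random seed are arbitrary choices.

```python
import math
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

MU, SIGMA, N = 50, 10, 5  # population parameters and sample size
NUM_SAMPLES = 20_000      # arbitrary number of repeated samples

# Draw many samples of size N and keep each sample's mean.
sample_means = [
    mean([random.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

print(f"mean of sample means: {mean(sample_means):.2f} (theory: {MU})")
print(f"SD of sample means:   {stdev(sample_means):.2f} "
      f"(theory: {SIGMA / math.sqrt(N):.2f})")
```

The simulated values should land close to µ and σ/√N, with small differences from run to run if the seed is changed.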

SEEING THE SAMPLING DISTRIBUTION

http://www.socr.ucla.edu/Applets.dir/SamplingDistributionApplet.html


CONCLUSIONS

The larger the sample size (N), the smaller the SE.

The larger the sample size, the closer the sampling distribution of the mean comes to normality, and this is true regardless of whether the population distribution is normal or skewed.

Not seen in the simulation but clear from the central limit theorem: the larger the population variance, the larger the SE.
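Both conclusions about the SE follow from the formula σ/√N, which a short sketch can tabulate. The sample sizes and SD values below are arbitrary illustrations apart from σ = 10 and N = 5, which come from the behavior problem example; `standard_error` is a hypothetical helper name.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


# Larger samples give a smaller SE (population SD held at 10).
for n in (5, 10, 25, 100):
    print(f"sigma = 10, N = {n:>3}: SE = {standard_error(10, n):.2f}")

# Larger population SD gives a larger SE (sample size held at 5).
for sigma in (5, 10, 20):
    print(f"sigma = {sigma:>2}, N = 5: SE = {standard_error(sigma, 5):.2f}")
```

Because N sits under a square root, the sample size has to be quadrupled to cut the SE in half.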

HYPOTHESIS TESTING

We do not go around obtaining sampling distributions simply because they are interesting to look at.

Again, we want to test some hypothesis.

E.g. We have a sample of highly stressed children with a mean behavior problem score of 56. My hypothesis is that highly stressed children have more behavior problems than the normal population.

We can compare this mean (56) to the population mean (50) to see whether this sample mean could plausibly be due to sampling error.

If the difference between 50 and 56 is due to sampling error: highly stressed children do not have more behavior problems.

If the difference is not due to sampling error/chance: highly stressed children have more behavior problems than normal children.

HOW CAN WE TELL?--- PROBABILITY


WHERE DOES 56 FALL?

The central limit theorem says: Given a population with mean µ and variance σ², the distribution of sample means will have a mean equal to µ and a variance equal to σ²/N. The sampling distribution will approach the normal distribution as N, the sample size, increases.

Population mean µ = 50 and SD = 10. With N = 5, the sampling distribution of the mean should have a mean and an SE of

$\mu_{\bar{X}} = 50$, $\sigma_{\bar{X}} = \sigma / \sqrt{N} = 10 / \sqrt{5} = 4.47$

z transformation:

$z = \frac{X - \mu}{\sigma} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{56 - 50}{4.47} = 1.34$

Percentile rank = ___________

HOW ABOUT A MEAN OF 62?

$\mu_{\bar{X}} = 50$, $\sigma_{\bar{X}} = 4.47$

z transformation:

$z = \frac{X - \mu}{\sigma} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{62 - 50}{4.47} = 2.68$

Percentile rank = ____________
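To check the two blanks after working them by hand, the sampling distribution itself can be treated as a normal distribution with mean 50 and SD 10/√5. This sketch assumes that normal approximation holds; `percentile_rank_of_mean` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

MU, SIGMA, N = 50, 10, 5  # population parameters and sample size from the slides

# Sampling distribution of the mean: mean MU, standard error SIGMA / sqrt(N).
sampling_dist = NormalDist(mu=MU, sigma=SIGMA / math.sqrt(N))


def percentile_rank_of_mean(sample_mean):
    """Proportion of sample means at or below sample_mean (hypothetical helper)."""
    return sampling_dist.cdf(sample_mean)


for sample_mean in (56, 62):
    z = (sample_mean - MU) / sampling_dist.stdev
    rank = percentile_rank_of_mean(sample_mean)
    print(f"mean = {sample_mean}: z = {z:.2f}, percentile rank = {rank:.4f}")
```

The code keeps the unrounded standard error, so its results can differ from a table lookup at z = 1.34 or z = 2.68 in the fourth decimal place.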

INTO THE RESEARCH

Remember what we said about sampling distributions of means: most random sample means will fall close to the population mean. That is, it is probable that a sample mean will be close to the population mean.

It is improbable that a sample mean will be very far off.

Based on this, we can start doing real research.

We now formally move from descriptive statistics to inferential statistics.

LET’S START FROM THE BEGINNING---AGAIN

We want to test the research hypothesis that children under the stress of divorce are more likely than normal children to exhibit behavior problems.

Step 1: We set up our null hypothesis that our sample of children under the stress of divorce comes from a population whose mean equals 50. Numerically:

H0: µ0 = 50

Step 2: We set up our alternative hypothesis that our sample of children under stress comes from a different population whose mean is higher than 50. Numerically:

H1: µ1 > 50

STEPS CONTINUED

Step 3: We obtain the sampling distribution of the mean under the assumption that the null hypothesis (H0) is true, using the central limit theorem. That is, we get

$\mu_{\bar{X}} = \mu = 50$ and $\sigma_{\bar{X}} = 10 / \sqrt{5} = 4.47$

Step 4: Calculate the probability of getting a mean at or above 56 from a population (of means) with a mean of 50 and an SD of 4.47.

$z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{56 - 50}{4.47} = 1.34$

Percentile rank = 0.9099, p = 1 - 0.9099 = 0.0901

STEPS CONTINUED

Step 5: Decide on your criterion. Here we use p = 0.05. Make the decision to reject or fail to reject H0.

0.0901 > 0.05, so we fail to reject the null hypothesis. We do not have sufficient evidence to conclude that our sample of children under stress comes from a population other than the one with a behavior problem mean score of 50 and SD = 10.

Step 6: Interpret the results: On average, the children under stress in our sample do not show significantly more behavior problems than the population. As before, this does not prove that they are the same as the population; we only failed to show that they differ.
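Steps 3 to 5 for a sample mean can be collected into one function. This is a minimal sketch of a one-tailed (upper tail) z test with known σ, not a general-purpose routine; the function name `upper_tail_z_test` and its return format are made up for this illustration.

```python
import math
from statistics import NormalDist


def upper_tail_z_test(sample_mean, mu, sigma, n, alpha=0.05):
    """One-tailed z test for H1: population mean > mu, with sigma known."""
    standard_error = sigma / math.sqrt(n)    # Step 3: SE from the CLT
    z = (sample_mean - mu) / standard_error  # Step 4: z of the sample mean
    p = 1 - NormalDist().cdf(z)              # area at or above z
    decision = "reject H0" if p < alpha else "fail to reject H0"  # Step 5
    return z, p, decision


z, p, decision = upper_tail_z_test(sample_mean=56, mu=50, sigma=10, n=5)
print(f"z = {z:.2f}, p = {p:.4f}, decision: {decision}")
```

The p value computed here is about 0.09; it differs slightly from the slide's 0.0901 because the slide rounds z to 1.34 before using the table. The decision is the same either way.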

PRACTICE

We will use 0.05 as our criterion just to follow tradition.

Our critical p value is ________

Our critical z value (z-score) is _________

For a one-tailed test in the upper tail, the critical z value is 1.65.

We can think about this in two ways:
a. Any z-score bigger than our critical value will cause us to reject the null hypothesis.
b. Any p value smaller than 0.05 will cause us to reject the null hypothesis.

What decisions should you make if…
z = 1.5, z = 2, p = .03, p = 0.08
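The two decision rules can be written as a pair of small functions for checking the practice answers. This is a sketch for the upper-tail case only, using the rounded critical value of 1.65 from the slide; the function names are hypothetical.

```python
Z_CRITICAL = 1.65  # critical z for a one-tailed test at the 0.05 level (rounded)
ALPHA = 0.05       # criterion p value


def decide_from_z(z, z_critical=Z_CRITICAL):
    """Rule a: reject H0 when the observed z exceeds the critical z."""
    return "reject H0" if z > z_critical else "fail to reject H0"


def decide_from_p(p, alpha=ALPHA):
    """Rule b: reject H0 when the p value is smaller than the criterion."""
    return "reject H0" if p < alpha else "fail to reject H0"


for z in (1.5, 2):
    print(f"z = {z}: {decide_from_z(z)}")
for p in (0.03, 0.08):
    print(f"p = {p}: {decide_from_p(p)}")
```

Both rules always agree for the same data, because a z beyond the critical value is exactly the case where the tail area is smaller than 0.05.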

SUMMARY

How to use the z-score and probability to test a hypothesis.

A sampling distribution is the distribution of a sample statistic (such as the mean) across samples.

The sampling distribution of the mean follows the central limit theorem.

We use the central limit theorem to get back to the same way of computing a z-score and testing a hypothesis. The only difference is that we are now testing a sample (a group of scores, e.g. the group of children under stress) rather than one score (e.g. the person with a tapping speed of 70 per 20 seconds who is suspected to be neurologically unhealthy).

PREPARING FOR NEXT CLASS

We have assumed that we know the population standard deviation (σ). In reality, we usually don’t.

In that case, we estimate it from the sample and look up a new table: the t table rather than the z table.

ACTIVITY
