
Upload: beverly-charles

Post on 16-Jan-2016


Nonparametric Methods(非参数统计)

Chapter 15

Nonparametric Methods

15.1 The Sign Test: A Hypothesis Test about the Median (符号检验)

15.2 The Wilcoxon Rank Sum Test (Wilcoxon 秩和检验)

1.1 Nonparametric Tests (非参数检验)

A. One-Sample Mean Test

Many tests are concerned with testing some parameter under a certain distribution.

Test $H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$ under a normal population $N(\mu, \sigma^2)$. If $\sigma^2$ is known, the Z-test is recommended:

$$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$$

where $\bar{X}$ is the sample mean and $n$ is the sample size.

1.1 Nonparametric Tests

B. Two-Sample Mean Tests

Test $H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \ne \mu_2$ under two respective normal populations $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$. If $\sigma_1^2$ and $\sigma_2^2$ are unknown, a t-test is suggested.

Comparing Means of Two Populations ($\sigma_1^2$ and $\sigma_2^2$ unknown)

In most cases the variances are unknown.

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} \sim t_{n_1 + n_2 - 2}$$

where the pooled variance is

$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{(n_1 - 1) + (n_2 - 1)}$$

and

$\bar{X}_1$ = mean of the sample taken from population 1

$s_1^2$ = variance of the sample taken from population 1

$n_1$ = size of the sample taken from population 1

$\bar{X}_2$ = mean of the sample taken from population 2

$s_2^2$ = variance of the sample taken from population 2

$n_2$ = size of the sample taken from population 2

1.1 Nonparametric Tests

If the data are not normally distributed, the distribution of the t-statistic is unknown and depends on the distribution of the populations.

There is a huge number of possible underlying distributions.

Can we have tests that are distribution free? Nonparametric tests are tests of this kind.

Example 1.1 Delivery times

A local pizza restaurant located close to a college campus advertises that its delivery time to a college dormitory is shorter than that of a local branch of a national pizza chain.

To determine whether this advertisement is valid, you and some friends have decided to order 10 pizzas from the local pizza restaurant and 10 pizzas from the national chain, all at different times. The delivery times in minutes (data file PIZZATIME) are shown.

Example 1.1 Delivery times

Testing for the difference in the mean delivery times

Local          Chain
16.8  18.1     22.0  19.5
11.7  14.1     15.2  17.0
15.6  21.8     18.7  19.5
16.7  13.9     15.6  16.5
17.5  20.8     20.8  24.0

Example 1.1 Delivery times

We can use the t-test for this comparison if the delivery times are normally distributed.

Since the delivery times are not normally distributed, we may have difficulty justifying the t-test.
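For reference, the pooled t statistic that the t-test would use can be computed from the delivery times of Example 1.1. This is only a sketch using the Python standard library; the function name `pooled_t` is illustrative and not part of the slides, and the result is meaningful only if the normality assumption discussed above holds.

```python
from math import sqrt
from statistics import mean, variance

# Delivery times in minutes from Example 1.1.
local = [16.8, 11.7, 15.6, 16.7, 17.5, 18.1, 14.1, 21.8, 13.9, 20.8]
chain = [22.0, 15.2, 18.7, 15.6, 20.8, 19.5, 17.0, 19.5, 16.5, 24.0]


def pooled_t(x, y):
    """Two-sample t statistic with pooled variance, under H0: mu_1 = mu_2."""
    n1, n2 = len(x), len(y)
    # statistics.variance is the sample variance (divisor n - 1).
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))


t_stat = pooled_t(local, chain)
print(f"t = {t_stat:.3f} on {len(local) + len(chain) - 2} degrees of freedom")
```

A negative value means the local restaurant's sample mean is the smaller of the two.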

Example 1.1 Delivery times

We can compare the two restaurants order by order, recording "+" when the local delivery time is shorter than the chain's and "-" when it is longer:

Local    16.8  11.7  15.6  16.7  17.5  18.1  14.1  21.8  13.9  20.8
Chain    22.0  15.2  18.7  15.6  20.8  19.5  17.0  19.5  16.5  24.0
Result    +     +     +     -     +     +     +     -     +     +

1.2 Sign Test (符号检验)

If the two restaurants have the same level of delivery time, each comparison has a one-half chance of giving “+” and a one-half chance of giving “-”.

The number of “+”, denoted by T, follows the binomial distribution B(10; 0.5).

The number of “-” also follows the binomial distribution B(10; 0.5).

T = 8 in this example.

Review: Binomial Distribution

A. Bernoulli trials

A trial with only two outcomes (yes or no, success or failure, boy or girl, win or loss, 1 or 0), with respective probabilities p and 1-p, is called a Bernoulli trial.

B. Several Bernoulli trials

Let X be the number of successes in n independent and identical Bernoulli trials. The random variable X is said to follow a binomial distribution B(n; p).

Review: Binomial Distribution

C. Binomial probability distribution (二项概率分布)

The probability of X = k is given by

$$P(X = k) = \frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k}, \qquad k = 0, 1, \ldots, n$$

where

$P(X = k)$ = probability of k successes given n and p

n = number of observations

p = probability of success

1 - p = probability of failure

k = number of successes in the sample
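The binomial probability formula translates directly into a few lines of Python. The helper name `binom_pmf` is an illustrative choice, not something taken from the slides.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n; p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# The probabilities over k = 0, ..., n sum to 1.
total = sum(binom_pmf(k, 10, 0.5) for k in range(11))

# With p = 0.5 every sequence of outcomes has probability 1 / 2**n,
# so P(X = 8) for n = 10 equals C(10, 8) / 1024 = 45 / 1024.
p8 = binom_pmf(8, 10, 0.5)
print(total, p8)
```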

1.2 Sign Test: Example 1.1

One-tailed test

$$H_0: p = 0.5 \quad \text{vs} \quad H_1: p > 0.5$$

$$P(X \ge 8) = P(X = 8) + P(X = 9) + P(X = 10) = 0.055$$

$$p\text{-value} = P(X \ge 8) = 0.055$$

SPSS result:

Binomial Test (result)

           Category   N    Observed Prop.   Test Prop.   Exact Sig. (2-tailed)
Group 1    1.00       8    .80              .50          .109
Group 2    .00        2    .20
Total                 10   1.00

1.2 Sign Test: Example 1.1

Two-tailed test

$$H_0: p = 0.5 \quad \text{vs} \quad H_1: p \ne 0.5$$

$$P(X \ge 8) = P(X = 8) + P(X = 9) + P(X = 10) = 0.055$$

$$p\text{-value} = 2P(X \ge 8) = 0.11$$

SPSS result:

Binomial Test (result)

           Category   N    Observed Prop.   Test Prop.   Exact Sig. (2-tailed)
Group 1    1.00       8    .80              .50          .109
Group 2    .00        2    .20
Total                 10   1.00
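The exact sign test for Example 1.1 can be reproduced from the raw delivery times. The sketch below uses only the standard library; the helper `upper_tail` is an illustrative name, and doubling the upper tail for the two-tailed p-value assumes T lies above n/2, as it does here.

```python
from math import comb

local = [16.8, 11.7, 15.6, 16.7, 17.5, 18.1, 14.1, 21.8, 13.9, 20.8]
chain = [22.0, 15.2, 18.7, 15.6, 20.8, 19.5, 17.0, 19.5, 16.5, 24.0]

# "+" when the local delivery is faster; tied pairs (none here) are dropped.
diffs = [c - l for l, c in zip(local, chain) if c != l]
n = len(diffs)
t_plus = sum(d > 0 for d in diffs)


def upper_tail(t, n):
    """P(X >= t) for X ~ B(n; 0.5)."""
    return sum(comb(n, k) for k in range(t, n + 1)) / 2**n


p_one = upper_tail(t_plus, n)   # one-tailed, H1: p > 0.5
p_two = min(1.0, 2 * p_one)     # two-tailed, valid here because T > n / 2
print(n, t_plus, round(p_one, 4), round(p_two, 4))
```

The two p-values agree with the 0.055 and 0.11 quoted on the slides and with the SPSS exact significance of .109.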

Example 1.2: Product Preference

An Italian restaurant, close to a college campus, contemplated a new recipe for the sauce used on its pizza. A random sample of eight students was chosen, and each was asked to rate on a scale from 1 to 10 the tastes of the original sauce and the proposed new one. The scores of the taste comparison are:

Example 1.2: Product Preference

We can’t use the t-test for these data, as the scores are not normally distributed.

The statistic T, the number of “+”, follows B(7; 0.5), because the score difference for case “G” is zero and that case is dropped. This sample gives T = 2.

1.2 Sign Test: Example 1.2

One-tailed test

$H_0: p = 0.5$. There is no overall tendency to prefer one product to the other.

$H_1: p < 0.5$. A majority prefer the new product (or fewer than 50% prefer the old product).

$$p\text{-value} = P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.2266$$

SPSS result:

Binomial Test (VAR00005)

           Category   N    Observed Prop.   Test Prop.   Exact Sig. (2-tailed)
Group 1    .00        5    .71              .50          .453
Group 2    1.00       2    .29
Total                 7    1.00

1.2 Sign Test: Example 1.2

Two-tailed test

$$H_0: p = 0.5 \quad \text{vs} \quad H_1: p \ne 0.5$$

$$p\text{-value} = 2P(X \le 2) = 2 \times 0.2266 = 0.4532$$

Also, note that

$$p\text{-value} = P(X \le 2) + P(X \ge 5) = 0.4532$$

SPSS result:

Binomial Test (VAR00005)

           Category   N    Observed Prop.   Test Prop.   Exact Sig. (2-tailed)
Group 1    .00        5    .71              .50          .453
Group 2    1.00       2    .29
Total                 7    1.00

Review: Binomial Distribution

C. Properties of the binomial distribution

The expectation of B(n; p) is $\mu = np$.

The variance of B(n; p) is $\sigma^2 = np(1-p)$.

The standard deviation of B(n; p) is $\sigma = \sqrt{np(1-p)}$.

Review: Binomial Distribution

D. Normal Approximation (Section 6.4 of the book)

Let $X \sim B(n; p)$. Then

$$P(a \le X \le b) = P\left( \frac{a - np}{\sqrt{np(1-p)}} \le \frac{X - np}{\sqrt{np(1-p)}} \le \frac{b - np}{\sqrt{np(1-p)}} \right)$$

$$\approx P\left( \frac{a - np}{\sqrt{np(1-p)}} \le Z \le \frac{b - np}{\sqrt{np(1-p)}} \right)$$

$$= \Phi\left( \frac{b - np}{\sqrt{np(1-p)}} \right) - \Phi\left( \frac{a - np}{\sqrt{np(1-p)}} \right)$$

where $\Phi(\cdot)$ is the distribution function of $N(0, 1)$.

Example 1.3 Customer Sales

(Example 6.8, p. 213)

A saleswoman makes initial telephone contact with potential customers in an effort to assess whether a follow-up visit to their homes is likely to be worthwhile. Her experience suggests that 40% of the initial contacts lead to a follow-up visit. If she contacts 100 people by telephone, what is the probability that between 45 and 50 home visits will result?

Solution to Example 1.3: Customer Sales

Solution: Let X be the number of follow-up visits. Then X has a binomial distribution with n = 100 and p = 0.40. Approximating the required probability gives

$$P(45 \le X \le 50) \approx P\left( \frac{45 - (100)(0.4)}{\sqrt{(100)(0.4)(0.6)}} \le Z \le \frac{50 - (100)(0.4)}{\sqrt{(100)(0.4)(0.6)}} \right)$$

$$= P(1.02 \le Z \le 2.04) = \Phi(2.04) - \Phi(1.02) = 0.9793 - 0.8461 = 0.1332$$

This probability is shown as an area under the standard normal curve below.

Solution to Example 1.3: Customer Sales

[Figure: the probability shown as an area under the standard normal curve; horizontal axis: Number of Successes]

The continuity correction

Since the binomial distribution is discrete and the normal distribution is continuous, it is common practice to use a continuity correction in the approximation:

$$P(a \le X \le b) \approx \Phi\left( \frac{b + 0.5 - np}{\sqrt{np(1-p)}} \right) - \Phi\left( \frac{a - 0.5 - np}{\sqrt{np(1-p)}} \right)$$

Return to Example 1.3:

$$P(45 \le X \le 50) \approx \Phi\left( \frac{50 + 0.5 - (100)(0.4)}{\sqrt{(100)(0.4)(0.6)}} \right) - \Phi\left( \frac{45 - 0.5 - (100)(0.4)}{\sqrt{(100)(0.4)(0.6)}} \right)$$

$$= \Phi(2.14) - \Phi(0.92) = 0.9838 - 0.8212 = 0.1626$$
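The normal approximation for Example 1.3, with and without the continuity correction, can be checked numerically. This sketch relies on `statistics.NormalDist` from the standard library (Python 3.8+); the function name `normal_approx` is illustrative. Because it keeps the z-values unrounded, its output may differ somewhat from hand calculations based on rounded z-values and table lookups.

```python
from math import sqrt
from statistics import NormalDist

n, p = 100, 0.40
mu = n * p
sigma = sqrt(n * p * (1 - p))
phi = NormalDist().cdf  # standard normal distribution function


def normal_approx(a, b, correction=0.0):
    """Approximate P(a <= X <= b) for X ~ B(n; p) by a normal probability."""
    lower = (a - correction - mu) / sigma
    upper = (b + correction - mu) / sigma
    return phi(upper) - phi(lower)


plain = normal_approx(45, 50)
corrected = normal_approx(45, 50, correction=0.5)
print(f"without correction: {plain:.4f}")
print(f"with correction:    {corrected:.4f}")
```

The correction widens the interval by half a unit on each side, so the corrected probability is always the larger of the two.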

1.2 Sign test: normal approximation

$$\mu = np = 0.5n, \qquad \sigma = \sqrt{np(1-p)} = 0.5\sqrt{n}$$

The approximate test statistic is

$$Z = \frac{T^* - \mu}{\sigma} = \frac{T^* - 0.5n}{0.5\sqrt{n}}$$

where $T^*$ is T corrected for continuity, defined as follows:

a. For a two-tail test: $T^* = T + 0.5$ if $T < \mu$, and $T^* = T - 0.5$ if $T > \mu$

b. For an upper-tail test: $T^* = T - 0.5$

c. For a lower-tail test: $T^* = T + 0.5$


Example 1.4 Ice Cream

Solution:

Use the normal approximation equations:

Example 1.4 Ice Cream

$$T^* = 40.5, \quad \text{since } 40 < 48$$

$$Z = \frac{T^* - \mu}{\sigma} = \frac{40.5 - 48}{4.899} = -1.53$$

$$p\text{-value} = 2 \times 0.0630 = 0.126$$

The SPSS output:

Binomial Test (VAR00002)

           Category   N    Observed Prop.   Test Prop.   Asymp. Sig. (2-tailed)
Group 1    56.00      56   .58              .50          .125 (a)
Group 2    40.00      40   .42
Total                 96   1.00

a. Based on Z Approximation.
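The continuity-corrected statistic for the sign test can be sketched as a small function and checked against Example 1.4 (T = 40, n = 96). The name `sign_test_z` and its `tail` argument are illustrative, and leaving T unchanged when it equals its mean in the two-tail case is an assumption, since the slides do not define that case.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_z(t, n, tail="two"):
    """Continuity-corrected z statistic for the sign test under H0: p = 0.5."""
    mu = 0.5 * n
    sigma = 0.5 * sqrt(n)
    if tail == "upper":
        t_star = t - 0.5
    elif tail == "lower":
        t_star = t + 0.5
    elif t < mu:       # two-tail test, T below its mean
        t_star = t + 0.5
    elif t > mu:       # two-tail test, T above its mean
        t_star = t - 0.5
    else:              # T equals its mean (assumed: no correction)
        t_star = t
    return (t_star - mu) / sigma


# Example 1.4: T = 40 out of n = 96 non-tied observations.
z = sign_test_z(40, 96)
p_two = 2 * NormalDist().cdf(-abs(z))
print(f"z = {z:.2f}, two-tailed p-value = {p_two:.3f}")
```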

1.3 Sign test for single population median

Example 1.5

The dean of the School of Business Administration at a particular university would like information about the starting incomes of recent college graduates. A random sample of 23 recent graduates indicated the following starting salaries:

29250 29900 28070 31400 31100 29000 33000 50000 28500 31000
34800 42100 33200 36000 65800 34000 29900 32000 31500 29900
32890 36000 35000

Do the data indicate that the median starting income differs from $35,000?

Solution:

$$H_0: \text{Median} = \$35{,}000 \quad \text{vs} \quad H_1: \text{Median} \ne \$35{,}000$$

Solution to Example 1.5

Since the distribution of incomes is often skewed, the sign test is recommended. If the null hypothesis is true, each income has a one-half chance of falling below $35,000. Let T be the number of incomes below $35,000.

n = 23 - 1 = 22, since one observation equals $35,000. T = 17.

$$\mu = np = 0.5n = 0.5 \times 22 = 11$$

$$\sigma = 0.5\sqrt{22} = 2.345$$

$$Z = \frac{T - 0.5 - 11}{2.345} = 2.35$$

$$p\text{-value} = 2 \times 0.0094 = 0.0188$$

SPSS output to Example 1.5

Binomial Test (VAR00001)

           Category    N    Observed Prop.   Test Prop.   Exact Sig. (2-tailed)
Group 1    <= 35000    17   .77              .50          .017
Group 2    > 35000     5    .23
Total                  22   1.00
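The counts and the exact two-tailed p-value for Example 1.5 can be recomputed from the 23 salaries. This is a sketch with illustrative variable names; it drops the observation equal to the hypothesized median and doubles the tail at or beyond the larger count.

```python
from math import comb

salaries = [
    29250, 29900, 28070, 31400, 31100, 29000, 33000, 50000, 28500, 31000,
    34800, 42100, 33200, 36000, 65800, 34000, 29900, 32000, 31500, 29900,
    32890, 36000, 35000,
]
m0 = 35000  # hypothesized median

below = sum(s < m0 for s in salaries)
above = sum(s > m0 for s in salaries)
n = below + above  # observations equal to m0 are dropped

# Exact two-tailed p-value under B(n; 0.5): double the tail probability
# at or beyond the larger of the two counts.
t = max(below, above)
tail = sum(comb(n, k) for k in range(t, n + 1)) / 2**n
p_exact = min(1.0, 2 * tail)
print(below, above, n, round(p_exact, 3))
```

The exact p-value is slightly smaller than the 0.0188 obtained from the normal approximation and matches the SPSS exact significance.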

1.4 Wilcoxon Rank Sum Test

Testing whether two populations are identical

Take a sample of size $n_1$ from the first population, with distribution function $F_1(x)$, and a sample of size $n_2$ from the second population, with distribution function $F_2(x)$.

We want to test

$$H_0: F_1 = F_2 \quad \text{vs} \quad H_1: F_1 \ne F_2$$

1.4 Wilcoxon Rank Sum Test

The sign test does not use all the information from the data set.

The sign test for the delivery times in Example 1.1 uses only the direction of each difference and ignores its size. The Wilcoxon rank sum test provides a method to incorporate information about the magnitude of the differences between two populations.

1.4 Wilcoxon Rank Sum Test

The two samples are pooled and sorted in ascending order. Each observation receives its rank in the pooled sample; tied observations share the average of the ranks they occupy.

Let T denote the sum of the ranks of the observations from the first population.

Wilcoxon Rank Sum Test: Example 1.1

Sort the Local data: 11.7, 13.9, 14.1, 15.6, 16.7, 16.8, 17.5, 18.1, 20.8, 21.8

Sort the Chain data: 15.2, 15.6, 16.5, 17.0, 18.7, 19.5, 19.5, 20.8, 22.0, 24.0

Sort the mixed data:

Rank     1     2     3     4     5     6     7     8     9     10
Local    11.7  13.9  14.1        15.6              16.7  16.8
Chain                      15.2        15.6  16.5              17.0

Rank     11    12    13    14    15    16    17    18    19    20
Local    17.5  18.1                    20.8        21.8
Chain                18.7  19.5  19.5        20.8        22.0  24.0

Wilcoxon Rank Sum Test: Example 1.1

Sum of the ranks

$$T_{\text{local}} = 1 + 2 + 3 + 5.5 + 8 + 9 + 11 + 12 + 16.5 + 18 = 86$$

Test statistic

$$T = T_{\text{local}} = 86$$

Normal approximation

$$E(T) = \frac{10(10 + 10 + 1)}{2} = 105$$

$$\mathrm{Var}(T) = \frac{10 \times 10 \times (10 + 10 + 1)}{12} = 175$$

$$Z = \frac{86 - 105}{\sqrt{175}} = -1.4363$$

$$p\text{-value} = 2 \times 0.0755 = 0.151$$
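The rank sum calculation for Example 1.1 can be sketched in Python as follows. The helper `average_ranks` is an illustrative name; it assigns tied observations the average of the ranks they occupy, and the variance is the one used on the slides, without a correction for ties.

```python
from math import sqrt
from statistics import NormalDist

local = [16.8, 11.7, 15.6, 16.7, 17.5, 18.1, 14.1, 21.8, 13.9, 20.8]
chain = [22.0, 15.2, 18.7, 15.6, 20.8, 19.5, 17.0, 19.5, 16.5, 24.0]


def average_ranks(values):
    """Map each distinct value to its rank in `values`, averaging over ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # 1-based positions i + 1 .. j hold the same value; use their mean.
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks


ranks = average_ranks(local + chain)
t = sum(ranks[x] for x in local)  # rank sum of the first sample

n1, n2 = len(local), len(chain)
mean_t = n1 * (n1 + n2 + 1) / 2
var_t = n1 * n2 * (n1 + n2 + 1) / 12  # not corrected for ties
z = (t - mean_t) / sqrt(var_t)
p_two = 2 * NormalDist().cdf(-abs(z))
print(t, mean_t, var_t, round(z, 4), round(p_two, 3))
```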

SPSS output to Example 1.1

Ranks (time)

group    N    Mean Rank   Sum of Ranks
local    10   8.60        86.00
chain    10   12.40       124.00
Total    20

Test Statistics (b) (time)

Mann-Whitney U                    31.000
Wilcoxon W                        86.000
Z                                 -1.438
Asymp. Sig. (2-tailed)            .150
Exact Sig. [2*(1-tailed Sig.)]    .165 (a)

a. Not corrected for ties.
b. Grouping Variable: group

Example 1.6

Solution:

$$E(T) = \frac{80(80 + 80 + 1)}{2} = 6440$$

$$\mathrm{Var}(T) = \frac{80 \times 80 \times (80 + 80 + 1)}{12} = 85867$$

$$Z = \frac{7287 - 6440}{\sqrt{85867}} = 2.89$$

$$p\text{-value} = 2 \times 0.0019 = 0.0038$$