Lecture 8: More Hypothesis Testing (faculty.nps.edu/rdfricke/Business_Stats/lecture8.pdf)

Upload: phamdang

Post on 16-Mar-2018

Business Statistics

Lecture 8: More Hypothesis Testing

Goals for this Lecture

• Review of t-tests

• Additional hypothesis tests

• Two-sample tests

• Paired tests

The Basic Idea of Hypothesis Testing

• Start with a theory or hypothesis

• For example, μ = 814.3

• Collect some data

• Ask: How unusual is it to see this data if the null hypothesis is true?

• If it’s unusual, reject the null hypothesis

• If not, fail to reject the null

• Remember, determine the hypothesis to be tested before looking at the data

It All Ties Back to the Empirical Rule

• If we hypothesize that the data come from a N(0,1) distribution, how unusual an observation must we see to reject our hypothesis?

It depends on the alternative hypothesis…

[Figure: standard normal curve over Z from -4 to 4, with the central 68% and 95% regions marked]

For Example, a Two-sided Test

Null: The mean is equal to zero (H0: μ = 0)

Alternative: The mean is not equal to zero (Ha: μ ≠ 0)

If the rejection criterion is p-value < 0.05, we reject if our observation is greater than 1.96 or less than -1.96.

[Figure: standard normal curve over Z from -4 to 4, with the central 68% and 95% regions marked]
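
The 1.96 cutoff and the matching two-sided p-value can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `two_sided_z_test` and the example observation 2.3 are illustrative assumptions, not part of the lecture.

```python
from statistics import NormalDist

def two_sided_z_test(z, alpha=0.05):
    """Return (critical value, two-sided p-value, reject?) for an observed z-score."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Split alpha evenly between the two tails of the N(0,1) distribution.
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # Probability of an observation at least this far from 0 in either direction.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return critical, p_value, p_value < alpha

# Hypothetical observation, chosen only for illustration.
critical, p_value, reject = two_sided_z_test(2.3)
print(f"critical value = {critical:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Rejecting when the p-value is below 0.05 is the same decision as rejecting when the observation lies beyond the critical value in either tail.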

In JMP

• JMP computes the probability of seeing data as extreme or more extreme under various alternative hypotheses

• You have to choose the appropriate p-value

• Then compare the JMP p-value to 0.05

• Smaller: reject the null

• Larger: fail to reject the null

• Output is in terms of rescaled “t-scores”

• Using the t distribution comes from using s to estimate σ

Conducting the Test in JMP

• With one continuous variable, Analyze > Distribution > red triangle > Test Mean

• Type in the mean to be tested (“Specify Hypothesized Mean”)

• If the population (“true”) standard deviation is known, enter it

• This will be a z-test

• If you leave it blank, JMP does a t-test

• It uses s to estimate σ
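
Outside JMP, the difference between the z-test and the t-test comes down to which standard deviation goes into the denominator. The sketch below is an assumption-laden illustration in standard-library Python: the helper `one_sample_statistic`, the sample values, and the "known" sigma of 0.1 are all invented for the example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def one_sample_statistic(data, mu_star, sigma=None):
    """Rescaled statistic for H0: mu = mu_star.

    If sigma (the population SD) is supplied, the result is a z-score.
    Otherwise the sample SD s stands in for sigma and the result is a
    t-score with n - 1 degrees of freedom.
    """
    n = len(data)
    spread = sigma if sigma is not None else stdev(data)
    return (mean(data) - mu_star) / (spread / sqrt(n))

# Hypothetical measurements, invented for illustration only.
sample = [1.31, 1.18, 1.25, 1.40, 1.09, 1.22, 1.35, 1.27]

z = one_sample_statistic(sample, mu_star=1.2, sigma=0.1)
z_p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value for the z-test
t = one_sample_statistic(sample, mu_star=1.2)   # refer this to a t distribution
print(f"z = {z:.3f} (p = {z_p_value:.3f}); t = {t:.3f} on {len(sample) - 1} df")
```

The t-score needs a t distribution with n - 1 degrees of freedom for its p-value, which the standard library does not provide; statistical software such as JMP handles that step.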

Back to the Paint Case (primer.jmp)

• A More Complicated Question:

• Suppose we are less interested in the value of 1.2 and more interested in whether processes “a” and “b” have the same mean

• Null hypothesis

• Means are the same: μa − μb = 0

• Alternative hypothesis

• Means are different: μa − μb ≠ 0

Solution: Two-sample t-test

• Process “a”: population X with mean μx and SD σx; random sample X1, X2, …, Xn

• Process “b”: population Y with mean μy and SD σy; random sample Y1, Y2, …, Ym

• Two sample t-test assumes Xs and Ys are independent

Results of Two Sample t-test

• What do you think the test statistic is?

• How should we rescale the test statistic?

• What does the p-value represent?

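Two-sample t-test

• Null Hypothesis: $\mu_x - \mu_y = 0$

• Test Statistic: $\bar{X} - \bar{Y}$

• Fact: since $\bar{X}$ and $\bar{Y}$ are independent:

$$\mathrm{Var}(\bar{X} - \bar{Y}) = \mathrm{Var}(\bar{X}) + \mathrm{Var}(\bar{Y}) = \frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}$$

• So

$$\mathrm{SE}(\bar{X} - \bar{Y}) = \sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}$$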
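
The standard error of the difference between two independent sample means can be checked by simulation. This is a rough sketch in standard-library Python under assumed values: the population SDs, sample sizes, number of repetitions, and seed are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import stdev

def simulated_se_of_difference(sigma_x, sigma_y, n, m, reps=4000, seed=1):
    """Empirical SD of (X-bar minus Y-bar) over many independent sample pairs."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        # Draw one sample of size n from X and one of size m from Y, independently.
        x_bar = sum(rng.gauss(0.0, sigma_x) for _ in range(n)) / n
        y_bar = sum(rng.gauss(0.0, sigma_y) for _ in range(m)) / m
        diffs.append(x_bar - y_bar)
    return stdev(diffs)

sigma_x, sigma_y, n, m = 2.0, 3.0, 25, 40   # arbitrary illustrative values
formula_se = sqrt(sigma_x**2 / n + sigma_y**2 / m)
empirical_se = simulated_se_of_difference(sigma_x, sigma_y, n, m)
print(f"formula: {formula_se:.3f}, simulation: {empirical_se:.3f}")
```

The two numbers should agree closely because the samples are drawn independently; with dependent samples the formula would no longer apply.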

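Rescaled Test Statistic

• Test statistic: $\bar{X} - \bar{Y}$

• Estimated standard error (with $n_x$ and $n_y$ the two sample sizes):

$$\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}$$

• Rescaled test statistic:

$$t = \frac{(\bar{X} - \bar{Y}) - 0}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}}$$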
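
The rescaled two-sample statistic is short enough to compute by hand. The following is a minimal sketch in standard-library Python; the function name `two_sample_t` and both data sets are hypothetical and are not taken from primer.jmp.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(x, y):
    """Rescaled statistic for H0: mu_x - mu_y = 0, using separate variance estimates."""
    n_x, n_y = len(x), len(y)
    # Estimated standard error of X-bar minus Y-bar (sample variances replace the sigmas).
    se = sqrt(variance(x) / n_x + variance(y) / n_y)
    return (mean(x) - mean(y) - 0) / se

# Hypothetical measurements for two processes, invented for illustration.
process_a = [1.31, 1.18, 1.25, 1.40, 1.09, 1.22]
process_b = [1.12, 1.05, 1.20, 1.16, 0.98, 1.10, 1.08]
print(f"t = {two_sample_t(process_a, process_b):.3f}")
```

Swapping the two samples flips the sign of t but not its magnitude, which is why a two-sided test does not depend on which process is labeled X.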

Remember: Rescaling

• For some test statistic T where μ and σ are not known, compute

$$t = \frac{T - \mu^*}{s_T}$$

where

• $\mu^*$ is the hypothesized true value

• $s_T$ is the sample standard error of the statistic T

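One-sample and Two-sample Tests

• In a one-sample test of μ, choose μ*

• Then $T = \bar{X}$, so the test statistic is

$$t = \frac{T - \mu^*}{\mathrm{s.d.}(T)} = \frac{\bar{X} - \mu^*}{\mathrm{s.e.}(\bar{X})}$$

• In a two-sample test, you’re often testing whether the means are equal

• $T = \bar{X} - \bar{Y}$, and the test statistic is

$$t = \frac{T - \mu^*}{\mathrm{s.d.}(T)} = \frac{(\bar{X} - \bar{Y}) - 0}{\mathrm{s.e.}(\bar{X} - \bar{Y})} = \frac{\bar{X} - \bar{Y}}{\mathrm{s.e.}(\bar{X} - \bar{Y})}$$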

Equal Variances?

• We must estimate σx and σy

• If σx = σy then we can get a better estimate

• Remember: Sample variance for a single sample:

$$s^2 = \frac{1}{n-1} \sum_{j=1}^{n} (x_j - \bar{x})^2$$

• Here $\bar{x}$ is the sample mean, $(x_j - \bar{x})$ are the deviations from the sample mean, and $s^2$ is the average squared deviation from the mean

Different Means But Similar SD

• Remember, SD is calculated using differences from the mean

• Each group can have very different mean but standard deviations can be similar

[Figure: Different Means But Similar SD; axis running from -3 to 6]

More Bang for the Buck

• Pooled estimate of sample variance:

$$s_p^2 = \frac{\sum_{j=1}^{n} (x_j - \bar{x})^2 + \sum_{j=1}^{m} (y_j - \bar{y})^2}{(n-1) + (m-1)}$$

• $\bar{x}$ is the sample mean for process a and $\bar{y}$ is the sample mean for process b

• Average squared deviation from different means

• Used two degrees of freedom, n+m-2 left over

• Pooled estimate buys you more df

• Weighted average of $s_x^2$ and $s_y^2$
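
The pooled estimate and the resulting t statistic can be sketched in a few lines of standard-library Python. The function names and the use of $s_p^2 (1/n + 1/m)$ for the squared standard error are assumptions of this sketch (the latter is the usual pooled form, not shown explicitly on the slide); the small data sets are invented.

```python
from math import sqrt
from statistics import mean, variance

def pooled_variance(x, y):
    """Pooled variance: squared deviations from each group's own mean, over n + m - 2."""
    n, m = len(x), len(y)
    x_bar, y_bar = mean(x), mean(y)
    ss_x = sum((xj - x_bar) ** 2 for xj in x)
    ss_y = sum((yj - y_bar) ** 2 for yj in y)
    return (ss_x + ss_y) / ((n - 1) + (m - 1))

def pooled_t(x, y):
    """Return (t, df) for H0: mu_x - mu_y = 0 under the equal-variance assumption."""
    n, m = len(x), len(y)
    se = sqrt(pooled_variance(x, y) * (1 / n + 1 / m))
    return (mean(x) - mean(y)) / se, n + m - 2

# Hypothetical data, invented for illustration.
x, y = [1, 2, 3], [2, 4, 6, 8]
weighted = ((len(x) - 1) * variance(x) + (len(y) - 1) * variance(y)) / (len(x) + len(y) - 2)
t, df = pooled_t(x, y)
print(f"pooled variance = {pooled_variance(x, y):.2f}, weighted average = {weighted:.2f}")
print(f"t = {t:.3f} on {df} df")
```

The two printed variances match, which is the "weighted average" bullet in numeric form: each sample variance is weighted by its own degrees of freedom.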

Conducting the Test in JMP

• Need two variables: one continuous and one categorical (denoting group)

• Then: Analyze > Fit Y by X (continuous variable is the Y and categorical the X) > red triangle > Means/Anova/Pooled t

• See the “t Test” part of the output

Case: Taste Testing Teas

• Small taste test of teas (taste.jmp)

• 16 panelists in a focus group

• Each tasted two formulations of a prepackaged iced tea

• Rated them on a scale of 1 (excellent) to 7 (really bad)

• Company wants to know if there is a difference in ratings between the two formulations

An Initial Evaluation

• Two-sample t-test on taste.jmp:

• Is there a significant difference?

Taste Case: Any Difference?

• Unless the SDs are vastly different (a factor of 2), the equal variance assumption is no big deal

Independence Assumption Very Important

• Independence assumption for two sample t-test is violated

• Good news: there is an alternate test that can do even better

• Paired t-test assumes two observations taken for each unit in the sample

• Observations on the same unit are likely to be more similar than observations on different units

• Here the same person tasted each formulation

Paired t-test Looks at Differences

• Calculate differences for each observation: x1 − y1 = d1, x2 − y2 = d2, …, xn − yn = dn

• Calculate sample mean and SD of differences

• Do a one sample t-test for differences:

• H0: mean difference is zero

• Ha: mean difference is not 0
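
The three steps above can be sketched directly in standard-library Python. The function name `paired_t` and the ratings below are hypothetical; they are not the taste.jmp data.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """One-sample t statistic on d_j = x_j - y_j, testing H0: mean difference = 0."""
    if len(x) != len(y):
        raise ValueError("paired data need one x and one y per unit")
    d = [xj - yj for xj, yj in zip(x, y)]   # step 1: differences
    n = len(d)
    d_bar, s_d = mean(d), stdev(d)          # step 2: mean and SD of the differences
    se = s_d / sqrt(n)
    return (d_bar - 0) / se, n - 1          # step 3: one-sample t with n - 1 df

# Hypothetical ratings (scale of 1 to 7) from six panelists, invented for illustration.
tea_1 = [2, 5, 3, 6, 4, 1]
tea_2 = [3, 6, 3, 7, 6, 2]
t, df = paired_t(tea_1, tea_2)
print(f"t = {t:.3f} on {df} df")
```

The degrees of freedom come from the number of pairs, not the total number of ratings, because the test only ever sees the differences.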

Paired t-test in JMP

• Use Analyze > Matched Pairs

• Two variables, paired by row

Results: Paired t-test in JMP

• Mean difference is the same as in the two sample test

• SE is smaller. Why?

Why Pairing Helps

• Heuristic:

• When xj and yj “vary together” then yj will be big when xj is big

• Since xj & yj tend to be close together, xj − yj is smaller than when X and Y are independent

• Math:

• When $\bar{X}$ and $\bar{Y}$ are not independent then

$$\mathrm{Var}(\bar{X} - \bar{Y}) = \mathrm{Var}(\bar{X}) + \mathrm{Var}(\bar{Y}) - 2\,\mathrm{Cov}(\bar{X}, \bar{Y})$$

• Cov or “covariance” measures linear dependence between two variables
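
The same identity holds for sample variances and the sample covariance of the individual observations, which makes it easy to check on a small data set. This is an illustrative sketch in standard-library Python; the helper `covariance` and the paired values are assumptions made for the example.

```python
from statistics import mean, variance

def covariance(x, y):
    """Sample covariance of paired observations (divides by n - 1)."""
    x_bar, y_bar = mean(x), mean(y)
    return sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y)) / (len(x) - 1)

# Hypothetical paired observations that tend to move together.
x = [2, 5, 3, 6, 4, 1]
y = [3, 6, 3, 7, 6, 2]
d = [xj - yj for xj, yj in zip(x, y)]

var_of_differences = variance(d)
via_identity = variance(x) + variance(y) - 2 * covariance(x, y)
if_independent = variance(x) + variance(y)   # what the two-sample formula assumes

print(f"Var(d) = {var_of_differences:.2f}, identity = {via_identity:.2f}, "
      f"ignoring covariance = {if_independent:.2f}")
```

With positive covariance the variance of the differences is far smaller than the sum of the two variances, which is why the paired standard error shrinks.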

It Helps in this Case Because…

• People first have a like or dislike for tea

• Their ratings of the formulations are relative to this overall opinion of tea

• Taking the difference removes the “person effect”

[Figure: scatterplot of Taste 1 (axis 0 to 7) against Taste 2 (axis 0 to 8), with regions annotated “Tend to dislike tea” and “Tend to like tea”]

Independence vs. Dependence

[Figure: two side-by-side scatterplots with X on the horizontal axis, axes running from about -3 to 3]

• xj-yj is horizontal distance to the y=x line

• xj-yj is smaller (typically) in the right hand plot

Case: Sales Force Comparison

• Newly merged pharmaceutical company (PharmSal.jmp)

• Two sales forces (“BW” & “GL”), one from each of the merged companies

• Both sales forces cover the same 20 sales districts

• Sales reps divided into these districts

• Sell essentially the same drugs

• Management wants to know if one sales force outperforms the other

Sales by Division

[Figure: Sales (axis 100 to 550) by Division (BW, GL)]

Quantiles

Level   Minimum   10%     25%      Median   75%      90%     Maximum
BW      112       151.1   215.25   291      385.5    428.5   525
GL      119       151.6   197.75   313.5    409.75   460.6   547

Two-Sample t-test Results

[Figure: Sales (axis 100 to 550) by Division (BW, GL)]

• Under the independence assumption, we conclude that there is no difference in the means

• But are they independent?

The Sales Forces Are Dependent

• Dependence occurs by sales district:

Paired t-test Comparison

• Which sales force is doing better?

More Complicated Tests

• There are even more complicated tests you can do

• E.g., test for equal variance

• You’re never going to remember all the steps for each test anyway

• Let the computer do it for you

Terminology

• One-sided vs. two-sided

• Comes from the statement of the alternative hypothesis

• Are you calculating the p-value using one tail or two?

• One-sample vs. two-sample

• Comes from the type of data and the question you are answering

• Are you testing a mean or a difference between means?

Which Test?

• How many populations are sampled?

• One: one-sample test

• Two: read on

• Are observations in first sample independent of observations in second sample?

• Yes: two-sample t-test

• No: paired t-test

• Big Clue:

• Paired t-test needs two observations from each unit

• Unequal sample sizes: two-sample test

• Equal sample sizes: you have to decide
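
The decision rules above can be written as a small helper. This is only a sketch of the slide's logic in Python; the function name `which_test` and its parameters are hypothetical, and the independence judgment still has to come from knowing how the data were collected.

```python
def which_test(populations, independent_samples=None):
    """Pick a test from the number of populations sampled and sample independence."""
    if populations == 1:
        return "one-sample test"
    if populations != 2:
        raise ValueError("only one or two populations are covered here")
    if independent_samples is None:
        # Equal sample sizes alone do not settle this; the analyst has to decide.
        raise ValueError("decide whether the two samples are independent")
    return "two-sample t-test" if independent_samples else "paired t-test"

print(which_test(1))
print(which_test(2, independent_samples=True))
print(which_test(2, independent_samples=False))
```

For example, the tea ratings come from the same panelists, so the samples are not independent and the helper returns the paired t-test.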

Hypothesis Tests in the Computer Age

• Know the null and alternative hypotheses

• Have some idea of what test statistics you would look at

• Let the computer figure out how to rescale them

• Let the computer figure out the p-value

• p-values are always interpreted the same way

What we have learned so far…

• Descriptive Statistics

• Probability

• Inference for a population mean

• Confidence intervals

• Hypothesis testing

• One-sample test of the mean

• Two-sample tests

• Paired tests