Chi-Square Analyses

Upload: audrey-farmer

Post on 17-Jan-2018


Page 1: Chi-Square Analyses

Chi-Square Analyses

Page 2: Chi-Square Analyses

Outline of Today’s Discussion

1. The Chi-Square Test of Independence – Introduction

2. The Chi-Square Test of Independence – Excel

3. The Chi-Square Test of Independence – SPSS

4. The Chi-Square Test for Goodness of Fit - Introduction

5. The Chi-Square Test for Goodness of Fit – Excel

Please refrain from typing, surfing or printing during our conversation!

Page 3: Chi-Square Analyses

Part 1

Chi-Square Test of Independence (Introduction)

Page 4: Chi-Square Analyses

Chi-Square: Independence

1. The chi-square is a non-parametric test. It’s NOT based on a mean, and it does not require that the data are bell-shaped (i.e., Gaussian distributed).

2. We can use the Chi-square test for analyzing data from certain between-subjects designs. Will someone remind us about between-subject versus within-subject designs?

3. Chi-square tests are appropriate for the analysis of categorical data (i.e., on a nominal scale).

Page 5: Chi-Square Analyses

Chi-Square: Independence

1. Sometimes a behavior can be described only in an all-or-none manner.

2. Example: Maybe a particular behavior was either observed or not observed.

3. Example: Maybe a participant either completed an assigned task, or did not.

4. Example: Maybe a participant either solved a designated problem, or did not.

Page 6: Chi-Square Analyses

Chi-Square: Independence

1. We can use the Chi-square to test data sets that simply reflect how frequently a particular category of behavior is observed.

2. The Chi-square test of independence is also called the two-way chi-square.

3. The Chi-square test of independence requires that two variables are assessed for each participant.

Page 7: Chi-Square Analyses

Chi-Square: Independence

1. The Chi-square test is based on a comparison between values that are observed (O), and values that would be expected (E) if the null hypothesis were true.

2. The null hypothesis would state that there is no relationship between the two variables, i.e., that the two variables are independent of each other.

3. The chi-square test allows us to determine if we should reject or retain the null hypothesis…

Page 8: Chi-Square Analyses

Chi-Square: Independence

1. To calculate the chi-square statistic, we need to develop a so-called “contingency table”.

2. In the contingency table, the levels of one variable are displayed across rows, and the levels of the other variable are displayed across columns.

3. Let’s see a simple 2 x 2 design…

Page 9: Chi-Square Analyses

Chi-Square: Independence

Contingency Table: 2 rows by 2 columns

The “marginal frequencies” are the row totals and column totals for each level of a particular variable.

[Table: City (rows: Minneapolis, Atlanta) by Political Party (columns: Democrat, Republican)]

Page 10: Chi-Square Analyses

Chi-Square: Independence

1. A “cell” in the table is defined as a unique combination of variables (e.g., city, political party).

2. For each cell in the contingency table, we need to calculate the expected frequency.

3. To get the expected frequency for a cell, we use the following formula…

Page 11: Chi-Square Analyses

Chi-Square: Independence

The expected (E) frequency of a cell:

E = (row total × column total) / N, where N is the total number of observations.

Example

Page 12: Chi-Square Analyses

Chi-Square: Independence

Does everyone now understand where this 28 came from?

[Table: City (rows: Minneapolis, Atlanta) by Political Party (columns: Democrat, Republican)]
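As a rough sketch of the standard expected-frequency rule (row total times column total, divided by the grand total), here is a small Python example for a 2 x 2 table. The observed counts are made up for illustration and are NOT the counts from the slide, and the function name `expected_frequencies` is just a label chosen here.

```python
def expected_frequencies(observed):
    """Return the expected count for every cell of a contingency table.

    E = (row total * column total) / grand total
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical counts (NOT the slide's data): rows = city, columns = party.
observed = [
    [30, 20],  # Minneapolis: Democrat, Republican
    [10, 40],  # Atlanta:     Democrat, Republican
]
expected = expected_frequencies(observed)
print(expected)
```

Each expected count depends only on the marginal frequencies, which is why cells that share a column can end up with the same expected value when the row totals are equal.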

Page 13: Chi-Square Analyses

Chi-Square: Independence

Here’s the Chi-square statistic:

χ² = Σ [ (O − E)² / E ]

Let’s define the components…

Page 14: Chi-Square Analyses

Chi-Square: Independence

Components of the Chi-Square Statistic

Page 15: Chi-Square Analyses

Chi-Square: Independence

We’ll need one of these for each cell in our contingency table.

Then, we’ll sum those up!

Page 16: Chi-Square Analyses

Chi-Square: Independence

Check: Be sure to have one of these for each cell in your contingency table.

We’ll reduce them, then sum them…

Page 17: Chi-Square Analyses

Chi-Square: Independence

Finally, for each cell, reduce the parenthetical expression to a single number, and sum those up.
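The "one term per cell, then sum" recipe can be sketched in Python as follows. Each term is (O - E)^2 / E. The observed and expected counts below are hypothetical (they are not the lecture's table), and `chi_square` is a name chosen only for this illustration.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over every cell of the table."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e  # one parenthetical term per cell
    return total


# Hypothetical observed counts, with the expected counts they imply.
observed = [[30, 20], [10, 40]]
expected = [[20.0, 30.0], [20.0, 30.0]]
statistic = chi_square(observed, expected)
print(round(statistic, 2))
```

A 2 x 2 table contributes exactly four terms to the sum, one for each cell.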

Page 18: Chi-Square Analyses

Chi-Square: Independence

1. After calculating the Chi-square statistic, we need to compare it to a “critical value” to determine whether to reject or retain the null hypothesis.

2. The critical value depends on the alpha level. What does the alpha level indicate, again?

3. The critical value also depends on the “degrees of freedom”, which is directly related to the number of levels being tested…

Page 19: Chi-Square Analyses

Chi-Square: Independence

Formula for the “degrees of freedom”:

df = (R - 1)(C - 1), where R = # of rows and C = # of columns

In our example, we have 2 rows and 2 columns, so

df = (2 - 1)(2 - 1)

df = 1

Page 20: Chi-Square Analyses

Chi-Square: Independence

1. We will soon attempt to develop some intuitions about the “degrees of freedom” (df), and why they are important.

2. For now, we will simply compute the df so that we can determine the critical value.

3. For df = 1, and an alpha level of 0.05, what is the critical value? (see the hand-out showing the critical values table).

4. How does the critical value compare to the value of chi-square that we obtained (i.e., 6.43)?

5. So, what do we decide about the null hypothesis?
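The decision step can be sketched in a few lines of Python. The critical values are the usual table entries for an alpha level of 0.05, typed in by hand; the dictionary and function names are assumptions made for this illustration, and 6.43 is the chi-square value obtained in the lecture example.

```python
# Critical values of chi-square at alpha = 0.05, copied from a standard table.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def degrees_of_freedom(rows, cols):
    """df = (R - 1)(C - 1) for a test of independence."""
    return (rows - 1) * (cols - 1)


df = degrees_of_freedom(2, 2)
critical = CRITICAL_05[df]
chi_square = 6.43  # the value obtained in the lecture example

# Reject the null hypothesis only if the statistic beats the critical value.
decision = "reject" if chi_square > critical else "retain"
print(df, critical, decision)
```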

Page 21: Chi-Square Analyses

Chi-Square: Independence

1. Congratulations! You’ve completed your first try at hypothesis testing!

2. In a way, the computations are somewhat similar to the various “r” statistics you’ve previously calculated.

3. However, we had not previously compared our “r” statistics to a critical value. So, we had not previously drawn any conclusions about statistical significance.

4. Questions so far?

Page 22: Chi-Square Analyses

Chi-Square: Independence

1. Before we move on, I’d like you to develop some intuitions about the computations…

2. Let’s look at a portion of the computation that you just completed, and really understand it…

Page 23: Chi-Square Analyses

Chi-Square: Independence

Under what circumstances would the expression that’s circled produce a zero?

Page 24: Chi-Square Analyses

Chi-Square: Independence

In general, when the observed and expected values are very similar to each other, the chi-square statistic will be small (and we’ll likely retain the null hypothesis).

Page 25: Chi-Square Analyses

Chi-Square: Independence

By contrast, when the observed and expected values are very different from each other, the chi-square statistic will be large (and we’ll likely reject the null hypothesis).
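To make that intuition concrete, here is a small Python sketch of a single cell's contribution, (O - E)^2 / E, as the observed count drifts away from the expected count. The expected count of 20 and the observed counts are made up for illustration, and `cell_term` is a hypothetical helper name.

```python
def cell_term(observed, expected):
    """Contribution of one cell to chi-square: (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected


expected = 20  # hypothetical expected count for one cell
for observed in (20, 21, 25, 35):
    # The term is zero when O equals E and grows with the squared gap.
    print(observed, cell_term(observed, expected))
```

Because the gap is squared, a difference three times as large contributes nine times as much to the statistic.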

Page 26: Chi-Square Analyses

Chi-Square: Independence

1. The decision to reject or retain the null hypothesis depends, of course, not only on the chi-square value that we obtain, but also on the critical value.

2. Look at the critical values on the Chi-square table that was handed out. What patterns do you see, and why do those patterns occur?

3. Questions or comments?

Page 27: Chi-Square Analyses

Part 2

Chi-Square Test of Independence

In Excel

Page 28: Chi-Square Analyses

Part 3

Chi-Square Test of Independence

In SPSS

Page 29: Chi-Square Analyses

Chi-Square in SPSS

1. Here’s the sequence of steps for Chi-Square in SPSS.

2. Analyze --> Descriptive Statistics --> CrossTabs (yeah, it’s weird).

3. Select the two variables of interest by moving one into the ROWS box, and the other into COLUMNS box.

4. Statistics --> check off the chi-square

5. Cell display --> check off observed, expected, row & column

6. In the output, look for a large value of Pearson Chi-Square. We need “asymp sig (2 sided)” to be < 0.05, our alpha level.

Page 30: Chi-Square Analyses

Chi-Square in SPSS

When the “asymp sig (2 sided)” value is < 0.05, reject the null hypothesis.

In practice, there are 2 alpha levels: there’s the criterion alpha level (usually 0.05), and the observed alpha level (the p value shown in the SPSS output).

Page 31: Chi-Square Analyses

Chi-Square in SPSS

1. For a given degree-of-freedom level, there is an inverse relationship between the observed chi-square statistic and observed alpha level.

2. The higher the observed chi-square value, the smaller the observed alpha level, i.e., “sig” value.

[Figure: y-axis labeled “Probability by Chance”]

Page 32: Chi-Square Analyses

Chi-Square in SPSS

[Figure: y-axis labeled “Probability by Chance”]

Large chi-square values are unlikely to occur just by chance. So… large chi-square values correspond to low alpha levels.

(Note: Alpha levels are called “sig” values in SPSS.)

There is a low probability of large χ² values.
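The inverse relationship can be checked numerically. The Python sketch below covers only the df = 1 case, where the upper-tail probability of a chi-square value x equals erfc(sqrt(x / 2)); the function name `p_value_df1` and the list of test values are choices made for this illustration.

```python
import math


def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square value when df = 1.

    With df = 1, chi-square is a squared standard normal score, so the
    tail probability is erfc(sqrt(chi_square / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2))


# Larger chi-square values give smaller "sig" values.
for value in (0.5, 2.0, 3.841, 6.43, 10.0):
    print(value, round(p_value_df1(value), 4))
```

The printed probabilities shrink as the chi-square value grows, and the value 3.841 lands very close to 0.05, which is why it serves as the critical value for df = 1.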

Page 33: Chi-Square Analyses

Part 4

Chi-Square Test: Goodness-of-Fit

Page 34: Chi-Square Analyses

Chi-Square: Goodness-of-Fit

1. Good news! The test for the goodness-of-fit is much simpler than that for independence! :-)

2. In the test for goodness-of-fit, each participant is categorized on ONLY ONE VARIABLE.

3. In the test for independence, participants were categorized on two different variables (i.e., city and political party).

Page 35: Chi-Square Analyses

Chi-Square: Goodness-of-Fit

1. The null hypothesis states that the expected frequencies will provide a “good fit” to the observed frequencies.

2. The expected frequencies depend on what the null hypothesis specifies about the population…

3. For example, the null hypothesis might state that all levels of the variable under investigation are equally likely in the population.

4. Example: Let’s consider the factors that go into choosing a course for next semester…

Page 36: Chi-Square Analyses

Chi-Square: Goodness-of-Fit

1. Perhaps we’ve identified 4 factors affecting course selection: Time, Instructor, Interest, Ease.

2. The null hypothesis might indicate that, in the population of Denison students, these four factors are equally likely to affect course selection.

3. If we sample 80 Denison students, the expected value for each category would be (80 / 4 categories = 20).

4. Questions so far?

Page 37: Chi-Square Analyses

Chi-Square: Goodness-of-Fit

1. Let’s further assume that, after asking students to decide which of the 4 factors most affects their course selection, we obtain the following observed frequencies.

2. Time = 30; Instructor=10; Interest=22; Ease=18.

3. We now have the observed and expected values for all levels being examined…

Page 38: Chi-Square Analyses

Chi-Square: Goodness-of-Fit

The chi-square computation is simpler than before, since we only have one variable (i.e., only one row).

Page 39: Chi-Square Analyses

Chi-Square: Goodness-of-Fit

Calculating the degrees of freedom is also simpler than before.

(C = # of columns = one for each level of the variable)

df = C - 1

Page 40: Chi-Square Analyses

Chi-Square: Goodness-of-Fit

1. Let’s now evaluate the null hypothesis using the chi-square test for goodness-of-fit. Again the observed frequencies are…

2. Time = 30; Instructor=10; Interest=22; Ease=18.

3. The expected frequencies are 20 for each category (because the null hypothesis specifies that the four factors are equally likely to affect course selection in the population).
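Here is one way the goodness-of-fit computation for the course-selection example might be sketched in Python. The observed counts and the expected value of 20 come from the slides; the critical value 7.815 is the standard table entry for df = 3 at an alpha level of 0.05, and the variable names are chosen only for this illustration.

```python
# Observed frequencies from the course-selection example.
observed = {"Time": 30, "Instructor": 10, "Interest": 22, "Ease": 18}

n = sum(observed.values())      # 80 students sampled
expected = n / len(observed)    # 20 per category if all are equally likely

# One (O - E)^2 / E term per category, then sum.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1          # df = C - 1

critical = 7.815                # table value for df = 3, alpha = 0.05
print(chi_square, df, chi_square > critical)
```

The statistic works out to about 10.4, which exceeds the critical value, so under these assumptions the null hypothesis of equally likely factors would be rejected.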

Page 41: Chi-Square Analyses

Chi-Square: Goodness-of-Fit

1. Note: There are two assumptions that underlie both chi-square tests.

2. First, each participant can contribute ONLY ONE response to the observed frequencies.

3. Second, each expected frequency must be at least 10 in the 2x2 case, or in the single variable case; each expected frequency must be at least 5 for designs that are 3x2 or higher.
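The second assumption lends itself to a quick automated check. The Python sketch below encodes the rule exactly as stated on the slide; both function names are hypothetical, and treating every table other than 2x2 or single-row as "3x2 or higher" is an assumption of this sketch.

```python
def minimum_expected(rows, cols):
    """Smallest acceptable expected frequency under the slide's rule.

    2 x 2 tables and single-variable (one-row) designs need at least 10;
    larger tables are assumed here to need at least 5.
    """
    if rows == 1 or (rows == 2 and cols == 2):
        return 10
    return 5


def expected_counts_ok(expected, rows, cols):
    """True when every expected frequency meets the minimum."""
    return min(expected) >= minimum_expected(rows, cols)


# Course-selection example: one row, four categories, E = 20 in each.
print(expected_counts_ok([20, 20, 20, 20], rows=1, cols=4))
```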

Page 42: Chi-Square Analyses

Chi-Square: Goodness-of-Fit

1. Lastly, there are standards by which the chi-square statistics are to be reported, formally.

2. Statistics like this are to be reported in the Results section of an APA-style report.

3. APA = American Psychological Association

p = 0.033

Page 43: Chi-Square Analyses

Chi-Square: Goodness-of-Fit

Memorize This!

All APA-style manuscripts consist of the following sections, in this order:

• Abstract
• Introduction
• Method (singular, not Methods)
• Results
• Discussion
• References

Page 44: Chi-Square Analyses

Part 5

Chi-Square Test for Goodness-of-Fit

In Excel

Page 45: Chi-Square Analyses

Goodness of Fit in Excel

1. We’ve already seen how we can use a Chi-square table to find the critical value (“the number to beat”).

2. We can also use the following Excel command: =chiinv(probability, degrees of freedom), where probability = criterion alpha level, i.e., 0.05 in most cases.

3. The output of “=Chiinv()” is the critical value (“the number to beat”).

Page 46: Chi-Square Analyses

Goodness of Fit in Excel

1. We can also use Excel to find the observed alpha level, given an observed χ² value.

2. Here’s the Excel command: =chidist(χ², degrees of freedom)

3. The output of “=Chidist()” is the observed alpha level (the “sig” value in SPSS).

4. The observed alpha level must be less than 0.05 (or the criterion alpha level) to reject the null hypothesis.
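For anyone working outside Excel, the two commands can be approximated with Python's standard library. This is only a sketch: the Python functions `chidist` and `chiinv` below are hand-written stand-ins named after the Excel commands (newer Excel versions call them CHISQ.DIST.RT and CHISQ.INV.RT), they assume a whole-number df, and a statistics package would normally be used instead.

```python
import math


def chidist(x, df):
    """Upper-tail probability of chi-square (the role of Excel's CHIDIST)."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-half) * sum of half^k / k! for k = 0 .. df/2 - 1
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # Odd df: start from the df = 1 result, erfc(sqrt(half)), and add one
    # correction term for each additional pair of degrees of freedom.
    total = math.erfc(math.sqrt(half))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (j - 0.5) / math.gamma(j + 0.5)
    return total


def chiinv(probability, df):
    """Critical value with the given upper-tail probability (the role of
    Excel's CHIINV), found by bisection on chidist."""
    low, high = 0.0, 1.0
    while chidist(high, df) > probability:  # widen until bracketed
        high *= 2.0
    for _ in range(100):
        mid = (low + high) / 2.0
        if chidist(mid, df) > probability:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Course-selection example: chi-square = 10.4 with df = 3.
print(round(chiinv(0.05, 3), 3))   # the number to beat
print(chidist(10.4, 3))            # the observed alpha level
```

Bisection works here because the tail probability only decreases as the chi-square value increases, so there is exactly one value that matches a given probability.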
