余红梅
Department of Health Statistics
School of Public Health, Shanxi Medical University
Health Statistics
Chapter 9: Chi-square test (II)
Vocabulary of Chapter 9
Chi-square test: 卡方检验
Statistics: 统计学
Statistic: 统计量
Statistician: 统计学家
Population rate: 总体率
Sample rate: 样本率
Pooled rate: 合并率
Sample size: 样本含量
Sampling error: 抽样误差
Hypothesis testing: 假设检验
Null hypothesis: 无效假设 (零假设)
Alternative hypothesis: 备择假设
Significance level: 检验水准
Table of critical values: 界值表
P value: P 值
Fourfold table: 四格表
Actual (observed) frequency: 实际 (观察) 频数
Theoretical (expected) frequency: 理论 (期望) 频数
Row total: 行合计
Column total: 列合计
Association: 关联
Independence: 独立
Categorical variable: 分类变量
Distribution: 分布
Goodness of fit test: 拟合优度检验
General formula: 基本公式
Specific formula: 专用公式
Continuity correction: 连续性校正
Completely randomized design: 完全随机设计
Paired design: 配对设计
Positive: 阳性
Negative: 阴性
Concordant (agreement): 一致
Discordant (disagreement): 不一致
Equivalent to: 等价于
McNemar’s test: McNemar 检验
Proportion: 构成比
chi-square test for 2×2 table
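Incidence of adverse reactions after vaccination by two injection routes

Group           Adverse reaction   No adverse reaction   Total   Reaction rate (%)
Intramuscular   35                 74                    109     32.11
Subcutaneous    22                 71                    93      23.66
Total           57                 145                   202     28.22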
Question: Are the 2 response rates equal?
1. The type of the experimental design
2. Collection of data
3. Sorting data: 2×2 table
Success Failure Total
Sample 1 a b a+b
Sample 2 c d c+d
Total a+c b+d n
4. Analysis of data
$H_0: \pi_1 = \pi_2$
$H_1: \pi_1 \neq \pi_2$
$\alpha = 0.05$
To apply the chi-square test, the sample size should be large enough.
Rule of thumb: n ≥ 40 and every theoretical frequency T ≥ 5.
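General formula (A is the actual frequency, T the theoretical frequency):
\[ \chi^2 = \sum \frac{(A - T)^2}{T}, \qquad \nu = (R - 1)(C - 1) \]
Specific formula for the 2×2 table:
\[ \chi^2 = \frac{(ad - bc)^2 \, n}{(a+b)(c+d)(a+c)(b+d)} \]
Theoretical frequency:
\[ T_{RC} = \frac{n_R \, n_C}{n} \]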
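As a rough check of the formulas above, the following sketch applies both the general and the specific formula to the vaccine table (a = 35, b = 74, c = 22, d = 71). The function names are illustrative and not part of the lecture; 3.84 is the tabulated critical value of chi-square for ν = 1 at α = 0.05.

```python
def expected_2x2(a, b, c, d):
    """Theoretical frequencies T_RC = n_R * n_C / n for a fourfold table."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[r * col / n for col in cols] for r in rows]


def chi2_general(a, b, c, d):
    """General formula: sum of (A - T)^2 / T over the four cells."""
    observed = [[a, b], [c, d]]
    expected = expected_2x2(a, b, c, d)
    return sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(2) for j in range(2))


def chi2_specific(a, b, c, d):
    """Specific formula for the 2x2 table."""
    n = a + b + c + d
    return (a * d - b * c) ** 2 * n / ((a + b) * (c + d) * (a + c) * (b + d))


# Vaccine data from the table above
chi2_g = chi2_general(35, 74, 22, 71)
chi2_s = chi2_specific(35, 74, 22, 71)
min_t = min(t for row in expected_2x2(35, 74, 22, 71) for t in row)

print(round(chi2_g, 2), round(chi2_s, 2))  # the two formulas should agree
print("reject H0" if chi2_s >= 3.84 else "no reason to reject H0")
```

With these data the statistic is about 1.77, which is below 3.84, so there is no reason to reject H0 at α = 0.05.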
If H0 is true, A and T should be close to each other, and the chi-square statistic will tend to be small.
If H0 is not true, chi-square statistic will tend to be large.
If $\chi^2 \geq \chi^2_{0.05,\nu}$, then reject H0. There is a statistically significant difference between the 2 sample rates; equivalently, the difference between the 2 sample rates is statistically significant.
Otherwise, there is no reason to reject H0. There is no statistically significant difference between the 2 sample rates.
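For ν = 1 the decision can also be made from the P value without a table of critical values, because a chi-square variable with 1 degree of freedom is the square of a standard normal variable. The sketch below relies on that identity; the helper names `p_value_df1` and `decide` are assumptions for illustration, and the two statistics passed in are arbitrary example values.

```python
import math


def p_value_df1(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom.

    Uses the identity P(Z**2 > x) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


def decide(chi2, alpha=0.05):
    """Reject H0 when P <= alpha, which matches chi2 >= critical value."""
    if p_value_df1(chi2) <= alpha:
        return "reject H0"
    return "no reason to reject H0"


print(decide(12.90))  # a large statistic
print(decide(1.0))    # a small statistic
```

This only covers ν = 1; for larger tables a critical value table or a statistics library would be needed.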
n≥40 but any 1≤T< 5
Yates correction (continuity correction)
\[ \chi^2 = \sum \frac{(|A - T| - 0.5)^2}{T} \]
\[ \chi^2 = \frac{\left(|ad - bc| - \frac{n}{2}\right)^2 n}{(a+b)(c+d)(a+c)(b+d)} \]
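The choice between the uncorrected and the corrected statistic can be sketched as a small function. This is only an illustration of the two rules quoted in these slides; the function name is made up, the second table (10, 2, 20, 13) is hypothetical data chosen so that one theoretical frequency falls below 5, and cases outside the two rules are deliberately left unhandled.

```python
def chi2_fourfold(a, b, c, d):
    """Return (method, chi-square) for a 2x2 table of independent samples."""
    n = a + b + c + d
    # Theoretical frequencies T = row total * column total / n
    expected = [r * col / n for r in (a + b, c + d) for col in (a + c, b + d)]
    t_min = min(expected)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if n >= 40 and t_min >= 5:
        return "uncorrected", (a * d - b * c) ** 2 * n / denom
    if n >= 40 and 1 <= t_min < 5:
        return "Yates", (abs(a * d - b * c) - n / 2) ** 2 * n / denom
    raise ValueError("n < 40 or T < 1: not covered by the rules quoted here")


# Vaccine data from the slides: all T >= 5, so no correction
method_1, chi2_1 = chi2_fourfold(35, 74, 22, 71)
# Hypothetical data: smallest T is 12 * 15 / 45 = 4.0, so Yates applies
method_2, chi2_2 = chi2_fourfold(10, 2, 20, 13)

print(method_1, round(chi2_1, 2))
print(method_2, round(chi2_2, 2))
```

The correction always shrinks the statistic, which makes the test more conservative when theoretical frequencies are small.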
chi-square test for paired 2×2 table
1. Design
Paired design.
Each food sample is tested by both method A and method B.
2. Collection of data
3. Sorting data: paired 2×2 table
Outcomes of method A and method B
                     Method B
Method A      +            -            Total
+             160 (a)      32 (b)       192 (a+b)
-             9 (c)        48 (d)       57 (c+d)
Total         169 (a+c)    80 (b+d)     249 (n)
4. Analysis of data
Purpose 1: testing for the difference between the 2 methods. Which method has the higher positive rate?
Note: The 2 samples are not independent, so the chi-square test for two independent samples described above does not apply.
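$H_0: B = C$
$H_1: B \neq C$
$\alpha = 0.05$
A1 = b, A2 = c
If H0 is true, T1 = T2 = (b+c)/2
For a large sample (b+c ≥ 40):
\[ \chi^2 = \frac{\left(b - \frac{b+c}{2}\right)^2}{\frac{b+c}{2}} + \frac{\left(c - \frac{b+c}{2}\right)^2}{\frac{b+c}{2}} = \frac{(b - c)^2}{b + c} \]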
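If b+c < 40, the chi-square statistic needs a continuity correction:
\[ \chi^2 = \frac{\left(\left|b - \frac{b+c}{2}\right| - 0.5\right)^2}{\frac{b+c}{2}} + \frac{\left(\left|c - \frac{b+c}{2}\right| - 0.5\right)^2}{\frac{b+c}{2}} = \frac{(|b - c| - 1)^2}{b + c}, \qquad \nu = 1 \]
For example 9-3, b+c = 32+9 = 41 > 40, so no correction is needed:
\[ \chi^2 = \frac{(b - c)^2}{b + c} = \frac{(32 - 9)^2}{32 + 9} = 12.90, \qquad \nu = 1, \quad P < 0.005 \]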
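The paired calculation depends only on the two discordant counts b and c, which a short sketch makes explicit. The function name is illustrative; the threshold b+c ≥ 40 is the one given in these slides, and the second call uses hypothetical counts to show the corrected branch.

```python
def mcnemar_chi2(b, c):
    """Chi-square statistic (nu = 1) from the discordant counts b and c.

    Returns (statistic, corrected), where corrected says whether the
    continuity correction was applied because b + c < 40.
    """
    if b + c >= 40:
        return (b - c) ** 2 / (b + c), False
    return (abs(b - c) - 1) ** 2 / (b + c), True


# Example 9-3: b = 32, c = 9
chi2, corrected = mcnemar_chi2(32, 9)
print(round(chi2, 2), corrected)

# Hypothetical small sample: b + c = 13 < 40, so the correction is used
chi2_small, corrected_small = mcnemar_chi2(10, 3)
print(round(chi2_small, 2), corrected_small)
```

The concordant counts a and d do not enter the statistic at all, which is why the sample size condition is stated in terms of b+c rather than n.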
Conclusion: reject H0. There is a statistically significant difference between the positive rates of the 2 methods.
Since pA (77.11%) > pB (67.87%), method A is better.
This test is called McNemar’s test.
Purpose 2: testing for the association between 2 methods.
H0: method A and method B are independent
H1: method A and method B are associated
$\alpha = 0.05$
Question: if H0 is true, what is the expected frequency of each cell?
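Multiplication rule of probability: the probability that mutually independent events occur together equals the product of the probabilities of the individual events.
\[ p(A_+ B_+) = p(A_+)\,p(B_+) = \frac{192}{249} \times \frac{169}{249} \]
\[ T(A_+ B_+) = n\,p(A_+ B_+) = 249 \times \frac{192}{249} \times \frac{169}{249} = \frac{192 \times 169}{249} \]
\[ p(A_+ B_-) = p(A_+)\,p(B_-) = \frac{192}{249} \times \frac{80}{249} \]
\[ T(A_+ B_-) = n\,p(A_+ B_-) = 249 \times \frac{192}{249} \times \frac{80}{249} = \frac{192 \times 80}{249} \]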
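The same multiplication rule gives the expected frequency of every cell as row total × column total / n. A minimal sketch for the paired table of example 9-3 (variable names are illustrative):

```python
# Observed paired table: rows are method A (+, -), columns are method B (+, -)
observed = [[160, 32],
            [9, 48]]

n = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]        # 192, 57
col_totals = [sum(col) for col in zip(*observed)]  # 169, 80

# Under independence, T = n * p(row) * p(column) = row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(t, 2) for t in row])
```

The expected frequencies in each row and each column add up to the observed marginal totals, which is a quick way to check the arithmetic.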
The chi-square statistic and the degrees of freedom are both the same as those of section 1 (the chi-square test for the 2×2 table). However, the study design, the purpose, and the interpretation of the results are different.
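For example 9-3,
\[ \chi^2 = \frac{(160 \times 48 - 32 \times 9)^2 \times 249}{192 \times 57 \times 169 \times 80} = 91.95, \qquad \nu = 1, \quad P < 0.005 \]
The outcomes of the 2 methods are not independent of one another; that is, there is an association between the outcomes of the 2 methods.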
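Direction of association: ad - bc > 0: positive association; ad - bc < 0: negative association.
Strength of association:
Pearson contingency coefficient:
\[ C_p = \sqrt{\frac{\chi^2}{n + \chi^2}} = \sqrt{\frac{91.95}{249 + 91.95}} = 0.5193 \]
Cramér contingency coefficient (corrected):
\[ C_c = \sqrt{\frac{\chi^2}{n}} = \sqrt{\frac{91.95}{249}} = 0.6077 \]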
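The direction and strength of association for example 9-3 can be recomputed in a few lines. This is a sketch with illustrative variable names; the Cramér coefficient is written in the form used on this slide for the 2×2 table.

```python
import math

a, b, c, d = 160, 32, 9, 48  # paired table of example 9-3
n = a + b + c + d

# Association chi-square, same formula as for the independent 2x2 table
chi2 = (a * d - b * c) ** 2 * n / ((a + b) * (c + d) * (a + c) * (b + d))

# Direction of association from the sign of ad - bc
cross = a * d - b * c
if cross > 0:
    direction = "positive"
elif cross < 0:
    direction = "negative"
else:
    direction = "none"

c_p = math.sqrt(chi2 / (n + chi2))  # Pearson contingency coefficient
c_c = math.sqrt(chi2 / n)           # Cramér coefficient for the 2x2 table

print(direction, round(chi2, 2), round(c_p, 4), round(c_c, 4))
```

Run as written, this should reproduce the values 91.95, 0.5193 and 0.6077 quoted for example 9-3.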
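Exercise: Two methods were used to examine 120 patients with confirmed breast cancer. The detection rate of method A was 60%, the detection rate of method B was 50%, and the rate of concordant detection by both methods was 35%.
Is there a difference between the detection rates of the two methods? Is there an association between the two methods?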
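One possible way to set up the exercise is sketched below. It assumes that the 35% concordant detection rate means the proportion of patients detected by both methods (cell a); under a different reading the table would change. Variable names are illustrative, and 3.84 is the tabulated critical value for ν = 1 at α = 0.05.

```python
n = 120
pos_a = n * 60 // 100  # detected by method A: 72
pos_b = n * 50 // 100  # detected by method B: 60
a = n * 35 // 100      # detected by both methods (assumed reading): 42
b = pos_a - a          # A positive, B negative
c = pos_b - a          # A negative, B positive
d = n - a - b - c      # missed by both methods

# Purpose 1: difference between the two detection rates (McNemar's test).
# b + c = 48 >= 40, so no continuity correction is applied.
chi2_diff = (b - c) ** 2 / (b + c)

# Purpose 2: association between the two methods
chi2_assoc = (a * d - b * c) ** 2 * n / ((a + b) * (c + d) * (a + c) * (b + d))

print(a, b, c, d)
print(round(chi2_diff, 2), "difference" if chi2_diff >= 3.84 else "no difference shown")
print(round(chi2_assoc, 2), "associated" if chi2_assoc >= 3.84 else "no association shown")
```

Under this reading the two purposes lead to different conclusions at α = 0.05, which illustrates that the two tests answer different questions about the same table.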
chi-square test for R×C table
R×C table: a table with R rows and C columns.
\[ \chi^2 = \sum \frac{(A - T)^2}{T} \]
\[ \chi^2 = n \left( \sum \frac{A^2}{n_R \, n_C} - 1 \right) \]
\[ \nu = (R - 1)(C - 1) \]
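The two forms of the R×C statistic are algebraically equivalent, which the sketch below checks numerically. The function names are illustrative and the 3×2 table is hypothetical data, not taken from the lecture.

```python
def chi2_rxc(table):
    """Chi-square via n * (sum(A^2 / (n_R * n_C)) - 1); returns (chi2, df)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(table[i][j] ** 2 / (row_totals[i] * col_totals[j])
                for i in range(len(table)) for j in range(len(table[0])))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return n * (total - 1), df


def chi2_rxc_general(table):
    """Chi-square via the general formula sum((A - T)^2 / T)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return sum((table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
               / (row_totals[i] * col_totals[j] / n)
               for i in range(len(table)) for j in range(len(table[0])))


# Hypothetical 3x2 table: three groups, outcome positive / negative
table = [[30, 20],
         [25, 25],
         [15, 35]]

chi2, df = chi2_rxc(table)
chi2_check = chi2_rxc_general(table)
print(round(chi2, 3), round(chi2_check, 3), df)
```

The first form needs only the observed counts and the marginal totals, so the theoretical frequencies never have to be written out.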
Comparison of more than 2 sample rates
Comparison of 2 or more sample proportions
Association analysis of 2 categorical variables
Note: There is no order among the categories of each variable.
Cautions:
When more than 2 groups are compared, rejecting H0 only means that there is a difference among some of the groups. It does not necessarily mean that all the groups differ from one another.
The chi-square test requires a large sample. By experience, T should be at least 5 in at least 4/5 of the cells, and no cell should have T below 1. Otherwise, we cannot use the chi-square test directly.
For ordinal data comparison, see chapter 10.
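The sample size caution can be turned into a simple check on the table of theoretical frequencies. This is a sketch with an illustrative function name; treating the boundary cases as acceptable (exactly 1/5 of the cells below 5, or T exactly equal to 1) is an assumption, and the example tables are hypothetical.

```python
def chi2_directly_applicable(expected):
    """Check the rule of thumb on a table of theoretical frequencies T.

    Passes when at most 1/5 of the cells have T < 5 and no cell has T < 1.
    """
    cells = [t for row in expected for t in row]
    small = sum(1 for t in cells if t < 5)
    return small <= len(cells) / 5 and min(cells) >= 1


# Hypothetical tables of theoretical frequencies
ok_table = [[6.0, 4.0, 10.0],
            [12.0, 8.0, 20.0]]   # 1 of 6 cells below 5
tiny_cell = [[0.8, 9.2],
             [7.2, 82.8]]        # one cell below 1
many_small = [[3.0, 7.0],
              [4.0, 9.0]]        # 2 of 4 cells below 5

print(chi2_directly_applicable(ok_table))
print(chi2_directly_applicable(tiny_cell))
print(chi2_directly_applicable(many_small))
```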
Summary
Chi-square test for 2 independent sample rates
  n≥40 and all T≥5: no correction is needed.
  n≥40 but any 1≤T<5: correction is needed.
Chi-square test for paired 2×2 table
  Testing for the difference between 2 methods
  Testing for association between 2 methods
Chi-square test for R×C table
  Testing for the difference among several rates or proportions, or testing for the association between 2 categorical variables (no order).