Chapter 11. The Chi-Square ($\chi^2$) Distribution and the Analysis of Frequencies
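11.1 Mathematical properties of the $\chi^2$ distribution
• Uses of $\chi^2$: tests of goodness of fit, tests of independence, tests of homogeneity
• Definition of $\chi^2$: if $Z_1, \ldots, Z_n$ are independent $N(0, 1)$, then
$$Z_i^2 \sim \chi^2_1, \qquad \sum_{i=1}^{n} Z_i^2 \sim \chi^2_n$$
e.g. $Z_i = (Y_i - \mu)/\sigma$ with $Y_i \sim N(\mu, \sigma^2)$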
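The definition can be checked numerically. The sketch below simulates sums of squared standard normals and compares them with the theoretical $\chi^2_n$ distribution; the seed, the degrees of freedom (5) and the number of replications are arbitrary illustrative choices, not values from the slides.

```r
# Simulate sums of n squared N(0,1) variables and compare with chi-square(n).
set.seed(1)                       # arbitrary seed, for reproducibility only
n_df   <- 5                       # illustrative degrees of freedom (assumption)
n_sims <- 10000                   # illustrative number of replications (assumption)

z      <- matrix(rnorm(n_sims * n_df), nrow = n_sims, ncol = n_df)
sum_sq <- rowSums(z^2)            # each row gives one draw of Z_1^2 + ... + Z_n^2

sim_mean <- mean(sum_sq)                             # theory: n_df
sim_q95  <- quantile(sum_sq, 0.95, names = FALSE)    # simulated 95th percentile
theo_q95 <- qchisq(0.95, df = n_df)                  # theoretical 95th percentile

print(c(sim_mean = sim_mean, sim_q95 = sim_q95, theo_q95 = theo_q95))
```

The simulated mean should be close to n and the simulated percentile close to the `qchisq` value, up to Monte Carlo error.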
11.2 Tests of goodness of fit
• Do our data agree with a hypothesized theoretical distribution (normal, binomial, Poisson, etc.)?
• Test statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2_{k-r}$$
  $O_i$: observed frequency, $E_i$: expected frequency
  $r$ = number of constraints ($\sum O_i = \sum E_i$) + number of parameters estimated

• Example 11.2.1 (normal distribution)
H0: the data follow a normal distribution vs H1: not H0

Cholesterol level (mg/dl)   Number of subjects
1-5.9                       2
6-10.9                      2
11-15.9                     7
16-20.9                     19
21-25.9                     4
26-30.9                     6
31-35.9                     3
36-40.9                     4
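• $\bar{x} = \dfrac{3.5\cdot 2 + 8.5\cdot 2 + \cdots + 38.5\cdot 4}{2 + 2 + \cdots + 4} = 21.05319$
• $s^2 = \dfrac{(3.5-21.05319)^2\cdot 2 + (8.5-21.05319)^2\cdot 2 + \cdots + (38.5-21.05319)^2\cdot 4}{2 + 2 + \cdots + 4 - 1} = 75.9482$
  $s = \sqrt{75.9482} = 8.71483$
  (class midpoints 3.5, 8.5, ..., 38.5 weighted by the class counts)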
Interval   Standardized interval (lower limit)   Expected relative frequency   Expected frequency
< 1        -                                     0.01069                       0.50265
1-5.9      -2.30104                              0.03136                       1.4740
6-10.9     -1.72731                              0.08228                       3.8672
11-15.9    -1.15357                              0.15667                       7.3637
16-20.9    -0.57984                              0.21655                       10.178
21-25.9    -0.00610                              0.21729                       10.213
26-30.9    0.56763                               0.15828                       7.4393
31-35.9    1.14137                               0.08370                       3.9337
36-40.9    1.71510                               0.03212                       1.5096
>= 41      2.28884                               0.01104                       0.51909

Expected relative frequencies are standard normal probabilities: P(Z < -2.30104) for "< 1", P(-2.30 < Z < -1.73) for "1-5.9", ..., P(1.72 < Z < 2.29) for "36-40.9", P(Z > 2.29) for ">= 41".
Pooled tail classes: 0.50265 + 1.4740 = 1.9766 and 1.5096 + 0.51909 = 2.0287.
$\sum \dfrac{(O_i - E_i)^2}{E_i} = 14.762 >$ qchisq(0.05, 5, lower.tail=F) $= 11.071$
-> Reject H0: data ~ normal
Constraints ($\sum E_i = \sum O_i$, $\mu = \bar{X}$, $\sigma = s$) = 3 -> df = 8 - 3 = 5

Interval   Observed ($O_i$)   Expected ($E_i$)   $(O_i - E_i)^2 / E_i$
< 1        0                  0.50265
1-5.9      2                  1.4740             2.7608e-4   (pooled: O = 2, E = 1.9766)
6-10.9     2                  3.8672             0.90156
11-15.9    7                  7.3637             0.01796
16-20.9    19                 10.178             7.6466
21-25.9    4                  10.213             3.7794
26-30.9    6                  7.4393             0.27848
31-35.9    3                  3.9337             0.22162
36-40.9    4                  1.5096
>= 41      0                  0.51909            1.9156      (pooled: O = 4, E = 2.0287)
Total      47                 47                 14.762
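The normal goodness-of-fit calculation of Example 11.2.1 can be retraced in R. This is a sketch under the assumptions visible in the slides: class limits 1, 6, ..., 41, class midpoints 3.5, ..., 38.5, and pooling of the two tail classes into their neighbours. Variable names are illustrative.

```r
# Grouped cholesterol data: observed counts for classes 1-5.9, ..., 36-40.9.
obs  <- c(2, 2, 7, 19, 4, 6, 3, 4)
mids <- seq(3.5, 38.5, by = 5)                 # class midpoints
n    <- sum(obs)

xbar <- sum(mids * obs) / n                    # grouped mean
s    <- sqrt(sum((mids - xbar)^2 * obs) / (n - 1))   # grouped standard deviation

# Normal probabilities for (-Inf,1), [1,6), ..., [36,41), [41,Inf): 10 classes.
cuts  <- c(-Inf, seq(1, 41, by = 5), Inf)
e_all <- n * diff(pnorm(cuts, mean = xbar, sd = s))

# Pool "< 1" into the first class and ">= 41" into the last class (8 classes).
e_pool <- c(e_all[1] + e_all[2], e_all[3:8], e_all[9] + e_all[10])

stat <- sum((obs - e_pool)^2 / e_pool)         # chi-square statistic
df   <- length(obs) - 3                        # constraints: total, mean, sd
crit <- qchisq(0.05, df, lower.tail = FALSE)   # 5% critical value

print(c(xbar = xbar, s = s, stat = stat, df = df, crit = crit))
```

The statistic exceeds the critical value, which matches the rejection of normality reported in the slides.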
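Example 11.2.2 Binomial distribution
H0: the data follow a binomial distribution (goodness-of-fit test)

Patients preferring the new drug (per doctor)   Number of doctors   Number of patients
0                                               5                   0
1                                               6                   6
2                                               8                   16
3                                               10                  30
4                                               10                  40
5                                               15                  75
6                                               17                  102
7                                               10                  70
8                                               10                  80
9                                               9                   81
10 or more                                      0                   0
Total                                           100                 500

Expected frequency under the binomial distribution = expected relative frequency × total
$$P(X = x) = \binom{25}{x} p^x (1-p)^{25-x}, \quad x = 0, 1, 2, \ldots, 25, \qquad \hat{p} = \frac{500}{2500} = 0.2$$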
Patients preferring the new drug (per doctor)   Number of doctors ($O_i$)   Expected relative frequency   Expected frequency ($E_i$)
0                                               5                           0.00378                       0.37779
1                                               6                           0.02361                       2.3612
2                                               8                           0.07083                       7.0836
3                                               10                          0.13577                       13.577
4                                               10                          0.18668                       18.668
5                                               15                          0.19602                       19.602
6                                               17                          0.16335                       16.335
7                                               10                          0.11084                       11.084
8                                               10                          0.06235                       6.2349
9                                               9                           0.02944                       2.9442
10 or more                                      0                           0.01733                       1.7332
Total                                           100                         1.0000                        100.00

Classes 0 and 1 are pooled: O = 11, E = 2.7390.
$$\chi^2 = \frac{(11-2.7390)^2}{2.7390} + \frac{(8-7.0836)^2}{7.0836} + \cdots + \frac{(0-1.7332)^2}{1.7332} = 47.678 > \text{qchisq(0.005, 8, lower.tail=F)} = 21.955$$
We reject H0: Data ~ Binomial
df = 10 - 2 = 8; constraints: $\sum E_i = \sum O_i$, $p = \hat{p}$
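The binomial fit above can be retraced in R as follows. This is a sketch assuming 25 patients per doctor (from 2500 = 100 × 25) and the pooling of the classes 0 and 1 shown in the slides; the variable names are illustrative.

```r
# Observed number of doctors for x = 0, 1, ..., 9 and "10 or more".
doctors <- c(5, 6, 8, 10, 10, 15, 17, 10, 10, 9, 0)
n_doc   <- sum(doctors)                        # 100 doctors
n_trial <- 25                                  # patients per doctor (assumed from 2500/100)

# Estimate p from the total number of patients preferring the new drug.
p_hat <- sum(0:10 * doctors) / (n_trial * n_doc)

# Binomial probabilities for 0..9 and the upper tail P(X >= 10).
p_exp <- c(dbinom(0:9, n_trial, p_hat),
           pbinom(9, n_trial, p_hat, lower.tail = FALSE))
e_all <- n_doc * p_exp

# Pool classes 0 and 1, leaving 10 classes.
o_pool <- c(doctors[1] + doctors[2], doctors[3:11])
e_pool <- c(e_all[1] + e_all[2], e_all[3:11])

stat <- sum((o_pool - e_pool)^2 / e_pool)
df   <- length(o_pool) - 2                     # constraints: total and p_hat
crit <- qchisq(0.005, df, lower.tail = FALSE)

print(c(p_hat = p_hat, stat = stat, df = df, crit = crit))
```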
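Example 11.2.3 Poisson distribution
Expected relative frequency under the Poisson distribution:
$$P(X = x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots, \qquad \lambda = 3 \text{ (known)}$$
H0: the daily number of emergency patients at the hospital follows a Poisson distribution

Emergency patients per day   Number of days
0                            5
1                            14
2                            15
3                            23
4                            16
5                            9
6                            3
7                            3
8                            1
9                            1
10 or more                   0
Total                        90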
Emergency patients   Number of days ($O_i$)   Expected relative frequency   Expected frequency ($E_i$)   $(O_i - E_i)^2 / E_i$
0                    5                        0.04979                       4.4808                       0.06015
1                    14                       0.14936                       13.443                       0.02312
2                    15                       0.22404                       20.164                       1.32240
3                    23                       0.22404                       20.164                       0.39895
4                    16                       0.16803                       15.123                       0.05088
5                    9                        0.10082                       9.0737                       0.00060
6                    3                        0.05041                       4.5368                       0.52060
7                    3                        0.02160                       1.9444                       0.57313
8                    1                        0.00810                       0.72914
9                    1                        0.00270                       0.24305                      0.80482   (pooled 8, 9, 10 or more: O = 2, E = 1.0714)
10 or more           0                        0.00110                       0.09922
Total                90                       1.000                         90.00                        3.755
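$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(5-4.4808)^2}{4.4808} + \cdots + \frac{(2-1.0714)^2}{1.0714} = 3.755 < \chi^2(0.95;\ df = 9 - 1 = 8) = 15.507$$
We cannot reject H0: Data ~ Poisson
With the expected frequencies rounded (4.50, ..., 1.08):
$$\chi^2_{9-1} = \frac{(5-4.50)^2}{4.50} + \cdots + \frac{(2-1.08)^2}{1.08} = 3.664 < 15.507 = \chi^2_{8}(0.95)$$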
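The Poisson fit can be retraced the same way. The sketch below assumes λ = 3 is known (so no parameter is estimated) and pools the classes 8, 9 and "10 or more" as in the slides; variable names are illustrative.

```r
# Observed number of days with 0, 1, ..., 9 and "10 or more" emergency patients.
obs    <- c(5, 14, 15, 23, 16, 9, 3, 3, 1, 1, 0)
n_days <- sum(obs)                              # 90 days
lambda <- 3                                     # known, not estimated

# Poisson probabilities for 0..9 and the upper tail P(X >= 10).
p_exp <- c(dpois(0:9, lambda), ppois(9, lambda, lower.tail = FALSE))
e_all <- n_days * p_exp

# Pool the last three classes (8, 9, 10 or more), leaving 9 classes.
o_pool <- c(obs[1:8], sum(obs[9:11]))
e_pool <- c(e_all[1:8], sum(e_all[9:11]))

stat <- sum((o_pool - e_pool)^2 / e_pool)
df   <- length(o_pool) - 1                      # only the total is constrained
crit <- qchisq(0.95, df)

print(c(stat = stat, df = df, crit = crit))
```

Because λ is given rather than estimated, only one degree of freedom is lost, which is why df = 9 − 1 here but 10 − 2 in the binomial example.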
11.3 Tests of independence
• Contingency table
$$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \sim \chi^2, \quad df = (r-1)(c-1), \qquad E_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n}$$

Rows: first categorical variable; columns: second categorical variable

        1          2          3          ...   c          Total
1       $n_{11}$   $n_{12}$   $n_{13}$   ...   $n_{1c}$   $n_{1\cdot}$
2       $n_{21}$   $n_{22}$   $n_{23}$   ...   $n_{2c}$   $n_{2\cdot}$
3       $n_{31}$   $n_{32}$   $n_{33}$   ...   $n_{3c}$   $n_{3\cdot}$
...     ...        ...        ...        ...   ...        ...
r       $n_{r1}$   $n_{r2}$   $n_{r3}$   ...   $n_{rc}$   $n_{r\cdot}$
Total   $n_{\cdot 1}$   $n_{\cdot 2}$   $n_{\cdot 3}$   ...   $n_{\cdot c}$   $n$
• Example 11.3.1

Treatment   Relapse: Yes     Relapse: No        Total
A           294 (77.255)     921 (1137.745)     1215
B           98 (188.210)     2862 (2771.790)    2960
C           50 (198.002)     3064 (2915.998)    3114
D           203 (181.533)    2652 (2673.467)    2855
Total       645              9499               10144
(expected frequencies in parentheses)

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(294 - 77.255)^2}{77.255} + \frac{(921 - 1137.745)^2}{1137.745} + \cdots = 816.41 > \chi^2(0.95;\ df = 3) = 7.815$$
df = (r - 1)(c - 1) = (4 - 1)(2 - 1) = 3
Reject H0 (treatment and relapse are independent) -> they are not independent.
> data <- as.table(cbind(c(294, 98, 50, 203), c(921, 2862, 3064, 2652)))
> dimnames(data) <- list(trt = c("A", "B", "C", "D"), re = c("Y", "N"))
> data
   re
trt   Y    N
  A 294  921
  B  98 2862
  C  50 3064
  D 203 2652
> chisq.test(data)

	Pearson's Chi-squared test

data:  data
X-squared = 816.41, df = 3, p-value < 2.2e-16
data re;
  input trt $ re $ count;
  cards;
A Y 294
A N 921
B Y 98
B N 2862
C Y 50
C N 3064
D Y 203
D N 2652
;
proc freq data=re;
  weight count;
  tables trt*re / measures chisq;
run;
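The expected frequencies and the statistic of Example 11.3.1 can also be computed by hand from the margins, using $E_{ij} = n_{i\cdot} n_{\cdot j} / n$. The sketch below uses the treatment-by-relapse counts from the example; the object names are illustrative.

```r
# Treatment-by-relapse counts (columns: relapse Yes, relapse No).
counts <- matrix(c(294, 98, 50, 203, 921, 2862, 3064, 2652), nrow = 4,
                 dimnames = list(trt = c("A", "B", "C", "D"), re = c("Y", "N")))

n        <- sum(counts)
expected <- outer(rowSums(counts), colSums(counts)) / n   # E_ij = row total * column total / n

stat    <- sum((counts - expected)^2 / expected)          # Pearson chi-square
df      <- (nrow(counts) - 1) * (ncol(counts) - 1)
p_value <- pchisq(stat, df, lower.tail = FALSE)

print(round(expected, 3))
print(c(stat = stat, df = df, p_value = p_value))
```

The hand computation agrees with `chisq.test(counts)`, which performs the same calculation for tables larger than 2×2.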
• Small expected frequencies: not a problem if fewer than 20% of the cells have an expected frequency below 5 and the minimum expected frequency is at least 1.
• 2×2 table: the $\chi^2$ test is not valid if n < 20, or if 20 < n < 49 and the expected frequency of one or more cells is below 5.
• 2×2 table

First criterion \ Second criterion   1       2       Total
1                                    a       b       a + b
2                                    c       d       c + d
Total                                a + c   b + d   n

$$\chi^2 = \frac{n(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$$

Smoking \ Drinking   Yes   No   Total
Yes                  131   52   183
No                   14    36   50
Total                145   88   233

$$\chi^2 = \frac{233\,(131\cdot 36 - 52\cdot 14)^2}{145\cdot 88\cdot 183\cdot 50} = 31.7391 \gg 1.96^2$$
Strong evidence to reject H0 (smoking and drinking are independent).
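② Comparing two probabilities
$H_0: p_1 = p_2$ vs $H_a: p_1 \ne p_2$
$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n_1} + \dfrac{\hat{p}(1-\hat{p})}{n_2}}}$$
e.g. $n_1 = 100$, $\hat{p}_1 = 0.60$, $n_2 = 120$, $\hat{p}_2 = 0.40$:
$$\hat{p} = \frac{0.60\cdot 100 + 0.40\cdot 120}{100 + 120} = 0.4909$$
$$Z = \frac{0.60 - 0.40}{\sqrt{\dfrac{0.4909\cdot 0.5091}{100} + \dfrac{0.4909\cdot 0.5091}{120}}} = 2.95469 > 1.96 \quad \text{(significant)}$$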
Yates adjustment (continuity correction)
• $\chi^2_{corrected} = \dfrac{n\,(|ad - bc| - 0.5n)^2}{(a+c)(b+d)(a+b)(c+d)}$
• $\chi^2_{corrected} = \dfrac{233\,(|131\cdot 36 - 52\cdot 14| - 0.5\cdot 233)^2}{145\cdot 88\cdot 183\cdot 50} = 29.9118$
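Both 2×2 shortcut formulas, with and without the Yates correction, can be checked against `chisq.test`. This sketch uses the smoking and drinking counts above; treating smoking as the row variable is an assumption about the table layout, and it does not affect either statistic.

```r
# 2x2 counts (assumed layout: rows = smoking, columns = drinking).
tab <- matrix(c(131, 14, 52, 36), nrow = 2,
              dimnames = list(smoking = c("Yes", "No"), drinking = c("Yes", "No")))

a  <- tab[1, 1]; b <- tab[1, 2]
cc <- tab[2, 1]; d <- tab[2, 2]       # 'cc' avoids masking the function c()
n  <- sum(tab)

denom      <- (a + cc) * (b + d) * (a + b) * (cc + d)
chi2_plain <- n * (a * d - b * cc)^2 / denom
chi2_yates <- n * (abs(a * d - b * cc) - 0.5 * n)^2 / denom

# Reference values from the built-in test.
ref_plain <- unname(chisq.test(tab, correct = FALSE)$statistic)
ref_yates <- unname(chisq.test(tab, correct = TRUE)$statistic)

print(c(chi2_plain = chi2_plain, chi2_yates = chi2_yates))
```

The corrected statistic is always somewhat smaller than the uncorrected one, so the correction makes the test more conservative.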
11.4 Tests of homogeneity
• Homogeneity test: are the samples, drawn independently from separate populations, homogeneous in distribution (i.e. could they have come from one population)?
• Independence test: a single sample is drawn from one population; the row and column totals are not controlled but arise by chance.
• Independence test vs homogeneity test
• Example 11.4.1
• H0: patient groups with onset age <= 18 and onset age > 18 have the same distribution of family history

Family history   <= 18   > 18   Total
A                28      35     63
B                19      38     57
C                41      44     85
D                53      60     113
Total            141     177    318

> data <- as.table(cbind(c(28, 19, 41, 53), c(35, 38, 44, 60)))
> dimnames(data) <- list(trt = c("A", "B", "C", "D"), re = c("Early", "Later"))
> data
   re
trt Early Later
  A    28    35
  B    19    38
  C    41    44
  D    53    60
> chisq.test(data)

	Pearson's Chi-squared test

data:  data
X-squared = 3.6216, df = 3, p-value = 0.3053
-> Do not reject H0
Homogeneity test and the test of two population proportions
$H_0: p_1 = p_2$ vs $H_A: p_1 \ne p_2$; $n_1 = 100$, $\hat{p}_1 = 0.60$, $n_2 = 120$, $\hat{p}_2 = 0.40$
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n_1} + \dfrac{\hat{p}(1-\hat{p})}{n_2}}}$$
$$\hat{p} = \frac{0.60\cdot 100 + 0.40\cdot 120}{100 + 120} = \frac{108}{220} = 0.49091$$
$$z = \frac{0.60 - 0.40}{\sqrt{\dfrac{0.49091\cdot 0.50909}{100} + \dfrac{0.49091\cdot 0.50909}{120}}} = 2.95468$$
$$\chi^2 = \frac{220\,[60\cdot 72 - 40\cdot 48]^2}{108\cdot 112\cdot 100\cdot 120} = 8.7302 \quad (= z^2 \text{ up to rounding})$$
-> Reject H0

Sample \ Characteristic   1     2     Total
1                         60    40    100
2                         48    72    120
Total                     108   112   220
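The equivalence between the two-proportion z test and the 2×2 chi-square test can be checked numerically. The sketch below uses the counts from the table above (60 of 100 and 48 of 120); variable names are illustrative.

```r
# Successes and sample sizes for the two samples.
x <- c(60, 48)
n <- c(100, 120)

p_hat  <- x / n                                  # 0.60 and 0.40
p_pool <- sum(x) / sum(n)                        # pooled proportion under H0

# Two-sample z statistic with the pooled variance estimate.
z <- (p_hat[1] - p_hat[2]) / sqrt(p_pool * (1 - p_pool) * (1 / n[1] + 1 / n[2]))

# Uncorrected Pearson chi-square on the corresponding 2x2 table.
tab  <- cbind(success = x, failure = n - x)
chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)

print(c(z = z, z_squared = z^2, chi2 = chi2))
```

Since $z^2$ equals the uncorrected chi-square statistic, the two tests give the same two-sided p-value.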
Fisher's Exact Test
data severe;
  input treat $ outcome $ count;
  cards;
Test f 10
Test u 2
Control f 2
Control u 4
;
proc freq order=data;
  tables treat*outcome / chisq nocol;   /* for a 2x2 table, chisq also prints Fisher's exact test */
  weight count;
run;
The SAS System
The FREQ Procedure

Table of treat by outcome

treat      outcome
Frequency |
Percent   |
Row Pct   |f       |u       |  Total
----------+--------+--------+
Test      |     10 |      2 |     12
          |  55.56 |  11.11 |  66.67
          |  83.33 |  16.67 |
----------+--------+--------+
Control   |      2 |      4 |      6
          |  11.11 |  22.22 |  33.33
          |  33.33 |  66.67 |
----------+--------+--------+
Total           12        6       18
             66.67    33.33   100.00

Statistics for Table of treat by outcome

Statistic                      DF     Value     Prob
----------------------------------------------------
Chi-Square                      1    4.5000   0.0339
Likelihood Ratio Chi-Square     1    4.4629   0.0346
Continuity Adj. Chi-Square      1    2.5313   0.1116
Mantel-Haenszel Chi-Square      1    4.2500   0.0393
Phi Coefficient                      0.5000
Contingency Coefficient              0.4472
Cramer's V                           0.5000

WARNING: 75% of the cells have expected counts less than 5.
Chi-Square may not be a valid test.

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)        10
Left-sided Pr <= F          0.9961
Right-sided Pr >= F         0.0573

Table Probability (P)       0.0533
Two-sided Pr <= P           0.1070

Sample Size = 18
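Exact Test
Table probabilities (all tables with the observed margins):

Cell (1,1)   (1,2)   (2,1)   (2,2)   Prob
12           0       0       6       .0001
11           1       1       5       .0039
10           2       2       4       .0533
9            3       3       3       .2370
8            4       4       2       .4000
7            5       5       1       .2560
6            6       6       0       .0498

Probability of the observed table:
$$P = \frac{12!\,12!\,6!\,6!}{10!\,2!\,2!\,4!\,18!} = 0.0533$$
• One-tailed p-value: $p = 0.0533 + 0.0039 + 0.0001 = 0.0573$
• Two-tailed p-value: $p = 0.0533 + 0.0039 + 0.0001 + 0.0498 = 0.1071$
H0: the two variables are independent (homogeneous) vs H1: not H0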
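The table probabilities and both p-values can be reproduced with the hypergeometric distribution. This sketch fixes the margins of the Test/Control table (row totals 12 and 6, column totals 12 and 6) and enumerates the possible values of cell (1,1); variable names are illustrative.

```r
# Possible values of cell (1,1) given row totals (12, 6) and column totals (12, 6).
a_vals <- 6:12

# Hypergeometric probability of each table: 12 "Test" subjects, 6 "Control"
# subjects, and 12 subjects falling in the first outcome column.
probs <- dhyper(a_vals, m = 12, n = 6, k = 12)

p_obs <- probs[a_vals == 10]                       # observed table, cell (1,1) = 10
p_one <- sum(probs[a_vals >= 10])                  # one-tailed: tables at least as extreme
p_two <- sum(probs[probs <= p_obs * (1 + 1e-7)])   # two-tailed: tables no more likely than observed

print(round(setNames(probs, a_vals), 4))
print(c(p_obs = p_obs, p_one = p_one, p_two = p_two))
```

The two-tailed value is the one returned by `fisher.test(matrix(c(10, 2, 2, 4), 2, 2))`; the small factor `1 + 1e-7` only guards against floating-point ties.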
> fisher.test(matrix(c(7, 3, 5, 6), 2, 2), alternative = "greater")

	Fisher's Exact Test for Count Data

data:  matrix(c(7, 3, 5, 6), 2, 2)
p-value = 0.2449
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 0.4512625       Inf
sample estimates:
odds ratio
  2.661251

> matrix(c(7, 3, 5, 6), 2, 2)
     [,1] [,2]
[1,]    7    5
[2,]    3    6
McNemar Test: Matched pairs
data one;
  input hus_resp $ wif_resp $ no;
  datalines;
yes yes 20
yes no 5
no yes 10
no no 10
;
run;
proc freq;
  tables hus_resp*wif_resp / agree;   /* agree: McNemar's test and kappa */
  weight no;
run;
-> We do not reject "H0: the approval rates of husband and wife are the same".
Kappa = 1: perfect agreement; Kappa > 0.8: excellent agreement; Kappa > 0.4: moderate agreement
The confidence interval does not include 0 -> we reject the null hypothesis K = 0 at the 95% confidence level.
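The McNemar statistic and the kappa point estimate for the husband and wife data can be computed directly in R. This is a sketch: `correct = FALSE` requests the uncorrected statistic $(b - c)^2 / (b + c)$, and whether that matches the SAS output is an assumption, since the output itself is not shown in the slides. The kappa confidence interval is not computed here.

```r
# Paired responses: rows = husband, columns = wife.
tab <- matrix(c(20, 10, 5, 10), nrow = 2,
              dimnames = list(husband = c("yes", "no"), wife = c("yes", "no")))

# McNemar's test uses only the discordant pairs (yes/no and no/yes).
stat_by_hand <- (tab[1, 2] - tab[2, 1])^2 / (tab[1, 2] + tab[2, 1])
mc           <- mcnemar.test(tab, correct = FALSE)

# Cohen's kappa: observed agreement versus agreement expected by chance.
n     <- sum(tab)
p_o   <- sum(diag(tab)) / n
p_e   <- sum(rowSums(tab) * colSums(tab)) / n^2
kappa <- (p_o - p_e) / (1 - p_e)

print(c(mcnemar = stat_by_hand, p_value = mc$p.value, kappa = kappa))
```

The p-value is well above 0.05, in line with the conclusion that equal approval rates are not rejected.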
11.6 Relative risk, odds ratio, and Mantel-Haenszel statistics
• Observational study
• Prospective study
• Retrospective study

Relative risk (O = present, X = absent)

Risk \ Disease   O       X       Total
O                a       b       a + b
X                c       d       c + d
Total            a + c   b + d   n

• $RR = \dfrac{a/(a+b)}{c/(c+d)}$
• $s.e.(\ln RR) = \sqrt{\dfrac{1}{a} + \dfrac{1}{c} - \dfrac{1}{a+b} - \dfrac{1}{c+d}}$
• CI for $\ln RR$: $\ln RR \pm z_{1-\alpha/2}\cdot s.e.(\ln RR)$
• CI for $RR$: $e^{\ln RR \,\pm\, z_{1-\alpha/2}\cdot s.e.(\ln RR)} = RR\cdot e^{\pm z_{1-\alpha/2}\cdot s.e.(\ln RR)}$
Example 11.6.1

Table 11.6.2
Smoking \ Disease progress   Yes   No     Total
Yes                          26    380    406
No                           178   3386   3564
Total                        204   3766   3970

$$RR = \frac{26/406}{178/3564} = \frac{0.064}{0.050} = 1.28$$
$$1.2822\cdot e^{-1.96\sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}} = 0.861$$
$$1.2822\cdot e^{+1.96\sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}} = 1.910$$
The 95% CI (0.861, 1.910) includes 1.
data preg;
  input smoke $ preg $ count;
  cards;
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
;
proc freq order=data;
  weight count;
  tables smoke*preg / measures chisq;   /* measures: relative risk and odds ratio estimates */
run;
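The relative risk and its confidence interval for the smoking table (26, 380, 178, 3386) can be computed by hand in R. This sketch follows the log-scale interval; `qnorm(0.975)` is used in place of the rounded 1.96, which changes the limits only in the fourth decimal.

```r
# Cell counts: exposed with/without disease (a, b), unexposed with/without disease (cc, d).
a  <- 26;  b <- 380
cc <- 178; d <- 3386                  # 'cc' avoids masking the function c()

rr        <- (a / (a + b)) / (cc / (cc + d))                   # relative risk
se_log_rr <- sqrt(1 / a + 1 / cc - 1 / (a + b) - 1 / (cc + d)) # s.e. of ln(RR)

z  <- qnorm(0.975)                                             # about 1.96
ci <- rr * exp(c(-1, 1) * z * se_log_rr)                       # 95% CI for RR

print(c(rr = rr, lower = ci[1], upper = ci[2]))
```

Because the interval contains 1, the data do not show a significant difference in risk between the two groups at the 5% level.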
Odds ratio (O = present, X = absent)

Risk \ Sample   case    control   Total
O               a       b         a + b
X               c       d         c + d
Total           a + c   b + d     n

• Odds $= p/(1-p)$
• Odds in the case group: $\dfrac{a/(a+c)}{c/(a+c)} = \dfrac{a}{c}$
• Odds in the control group: $\dfrac{b/(b+d)}{d/(b+d)} = \dfrac{b}{d}$
• $OR = \dfrac{a/c}{b/d} = \dfrac{ad}{bc}$
• $s.e.(\ln OR) = \sqrt{\dfrac{1}{a} + \dfrac{1}{b} + \dfrac{1}{c} + \dfrac{1}{d}}$
• CI for $\ln OR$: $\ln OR \pm z_{1-\alpha/2}\cdot s.e.(\ln OR)$
• CI for $OR$: $e^{\ln OR \,\pm\, z_{1-\alpha/2}\cdot s.e.(\ln OR)} = OR\cdot e^{\pm z_{1-\alpha/2}\cdot s.e.(\ln OR)}$
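Example 11.6.2

Table 11.6.4
Smoking \ Obesity   Yes   No     Total
Yes                 52    352    404
No                  78    3486   3564
Total               130   3838   3968

$$OR = \frac{52\cdot 3486}{352\cdot 78} = 6.60$$
$$6.6023\cdot e^{-1.96\sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}} = 4.571$$
$$6.6023\cdot e^{+1.96\sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}} = 9.536$$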
data obe;
  input smoke $ case $ count;
  cards;
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
;
proc freq data=obe;
  weight count;
  tables smoke*case / measures;   /* measures: odds ratio with confidence limits */
run;
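The odds ratio and its confidence interval for the smoking and obesity table (52, 352, 78, 3486) follow the same pattern. The sketch below is a by-hand calculation on the log scale; `qnorm(0.975)` stands in for the rounded 1.96.

```r
# Cell counts of the 2x2 table, read row by row: (a, b) then (cc, d).
a  <- 52; b <- 352
cc <- 78; d <- 3486                   # 'cc' avoids masking the function c()

or        <- (a * d) / (b * cc)                       # odds ratio ad / bc
se_log_or <- sqrt(1 / a + 1 / b + 1 / cc + 1 / d)     # s.e. of ln(OR)

z  <- qnorm(0.975)                                    # about 1.96
ci <- or * exp(c(-1, 1) * z * se_log_or)              # 95% CI for OR

print(c(or = or, lower = ci[1], upper = ci[2]))
```

Here the whole interval lies above 1, so the association is significant at the 5% level.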
Mantel-Haenszel statistics
The confounding variable defines k strata.

Stratum i
Risk       case          control       Total
Exposed    $a_i$         $b_i$         $a_i + b_i$
Not        $c_i$         $d_i$         $c_i + d_i$
Total      $a_i + c_i$   $b_i + d_i$   $n_i$

Expected frequency in stratum i: $e_i = \dfrac{(a_i + b_i)(a_i + c_i)}{n_i}$
$$v_i = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{n_i^2\,(n_i - 1)}$$
$$\chi^2_{MH} = \frac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^2}{\sum_{i=1}^{k} v_i} \sim \chi^2_1$$
Common odds ratio:
$$OR_{MH} = \frac{\sum_{i=1}^{k} (a_i d_i / n_i)}{\sum_{i=1}^{k} (b_i c_i / n_i)}$$
Example 11.6.3

Age <= 55
Risk        OCAD patients   Control   Total
Exposed     21              11        32
Unexposed   16              6         22
Total       37              17        54

Age >= 56
Risk        OCAD patients   Control   Total
Exposed     50              14        64
Unexposed   18              6         24
Total       68              20        88
data one;
  input age risk case n;
  cards;
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
;
run;
proc freq;
  tables age*risk*case / measures cmh;   /* age is the stratifying variable */
  weight n;
run;
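The Mantel-Haenszel statistic and the common odds ratio for the two age strata of Example 11.6.3 can be computed from the formulas for $e_i$, $v_i$, $\chi^2_{MH}$ and $OR_{MH}$. This sketch does the calculation by hand and compares it with `mantelhaen.test` without continuity correction; the array layout and names are illustrative.

```r
# 2 x 2 x 2 array: risk (rows) by group (columns) by age stratum.
strata <- array(c(21, 16, 11, 6,     # age <= 55: columns case, control
                  50, 18, 14, 6),    # age >= 56
                dim = c(2, 2, 2),
                dimnames = list(risk  = c("Exposed", "Unexposed"),
                                group = c("case", "control"),
                                age   = c("<=55", ">=56")))

a  <- strata[1, 1, ]; b <- strata[1, 2, ]
cc <- strata[2, 1, ]; d <- strata[2, 2, ]     # 'cc' avoids masking c()
n  <- a + b + cc + d

e <- (a + b) * (a + cc) / n                                     # expected a_i per stratum
v <- (a + b) * (cc + d) * (a + cc) * (b + d) / (n^2 * (n - 1))  # variance of a_i per stratum

chi2_mh <- (sum(a) - sum(e))^2 / sum(v)       # Mantel-Haenszel chi-square, df = 1
or_mh   <- sum(a * d / n) / sum(b * cc / n)   # common odds ratio
p_value <- pchisq(chi2_mh, df = 1, lower.tail = FALSE)

ref <- mantelhaen.test(strata, correct = FALSE)
print(c(chi2_mh = chi2_mh, or_mh = or_mh, p_value = p_value))
```

For these data the common odds ratio is close to 1 and the statistic is far below the 5% critical value of 3.841, so exposure and disease show no significant association after stratifying by age.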
111 분포의 수리적 특징
bull 의 응용 (Usage)적합도 검정(Tests of Goodness-of-Fit)독립성 검정(Tests of Independence)동질성 검정(Tests of Homogeneity)
2
2
1
2 2
1
2
~ independent (01)
~
~ ( )
nn
i ni
i
Z Z N
Z
Yie g Z Y N
i
의 정의 (definition)
2
112 적합도 검정(Goodness-of- fit)
bull 우리의 data가 가설상의 분포(정규분포 이항분포 포아슨 분포 등)와 일치하는가
bull Data = theoretical distribution (normal binomial Poisson etc)
H0 정규분포를 따른다 vs H1 not H0
콜레스테롤 수치(mgdl) 대상자 수1-59 2
6-109 2
11-159 7
16-209 19
21-259 4
26-309 6
31-359 3
36-409 4
bull보기 1121(Normal distrsquon)
2
2 2
1
~k
i i
k ri i
O E
E
Oi E
i O Ei i
관측치(observed) 기대치(expected)
r 제약조건 ( )+추정하는 모수의 개수
restriction parameters estimated
interval expected rel freq expected freq
bull 119909 =35∙2+85∙2+⋯+385∙4
2+2+⋯+4= 2105319
bull 1199042 =35minus2105319 2∙2+ 85minus2105319 2∙2+⋯+ 385minus2105319 2∙4
2+2+⋯+4minus1
= 759482
119904 = 759482 = 871483
계급구간(interval)
표준화된 계급구간(standardized interval)
상대도수의 기대치(relative frequency)
기대도수(expected frequency)
lt1 001069 0502651~ 59 minus230104 003136 14740
6~109minus172731 008228 38672
11~159minus115357 015667 73637
16~209minus057984 021655 10178
21~259minus000610 021729 10213
26~309056763 015828 74393
31~359 114137 008370 3933736~409 171510 003212 15096
ge41 228884 001104 051909
19766
202877
P(Zltminus230104)
P(-230ltZltminus173)
P(172ltZlt229)
P(Zgt229)
119874119894 minus 1198641198942119864119894 =14762 gt qchisq(0055lowertail=F)= 11071
-gt Reject Ho data ~ normal
constraints ( 119864119894 = 119874119894 120583 = 119883 120590 = 119904)=3 -gt df=8minus3=5
계급구간 관측도수(119926119946) 기대도수(119916119946)119926119946 minus119916119946
120784119916119946
lt 1 0 050265 2760810-4
1-59 2 147406-109 2 38672
090156
11-159 7 73637001796
16-209 19 10178 7646621-259 4 10213 3779426-309 6 74393 02784831-359 3 39337 022162
36-4094 15096 19156
ge 41 0 051909Total 47 47 14762
4
197662
20287
보기 1122 이항분포 (binomial distrsquon)
H0 자료는 이항분포를 따른다 (적합도검정)
각 의사별 신약을 선호하는 환자의 수 의사의 수 환자의 수
0 5 0
1 6 6
2 8 16
3 10 30
4 10 40
5 15 75
6 17 102
7 10 70
8 10 80
9 9 81
10 이상 0 0
합 100 500
이항분포의 가정하에서 기대도수=기대상대도수총합
Expected freq under binomial distrsquon=probtotal
2525
( ) (1 ) 012 25
ˆ 500 2500 02
x xP X x p p xx
p
각 의사별 신약을 선호하는 환자의 수 의사의 수(119926119946) 기대 상대도수 기대도수(119916119946)
0 5 000378 037779
1 6 002361 23612
2 8 007083 70836
3 10 013577 13577
4 10 018668 18668
5 15 019602 19602
6 17 016335 16335
7 10 011084 11084
8 10 006235 62349
9 9 002944 29442
10 이상 0 001733 17332
합계 100 10000 10000
11 27390
1205682 =11minus27390 2
27390+8minus70836 2
70836+⋯+
0minus17332 2
17332= 47678 gt qchisq(00058lowertail=F)= 21955
We reject Ho Data ~ Binomial
df= 10 minus 2 = 8 constraints 119864119894 = 119874119894 119901 = 119901
예제 1123 포아슨분포 (Poisson distrsquon)
포아슨분포의 가정 하에서 상대도수의 기대치
Expected relative freq under Poisson distrsquon
(X ) 012
xeP x x
x
=3 known
H0 병원의 하루 응급환자의 수는 포아송 분포를 따른다
일일 응급환자 수 날짜 수
0 5
1 14
2 15
3 23
4 16
5 9
6 3
7 3
8 1
9 1
10 이상 0
합계 90
응급환자수 날짜 수(119926119946) 기대 상대도수 기대도수(119916119946)119926119946 minus 119916119946
120784
119916119946
0 5 004979 44808 006015
1 14 014936 13443 002312
2 15 022404 20164 132240
3 23 022404 20164 039895
4 16 016803 15123 005088
5 9 010082 90737 000060
6 3 005041 45368 052060
7 3 002160 19444 057313
8 1 000810 072914
0804829 1 000270 024305
10 이상 0 000110 009922
합계 90 1000 9000 3755
107142
1205682 = 119874119894minus119864119894
2
119864119894=5minus44808 2
44808+⋯+
2minus10714 2
10714= 3755 lt 1198832(095 119889119891 = 9 minus 1 = 8) = 15507
We cannot reject Ho Data ~ Poisson
2 22 2
9 1
(5 450) (2 108)3664 15557 (095)
450 108
113 독립성검정Tests of independence
bull 분할표(contingency table)
1205682 =
119894=1
119903
119895=1
119888119874119894119895 minus 119864119894119895
2
119864119894119895~1205942 119889119891 = 119903 minus 1 119888 minus 1 119864119894119895=
119899119894 ∙ 119899119895119899
두 번째 범주형 변수 첫 번째 범주형 변수
120783 120784 120785 ⋯ 119940 합계
120783 11989911 11989912 11989913 ⋯ 1198991119888 1198991
120784 11989921 11989922 11989923 ⋯ 1198992119888 1198992
120785 11989931 11989932 11989933 ⋯ 1198993119888 1198993
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
119955 1198991199031 1198991199032 1198991199033 ⋯ 119899119903119888 119899119903
합계 1198991 1198992 1198993 ⋯ 119899119888 119899
bull예제 1131
치료방법(treatment)
재발여부(relapse)합계(total)
Yes No
A 294 (77255) 921 (1137745) 1215
B 98 (188210) 2862 (2771790) 2960
C 50 (198002) 3064 (2915998) 3114
D 203 (181533) 2652 (2673467) 2855
합계 645 9499 10144
1205682 = 119874 minus 119864 2
119864
=294 minus 77255 2
77255+921 minus 1137745 2
1137745+⋯ = 81641
gt 1198832(095 119889119891 = 3) = 7815df= 119903 minus 1 119888 minus 1 = 4 minus 1 2 minus 1 = 3
Reject (Ho treatment and relapse are independent ) -gt They are not independent
gt datalt-astable(cbind(c(2949850203)c(921286230642652)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(YN))
gt data
re
trt Y N
A 294 921
B 98 2862
C 50 3064
D 203 2652
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 81641 df = 3 p-value lt 22e-16
data reinput trt $ re $ count cardsA Y 294 A N 921 B Y 98 B N 2862C Y 50 C N 3064D Y 203 D N 2652proc freq data=reweight counttables trtremeasures chisqrun
bull 작은 기대도수 (small expected freq)기대치 5미만의 cell수가 전체 20를 넘지 않으며 최소기대치가 1이상이면 무관하다 (If min gt1 and cells lt5 are less than 20 then not a problem)
bull 2Ⅹ2 분할표 (table)nlt20 or 20ltnlt49 그리고 기대도수 5이하 일경우에는 -test를 하지 말라
-test is not valid if nlt20 or (20ltnlt49) and expected freq of one or more cells lt 5
2
2
bull2Ⅹ2 table
1205682 =233(131∙36minus52∙14)2
145∙88∙183∙50= 317391 gtgt 1962
Strong evidence to reject (HoSmoking and drinking are independent)
두번째분류기준
첫번째 분류기준
120783 120784 합계
120783 119886 119887 119886 + 119887
120784 119888 119889 119888 + 119889
합계 119886 + 119888 119887 + 119889 119899
SmokingDrinking
Yes No total
Yes 131 52 183
No 14 36 50
Total 145 88 233
②두 집단의 확률에 대한 비교
(Comparing two probabilities)
1 1 2
0 1 2 1 2
1 1 2
1 2
ˆ100 60 120 040
60 100 40 12004909
100 120
060 040295469 196 significant
4903 5091 4903 5091
100 120
ˆ ( )
(1 ) (1 )
a
n p n
p
Z
H p p H p p
p p pZ
p p p p
n n
e g
2p
2p
Yates adjustment (보정)
bull 120568corrected2 =
119899( 119886119889minus119887119888 minus05119899)2
(119886+119888)(119887+119889)(119886+119887)(119888+119889)
bull 1205682 =233( 131∙36minus52∙14 minus05∙233)2
145∙88∙183∙50= 299118
114 동질성 검정 (homogeneity test)
bull 동질성 검정 각 각의 모집단에서 독립적으로 뽑은 표본들의 분포가 서로 동질의 것인가
bull Homogeneity test Are two samples selected from one population
bull 독립성 검정 한 모집단에서 표본 추출 행과 열의 합계는 조절이 아니고 우연히 나타난다
bull Independent test selected from a population Marginal totals are randomly determined
bull 독립성 검정 vs 동질성 검정
bull Independent test vs homogeneity test
bull예제 1141
bull 가설 Patient groups with on-set age lt=18 and age gt 18 have same distributions of family history
gt datalt-astable(cbind(c(28194153)c(35384460)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(EarlyLater))
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 36216 df = 3 p-value = 03053
-gt Do not reject Ho
0H
Family History lt=18 gt 18 Total
A 28 35 63
B 19 38 57
C 41 44 85
D 53 60 113
합계 141 177 318
gt datare
trt Early LaterA 28 35B 19 38C 41 44D 53 60
동질성 검정과 모비율 검정
1198670 1199011 = 1199012 119907119904 119867119860 ∶ 1199011 ne 1199012 1198991 = 100 1199011 = 060 1198992 = 120 1199012 = 040
119911 = 1199011minus 1199012 minus( 1199011minus 1199012)0 119901(1minus 119901)
1198991+ 119901(1minus 119901)
1198992
119901 =060∙100+040∙120
100+120=108
220= 049091
119911 =060 minus 040
049091 ∙ 050909100 +
049091 ∙ 050909120
= 295468
1205682 =220 ∙ [60 ∙ 72 minus 40 ∙ 48]2
108 ∙ 112 ∙ 100 ∙ 120= 87302
-gt Reject Ho
표본특성
1 2 합계1 60 40 100
2 48 72 120
합계 108 112 220
data severe
input treat $ outcome $ count
cards
Test f 10
Test u 2
Control f 2
Control u 4
proc freq order=data
tables treatoutcome chisq nocol
weight count
run
Fisherrsquos Exact Test
SAS 시스템
FREQ 프로시저
treat outcome 교차표
treat outcome
빈도|백분율|
행 백분율|f |u | 총합-----------+--------+--------+
Test | 10 | 2 | 12
| 5556 | 1111 | 6667
| 8333 | 1667 |
-----------+--------+--------+
Control | 2 | 4 | 6
| 1111 | 2222 | 3333
| 3333 | 6667 |
-----------+--------+--------+
총합 12 6 18
6667 3333 10000
treat outcome 테이블에 대한 통계량
통계량 자유도 값 확률값----------------------------------------------------------카이제곱 1 45000 00339우도비 카이제곱 1 44629 00346연속성 수정 카이제곱 1 25313 01116Mantel-Haenszel 카이제곱 1 42500 00393파이 계수 05000분할 계수 04472크래머의 V 05000
경고 셀들의 75가 5보다 작은 기대도수를 가지고 있습니다카이제곱 검정은 올바르지 않을 수 있습니다
Fisher의 정확 검정----------------------------(11) 셀 빈도(F) 10하단측 p값 Pr lt= F 09961상단측 p값 Pr gt= F 00573
테이블 확률 (P) 00533양측 p값 Pr lt= P 01070
표본 크기 = 18
Exact Test
Table Cell
(11) (12) (21) (22) Prob
12 0 0 6 0001
11 1 1 5 0039
10 2 2 4 0533
9 3 3 3 2370
8 4 4 2 4000
7 5 5 1 2560
6 6 6 0 0498
=12 12 6 6
10 2 2 4 18
Table Probabilities
bull One-tailed p-value
bull Two-tailed p-value
00533 00039 00001 00573p
00533 00039 00001 00498 01071p
H0 두 변수는 서로 독립(동질)이다 vs H1 not H0
gt fishertest(matrix(c(7356)22)alternative=greater)
Fishers Exact Test for Count Data
data matrix(c(7 3 5 6) 2 2)
p-value = 02449
alternative hypothesis true odds ratio is greater than 1
95 percent confidence interval
04512625 Inf
sample estimates
odds ratio
2661251
gt matrix(c(7356)22)
[1] [2]
[1] 7 5
[2] 3 6
McNemar Test Matched pairs
data one
input hus_resp $ wif_resp $ no
datalines
yes yes 20
yes no 5
no yes 10
no no 10
run
proc freq
tables hus_respwif_resp agree
weight no
run
ldquoHo husband and wife 의 approval rates는 같다rdquo를 기각하지 못함
We do not reject ldquoHo approval rates of husband and wife are the samerdquo
신뢰구간이 0을 포함하지 않으므로 K=0 이라는 귀무가설을 95 신뢰수준에서 기각한다
Kappa=1 gtgt perfect agreement Kappa gt 08 gtgt excellent agreement Kappa gt 04 gtgt moderate agreement
CI does not include 0 -gt we reject the null hypo of K=0 by 95 confidence level
116 Relative risk odds ratio and Mantel-Haenszel statistics
bull 관찰연구 (observational study)
bull 전향적 연구 (prospective study)
bull 후향적 연구 (retrospective study)
상대위험도 (Relative risk)Disease
Risk O X
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull 119877119877 =119886
119886+119887119888
119888+119889
bull s e ln 119877119877 =1
119886+1
119888minus
1
119886+119887+
1
119888+119889
bull ln 119877119877 plusmn 1199111minus1205722 ∙ s e ln 119877119877
bull 119890ln 119877119877 plusmn119911
1minus1205722∙se ln 119877119877
= 119877119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119877119877
예제 1161 Relative risk odds ratio and Mantel-Haenszel statistics
119877119877 =26406
1783564=0064
0050= 128
12822 ∙ 119890^ minus196 ∙1
26+
1
178minus
1
406+
1
3564= 0861
12822 ∙ 119890^ 196 ∙1
26+
1
178minus
1
406+
1
3564= 1910
95 CI includes 1
Smoking Disease progress
Yes NoYes 26 380 406
No 178 3386 3564
204 3766 3970
표1162
data preg
input smoke $ preg $ count
cards
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
proc freq order=data
weight count
tables smokepregmeasures chisq
run
오즈비 (Odds Ratio)sample
Risk case control
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull Odds = p(1-p)
bull 환자집단 오즈 [119886(119886 + 119888)][119888(119886 + 119888)] = 119886119888
bull 정상집단 오즈 [119887(119887 + 119889)][119889(119887 + 119889)] = 119887119889
bull 119874119877 =119886
119888119887
119889
=119886119889
119887119888s e ln 119874119877 =
1
119886+1
119887+1
119888+1
119889
CI ln 119874119877 plusmn 1199111minus1205722 ∙ s e ln 119874119877
bull 119890ln 119874119877 plusmn119911
1minus1205722∙se ln OR
=
= 119874119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119874119877
예제 1162
119874119877 =52 ∙ 3486
352 ∙ 78= 660
66023 ∙ 119890^ minus196 ∙1
52+
1
352+1
78+
1
3486= 4571
66023 ∙ 119890^ 196 ∙1
52+
1
352+1
78+
1
3486= 9536
Smoking
Obesity
Yes No
Yes 52 352 404
No 78 3486 3564
130 3838 3968
표1164
data obe
input smoke $ case $ count
cards
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
proc freq data=obe
weight count
tables smokecasemeasures
run
Mantel-Haenszel 통계량
교란변수가 k개의 층
층 i의 기대도수 119890119894=119886119894+119887119894 119886119894+119888119894
119899119894
119907119894 =119886119894 + 119887119894 119888119894 + 119889119894 119886119894 + 119888119894 119887119894 + 119889119894
1198991198942(119899119894 minus 1)
1205941198721198672 =
( 119894=1119896 119886119894 minus 119894=1
119896 119890119894)2
119894=1119896 119907119894
~12059412 119874119877119872119867 =
119894=1119896 (119886119894119889119894119899119894)
119894=1119896 (119887119894119888119894119899119894)
공통오즈비
Risk
Strata =i
case control
Exposed 119886119894 119887119894 119886119894 + 119887119894
Not 119888119894 119889119894 119888119894 + 119889119894
119886119894 + 119888119894 119887119894 + 119889119894 119899119894
예제 1163
Age lt= 55
Risk OCAD patients control Total
Exposed 21 11 32
Unexposed 16 6 22
합계 37 17 54
Age gt= 56
Risk OCAD 환자 정상 합계
Exposed 50 14 64
Unexposed 18 6 24
68 20 88
data one
input age risk case n
cards
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
run
proc freq
tables ageriskcasemeasures CMH
weight n
run
112 적합도 검정(Goodness-of- fit)
bull 우리의 data가 가설상의 분포(정규분포 이항분포 포아슨 분포 등)와 일치하는가
bull Data = theoretical distribution (normal binomial Poisson etc)
H0 정규분포를 따른다 vs H1 not H0
콜레스테롤 수치(mgdl) 대상자 수1-59 2
6-109 2
11-159 7
16-209 19
21-259 4
26-309 6
31-359 3
36-409 4
bull보기 1121(Normal distrsquon)
2
2 2
1
~k
i i
k ri i
O E
E
Oi E
i O Ei i
관측치(observed) 기대치(expected)
r 제약조건 ( )+추정하는 모수의 개수
restriction parameters estimated
interval expected rel freq expected freq
bull 119909 =35∙2+85∙2+⋯+385∙4
2+2+⋯+4= 2105319
bull 1199042 =35minus2105319 2∙2+ 85minus2105319 2∙2+⋯+ 385minus2105319 2∙4
2+2+⋯+4minus1
= 759482
119904 = 759482 = 871483
계급구간(interval)
표준화된 계급구간(standardized interval)
상대도수의 기대치(relative frequency)
기대도수(expected frequency)
lt1 001069 0502651~ 59 minus230104 003136 14740
6~109minus172731 008228 38672
11~159minus115357 015667 73637
16~209minus057984 021655 10178
21~259minus000610 021729 10213
26~309056763 015828 74393
31~359 114137 008370 3933736~409 171510 003212 15096
ge41 228884 001104 051909
19766
202877
P(Zltminus230104)
P(-230ltZltminus173)
P(172ltZlt229)
P(Zgt229)
119874119894 minus 1198641198942119864119894 =14762 gt qchisq(0055lowertail=F)= 11071
-gt Reject Ho data ~ normal
constraints ( 119864119894 = 119874119894 120583 = 119883 120590 = 119904)=3 -gt df=8minus3=5
계급구간 관측도수(119926119946) 기대도수(119916119946)119926119946 minus119916119946
120784119916119946
lt 1 0 050265 2760810-4
1-59 2 147406-109 2 38672
090156
11-159 7 73637001796
16-209 19 10178 7646621-259 4 10213 3779426-309 6 74393 02784831-359 3 39337 022162
36-4094 15096 19156
ge 41 0 051909Total 47 47 14762
4
197662
20287
보기 1122 이항분포 (binomial distrsquon)
H0 자료는 이항분포를 따른다 (적합도검정)
각 의사별 신약을 선호하는 환자의 수 의사의 수 환자의 수
0 5 0
1 6 6
2 8 16
3 10 30
4 10 40
5 15 75
6 17 102
7 10 70
8 10 80
9 9 81
10 이상 0 0
합 100 500
이항분포의 가정하에서 기대도수=기대상대도수총합
Expected freq under binomial distrsquon=probtotal
2525
( ) (1 ) 012 25
ˆ 500 2500 02
x xP X x p p xx
p
각 의사별 신약을 선호하는 환자의 수 의사의 수(119926119946) 기대 상대도수 기대도수(119916119946)
0 5 000378 037779
1 6 002361 23612
2 8 007083 70836
3 10 013577 13577
4 10 018668 18668
5 15 019602 19602
6 17 016335 16335
7 10 011084 11084
8 10 006235 62349
9 9 002944 29442
10 이상 0 001733 17332
합계 100 10000 10000
11 27390
1205682 =11minus27390 2
27390+8minus70836 2
70836+⋯+
0minus17332 2
17332= 47678 gt qchisq(00058lowertail=F)= 21955
We reject Ho Data ~ Binomial
df= 10 minus 2 = 8 constraints 119864119894 = 119874119894 119901 = 119901
예제 1123 포아슨분포 (Poisson distrsquon)
포아슨분포의 가정 하에서 상대도수의 기대치
Expected relative freq under Poisson distrsquon
(X ) 012
xeP x x
x
=3 known
H0 병원의 하루 응급환자의 수는 포아송 분포를 따른다
일일 응급환자 수 날짜 수
0 5
1 14
2 15
3 23
4 16
5 9
6 3
7 3
8 1
9 1
10 이상 0
합계 90
응급환자수 날짜 수(119926119946) 기대 상대도수 기대도수(119916119946)119926119946 minus 119916119946
120784
119916119946
0 5 004979 44808 006015
1 14 014936 13443 002312
2 15 022404 20164 132240
3 23 022404 20164 039895
4 16 016803 15123 005088
5 9 010082 90737 000060
6 3 005041 45368 052060
7 3 002160 19444 057313
8 1 000810 072914
0804829 1 000270 024305
10 이상 0 000110 009922
합계 90 1000 9000 3755
107142
1205682 = 119874119894minus119864119894
2
119864119894=5minus44808 2
44808+⋯+
2minus10714 2
10714= 3755 lt 1198832(095 119889119891 = 9 minus 1 = 8) = 15507
We cannot reject Ho Data ~ Poisson
2 22 2
9 1
(5 450) (2 108)3664 15557 (095)
450 108
113 독립성검정Tests of independence
bull 분할표(contingency table)
1205682 =
119894=1
119903
119895=1
119888119874119894119895 minus 119864119894119895
2
119864119894119895~1205942 119889119891 = 119903 minus 1 119888 minus 1 119864119894119895=
119899119894 ∙ 119899119895119899
두 번째 범주형 변수 첫 번째 범주형 변수
120783 120784 120785 ⋯ 119940 합계
120783 11989911 11989912 11989913 ⋯ 1198991119888 1198991
120784 11989921 11989922 11989923 ⋯ 1198992119888 1198992
120785 11989931 11989932 11989933 ⋯ 1198993119888 1198993
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
119955 1198991199031 1198991199032 1198991199033 ⋯ 119899119903119888 119899119903
합계 1198991 1198992 1198993 ⋯ 119899119888 119899
bull예제 1131
치료방법(treatment)
재발여부(relapse)합계(total)
Yes No
A 294 (77255) 921 (1137745) 1215
B 98 (188210) 2862 (2771790) 2960
C 50 (198002) 3064 (2915998) 3114
D 203 (181533) 2652 (2673467) 2855
합계 645 9499 10144
1205682 = 119874 minus 119864 2
119864
=294 minus 77255 2
77255+921 minus 1137745 2
1137745+⋯ = 81641
gt 1198832(095 119889119891 = 3) = 7815df= 119903 minus 1 119888 minus 1 = 4 minus 1 2 minus 1 = 3
Reject (Ho treatment and relapse are independent ) -gt They are not independent
gt datalt-astable(cbind(c(2949850203)c(921286230642652)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(YN))
gt data
re
trt Y N
A 294 921
B 98 2862
C 50 3064
D 203 2652
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 81641 df = 3 p-value lt 22e-16
data reinput trt $ re $ count cardsA Y 294 A N 921 B Y 98 B N 2862C Y 50 C N 3064D Y 203 D N 2652proc freq data=reweight counttables trtremeasures chisqrun
bull 작은 기대도수 (small expected freq)기대치 5미만의 cell수가 전체 20를 넘지 않으며 최소기대치가 1이상이면 무관하다 (If min gt1 and cells lt5 are less than 20 then not a problem)
bull 2Ⅹ2 분할표 (table)nlt20 or 20ltnlt49 그리고 기대도수 5이하 일경우에는 -test를 하지 말라
-test is not valid if nlt20 or (20ltnlt49) and expected freq of one or more cells lt 5
2
2
bull2Ⅹ2 table
1205682 =233(131∙36minus52∙14)2
145∙88∙183∙50= 317391 gtgt 1962
Strong evidence to reject (HoSmoking and drinking are independent)
두번째분류기준
첫번째 분류기준
120783 120784 합계
120783 119886 119887 119886 + 119887
120784 119888 119889 119888 + 119889
합계 119886 + 119888 119887 + 119889 119899
SmokingDrinking
Yes No total
Yes 131 52 183
No 14 36 50
Total 145 88 233
②두 집단의 확률에 대한 비교
(Comparing two probabilities)
1 1 2
0 1 2 1 2
1 1 2
1 2
ˆ100 60 120 040
60 100 40 12004909
100 120
060 040295469 196 significant
4903 5091 4903 5091
100 120
ˆ ( )
(1 ) (1 )
a
n p n
p
Z
H p p H p p
p p pZ
p p p p
n n
e g
2p
2p
Yates' continuity correction
• \[ \chi^2_{\text{corrected}} = \frac{n\,(|ad - bc| - 0.5n)^2}{(a + c)(b + d)(a + b)(c + d)} \]
• \[ \chi^2_{\text{corrected}} = \frac{233\,(|131 \cdot 36 - 52 \cdot 14| - 0.5 \cdot 233)^2}{145 \cdot 88 \cdot 183 \cdot 50} = 29.9118 \]
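The corrected statistic can be checked in R as well. The sketch below evaluates the Yates formula directly for the smoking-by-drinking table and compares it with `chisq.test(..., correct = TRUE)`; the variable names are illustrative choices, not taken from the slides.

```r
# Smoking-by-drinking counts (rows: smoking Yes/No, columns: drinking Yes/No)
tab <- matrix(c(131, 14, 52, 36), nrow = 2,
              dimnames = list(smoking = c("Yes", "No"),
                              drinking = c("Yes", "No")))

n <- sum(tab)
ad_bc <- tab[1, 1] * tab[2, 2] - tab[1, 2] * tab[2, 1]

# Yates-corrected statistic: n (|ad - bc| - n/2)^2 / (product of the margins)
x2_yates <- n * (abs(ad_bc) - 0.5 * n)^2 /
  (prod(rowSums(tab)) * prod(colSums(tab)))

# chisq.test() applies the same correction to 2 x 2 tables when correct = TRUE
x2_builtin <- unname(chisq.test(tab, correct = TRUE)$statistic)

print(round(c(formula = x2_yates, builtin = x2_builtin), 4))
```

The correction lowers the statistic slightly (from about 31.74 to about 29.91 here), which matters mainly when the sample is small.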
11.4 Homogeneity test
• Homogeneity test: samples are drawn independently from each of several populations. Are their distributions the same, that is, could the samples have come from one population?
• Independence test: a single sample is drawn from one population. The row and column (marginal) totals are not fixed in advance; they are determined by chance.
• Independence test vs. homogeneity test
• Example 11.4.1
• H0: Patient groups with onset age <= 18 and onset age > 18 have the same distribution of family history.

Family history   Onset age <= 18   Onset age > 18   Total
A                28                35               63
B                19                38               57
C                41                44               85
D                53                60               113
Total            141               177              318

> data <- as.table(cbind(c(28, 19, 41, 53), c(35, 38, 44, 60)))
> dimnames(data) <- list(trt = c("A", "B", "C", "D"), re = c("Early", "Later"))
> data
   re
trt Early Later
  A    28    35
  B    19    38
  C    41    44
  D    53    60
> chisq.test(data)

        Pearson's Chi-squared test

data:  data
X-squared = 3.6216, df = 3, p-value = 0.3053
-> Do not reject H0
Homogeneity test and the test of two proportions

\[ H_0: p_1 = p_2 \quad \text{vs} \quad H_A: p_1 \ne p_2, \qquad n_1 = 100,\ \hat{p}_1 = 0.60,\ n_2 = 120,\ \hat{p}_2 = 0.40 \]
\[ z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n_1} + \dfrac{\hat{p}(1 - \hat{p})}{n_2}}} \]
\[ \hat{p} = \frac{0.60 \cdot 100 + 0.40 \cdot 120}{100 + 120} = \frac{108}{220} = 0.49091 \]
\[ z = \frac{0.60 - 0.40}{\sqrt{\dfrac{0.49091 \cdot 0.50909}{100} + \dfrac{0.49091 \cdot 0.50909}{120}}} = 2.95468 \]
\[ \chi^2 = \frac{220 \cdot [60 \cdot 72 - 40 \cdot 48]^2}{108 \cdot 112 \cdot 100 \cdot 120} = 8.7302 \]
-> Reject H0

Sample   Characteristic 1   Characteristic 2   Total
1        60                 40                 100
2        48                 72                 120
Total    108                112                220
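The two computations above test the same hypothesis, and the statistics are linked by z² = χ² (2.95468² ≈ 8.7302). A short R sketch, using only the counts from the table and illustrative variable names, makes the equivalence explicit:

```r
# Counts from the 2 x 2 table: sample 1 has 60 of 100, sample 2 has 48 of 120.
n1 <- 100; x1 <- 60
n2 <- 120; x2 <- 48

p1 <- x1 / n1
p2 <- x2 / n2
p_pool <- (x1 + x2) / (n1 + n2)            # pooled estimate under H0

# Two-sample z statistic with the pooled standard error
z <- (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

# Pearson chi-square on the same table, without continuity correction
tab <- rbind(sample1 = c(x1, n1 - x1), sample2 = c(x2, n2 - x2))
chisq_stat <- unname(chisq.test(tab, correct = FALSE)$statistic)

print(round(c(z = z, z_squared = z^2, chi_square = chisq_stat), 5))
```

Because the two statistics agree, the two-sided z-test and the 1-df chi-square test always give the same p-value for a 2×2 table.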
Fisher's Exact Test

data severe;
  input treat $ outcome $ count;
  cards;
Test f 10
Test u 2
Control f 2
Control u 4
;
proc freq order=data;
  tables treat*outcome / chisq nocol;
  weight count;
run;
The SAS System
The FREQ Procedure

Table of treat by outcome

treat      outcome
Frequency|
Percent  |
Row Pct  |f       |u       |  Total
---------+--------+--------+
Test     |     10 |      2 |     12
         |  55.56 |  11.11 |  66.67
         |  83.33 |  16.67 |
---------+--------+--------+
Control  |      2 |      4 |      6
         |  11.11 |  22.22 |  33.33
         |  33.33 |  66.67 |
---------+--------+--------+
Total          12        6       18
            66.67    33.33   100.00

Statistics for Table of treat by outcome

Statistic                      DF    Value     Prob
---------------------------------------------------
Chi-Square                      1    4.5000    0.0339
Likelihood Ratio Chi-Square     1    4.4629    0.0346
Continuity Adj. Chi-Square      1    2.5313    0.1116
Mantel-Haenszel Chi-Square      1    4.2500    0.0393
Phi Coefficient                      0.5000
Contingency Coefficient              0.4472
Cramer's V                           0.5000

WARNING: 75% of the cells have expected counts less than 5. Chi-Square may not be a valid test.

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)        10
Left-sided Pr <= F          0.9961
Right-sided Pr >= F         0.0573

Table Probability (P)       0.0533
Two-sided Pr <= P           0.1070

Sample Size = 18
Exact Test: Table Probabilities

Table cell
(1,1)   (1,2)   (2,1)   (2,2)   Prob
12      0       0       6       .0001
11      1       1       5       .0039
10      2       2       4       .0533
9       3       3       3       .2370
8       4       4       2       .4000
7       5       5       1       .2560
6       6       6       0       .0498

Probability of the observed table:
\[ P = \frac{12!\,12!\,6!\,6!}{10!\,2!\,2!\,4!\,18!} = 0.0533 \]
• One-tailed p-value: \[ p = 0.0533 + 0.0039 + 0.0001 = 0.0573 \]
• Two-tailed p-value: \[ p = 0.0533 + 0.0039 + 0.0001 + 0.0498 = 0.1071 \]
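With all margins fixed, the count in cell (1,1) follows a hypergeometric distribution, so the table probabilities can be regenerated with `dhyper()`. The sketch below is one way to do this in R; the variable names are illustrative, and the small tolerance factor used for the two-sided sum is an assumption that mirrors the usual "as or less probable than the observed table" definition.

```r
# Margins of the Test/Control table: row totals 12 and 6, column totals 12 and 6.
x <- 6:12                                   # possible values of cell (1,1)
prob <- dhyper(x, m = 12, n = 6, k = 12)    # P(cell (1,1) = x) with fixed margins
names(prob) <- x
print(round(prob, 4))

p_obs <- prob["10"]                         # probability of the observed table

# One-tailed p-value: tables at least as extreme in the observed direction
p_one <- sum(prob[x >= 10])

# Two-tailed p-value: all tables no more probable than the observed one
p_two <- sum(prob[prob <= p_obs * (1 + 1e-7)])

print(round(c(one_tailed = p_one, two_tailed = p_two), 4))

# Cross-check with the built-in exact test (two-sided by default)
print(fisher.test(matrix(c(10, 2, 2, 4), nrow = 2))$p.value)
```

Summing unrounded probabilities gives a two-tailed value of about 0.1070, as in the SAS output; the 0.1071 on the slide comes from adding the rounded table entries.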
H0: the two variables are independent (homogeneous) vs H1: not H0

> matrix(c(7, 3, 5, 6), 2, 2)
     [,1] [,2]
[1,]    7    5
[2,]    3    6
> fisher.test(matrix(c(7, 3, 5, 6), 2, 2), alternative = "greater")

        Fisher's Exact Test for Count Data

data:  matrix(c(7, 3, 5, 6), 2, 2)
p-value = 0.2449
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 0.4512625       Inf
sample estimates:
odds ratio
  2.661251
McNemar Test: Matched pairs

data one;
  input hus_resp $ wif_resp $ no;
  datalines;
yes yes 20
yes no 5
no yes 10
no no 10
;
run;
proc freq;
  tables hus_resp*wif_resp / agree;
  weight no;
run;

We do not reject "H0: the approval rates of husband and wife are the same".
Kappa = 1: perfect agreement; Kappa > 0.8: excellent agreement; Kappa > 0.4: moderate agreement.
The confidence interval for kappa does not include 0 -> we reject the null hypothesis K = 0 at the 95% confidence level.
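For readers working in R rather than SAS, the matched-pairs analysis can be sketched as follows. The uncorrected McNemar statistic uses only the discordant pairs, (b − c)²/(b + c), and the kappa point estimate compares observed with chance-expected agreement. The variable names are illustrative, and no confidence interval for kappa is computed here, since its standard error is not given on the slide.

```r
# Husband (rows) by wife (columns) responses from the SAS data step
tab <- matrix(c(20, 10, 5, 10), nrow = 2,
              dimnames = list(husband = c("yes", "no"),
                              wife = c("yes", "no")))

# McNemar test without continuity correction: (5 - 10)^2 / (5 + 10)
mc <- mcnemar.test(tab, correct = FALSE)
print(mc$statistic)
print(mc$p.value)

# Cohen's kappa point estimate
n <- sum(tab)
p_obs <- sum(diag(tab)) / n                       # observed agreement
p_exp <- sum(rowSums(tab) * colSums(tab)) / n^2   # agreement expected by chance
kappa_hat <- (p_obs - p_exp) / (1 - p_exp)
print(kappa_hat)
```

The p-value is well above 0.05, which agrees with the conclusion that equal approval rates are not rejected.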
11.6 Relative risk, odds ratio, and Mantel-Haenszel statistics
• Observational study
• Prospective study
• Retrospective study

Relative risk (O = present, X = absent)

Risk factor   Disease: O   Disease: X   Total
O             a            b            a + b
X             c            d            c + d
Total         a + c        b + d        n

• \[ RR = \frac{a/(a + b)}{c/(c + d)} \]
• \[ \mathrm{s.e.}(\ln RR) = \sqrt{\frac{1}{a} + \frac{1}{c} - \frac{1}{a + b} - \frac{1}{c + d}} \]
• Confidence interval for ln RR: \[ \ln RR \pm z_{1 - \alpha/2} \cdot \mathrm{s.e.}(\ln RR) \]
• Confidence interval for RR: \[ e^{\ln RR \pm z_{1 - \alpha/2} \cdot \mathrm{s.e.}(\ln RR)} = RR \cdot e^{\pm z_{1 - \alpha/2} \cdot \mathrm{s.e.}(\ln RR)} \]
Example 11.6.1

Table 11.6.2
Smoking   Disease progress: Yes   Disease progress: No   Total
Yes       26                      380                    406
No        178                     3386                   3564
Total     204                     3766                   3970

\[ RR = \frac{26/406}{178/3564} = \frac{0.064}{0.050} = 1.28 \]
\[ \text{Lower limit} = 1.2822 \cdot e^{-1.96 \sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}} = 0.861 \]
\[ \text{Upper limit} = 1.2822 \cdot e^{1.96 \sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}} = 1.910 \]
The 95% CI includes 1.

data preg;
  input smoke $ preg $ count;
  cards;
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
;
proc freq order=data;
  weight count;
  tables smoke*preg / measures chisq;
run;
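The relative-risk interval of Example 11.6.1 can be reproduced with a few lines of R. This is a sketch under the standard-error formula sqrt(1/a + 1/c − 1/(a+b) − 1/(c+d)), which is the version that reproduces the quoted limits 0.861 and 1.910; the function name `rel_risk` is a hypothetical helper, not something from the slides or from SAS.

```r
# Relative risk with a log-scale confidence interval.
# rel_risk() is a hypothetical helper name used only for this sketch.
rel_risk <- function(a, b, c, d, conf = 0.95) {
  rr <- (a / (a + b)) / (c / (c + d))
  se <- sqrt(1 / a + 1 / c - 1 / (a + b) - 1 / (c + d))   # s.e. of ln(RR)
  z  <- qnorm(1 - (1 - conf) / 2)
  list(rr = rr, se = se,
       lower = rr * exp(-z * se),
       upper = rr * exp(z * se))
}

# Smokers: 26 of 406 with disease progress; non-smokers: 178 of 3564
res <- rel_risk(26, 380, 178, 3386)
print(round(unlist(res), 4))

# TRUE when the interval contains 1, i.e. no significant difference in risk
print(res$lower < 1 && res$upper > 1)
```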
Odds ratio (O = present, X = absent)

Risk factor   Sample: case   Sample: control   Total
O             a              b                 a + b
X             c              d                 c + d
Total         a + c          b + d             n

• Odds = p/(1 - p)
• Odds in the case group: \[ \frac{a/(a + c)}{c/(a + c)} = \frac{a}{c} \]
• Odds in the control group: \[ \frac{b/(b + d)}{d/(b + d)} = \frac{b}{d} \]
• \[ OR = \frac{a/c}{b/d} = \frac{ad}{bc}, \qquad \mathrm{s.e.}(\ln OR) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} \]
• Confidence interval for ln OR: \[ \ln OR \pm z_{1 - \alpha/2} \cdot \mathrm{s.e.}(\ln OR) \]
• Confidence interval for OR: \[ e^{\ln OR \pm z_{1 - \alpha/2} \cdot \mathrm{s.e.}(\ln OR)} = OR \cdot e^{\pm z_{1 - \alpha/2} \cdot \mathrm{s.e.}(\ln OR)} \]
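Example 11.6.2

Table 11.6.4
Smoking   Obesity: Yes   Obesity: No   Total
Yes       52             352           404
No        78             3486          3564
Total     130            3838          3968

\[ OR = \frac{52 \cdot 3486}{352 \cdot 78} = 6.60 \]
\[ \text{Lower limit} = 6.6023 \cdot e^{-1.96 \sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}} = 4.571 \]
\[ \text{Upper limit} = 6.6023 \cdot e^{1.96 \sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}} = 9.536 \]

data obe;
  input smoke $ case $ count;
  cards;
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
;
proc freq data=obe;
  weight count;
  tables smoke*case / measures;
run;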
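The odds-ratio interval of Example 11.6.2 follows the same pattern as the relative risk, with the standard error sqrt(1/a + 1/b + 1/c + 1/d). A minimal R sketch, in which `odds_ratio` is a hypothetical helper name introduced only for this illustration:

```r
# Odds ratio with a log-scale (Woolf-type) confidence interval.
# odds_ratio() is a hypothetical helper name used only for this sketch.
odds_ratio <- function(a, b, c, d, conf = 0.95) {
  or <- (a * d) / (b * c)
  se <- sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # s.e. of ln(OR)
  z  <- qnorm(1 - (1 - conf) / 2)
  list(or = or, se = se,
       lower = or * exp(-z * se),
       upper = or * exp(z * se))
}

# Smoking-by-obesity counts from Table 11.6.4
res <- odds_ratio(52, 352, 78, 3486)
print(round(unlist(res), 4))

# The interval lies entirely above 1, so the association is significant at 5%
print(res$lower > 1)
```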
Mantel-Haenszel statistics

The confounding variable has k strata. For stratum i:

Risk factor   case        control     Total
Exposed       a_i         b_i         a_i + b_i
Not exposed   c_i         d_i         c_i + d_i
Total         a_i + c_i   b_i + d_i   n_i

Expected frequency in stratum i:
\[ e_i = \frac{(a_i + b_i)(a_i + c_i)}{n_i} \]
\[ v_i = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{n_i^2 (n_i - 1)} \]
\[ \chi^2_{MH} = \frac{\left( \sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i \right)^2}{\sum_{i=1}^{k} v_i} \sim \chi^2_1 \]
Common odds ratio:
\[ OR_{MH} = \frac{\sum_{i=1}^{k} (a_i d_i / n_i)}{\sum_{i=1}^{k} (b_i c_i / n_i)} \]
Example 11.6.3

Age <= 55
Risk factor   OCAD patients   Controls   Total
Exposed       21              11         32
Unexposed     16              6          22
Total         37              17         54

Age >= 56
Risk factor   OCAD patients   Controls   Total
Exposed       50              14         64
Unexposed     18              6          24
Total         68              20         88

data one;
  input age risk case n;
  cards;
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
;
run;
proc freq;
  tables age*risk*case / measures CMH;
  weight n;
run;
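The Mantel-Haenszel quantities for Example 11.6.3 can also be computed directly from the formulas for e_i, v_i, the chi-square statistic, and the common odds ratio. The R sketch below does this and compares the result with `mantelhaen.test()`; the continuity correction is turned off so that the built-in statistic matches the uncorrected formula, and the array layout and variable names are choices made for this illustration.

```r
# 2 x 2 x 2 array: risk (rows) by group (columns) by age stratum,
# filled column by column within each stratum.
strata <- array(c(21, 16, 11, 6,
                  50, 18, 14, 6),
                dim = c(2, 2, 2),
                dimnames = list(risk = c("exposed", "unexposed"),
                                group = c("case", "control"),
                                age = c("<=55", ">=56")))

a  <- strata[1, 1, ]; b <- strata[1, 2, ]
cc <- strata[2, 1, ]; d <- strata[2, 2, ]   # 'cc' holds the c_i cells
n  <- a + b + cc + d

e <- (a + b) * (a + cc) / n                                   # expected a_i
v <- (a + b) * (cc + d) * (a + cc) * (b + d) / (n^2 * (n - 1)) # variance of a_i

chisq_mh <- (sum(a) - sum(e))^2 / sum(v)      # Mantel-Haenszel chi-square, 1 df
or_mh    <- sum(a * d / n) / sum(b * cc / n)  # common odds ratio

print(round(c(chisq_mh = chisq_mh, or_mh = or_mh), 4))

# Cross-check with the built-in test (no continuity correction)
mh <- mantelhaen.test(strata, correct = FALSE)
print(mh$statistic)
print(mh$estimate)
```

Here the statistic is far below the 5% critical value of 3.84 and the common odds ratio is close to 1, so after stratifying by age these data show no clear association between the exposure and OCAD.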
119955 1198991199031 1198991199032 1198991199033 ⋯ 119899119903119888 119899119903
합계 1198991 1198992 1198993 ⋯ 119899119888 119899
bull예제 1131
치료방법(treatment)
재발여부(relapse)합계(total)
Yes No
A 294 (77255) 921 (1137745) 1215
B 98 (188210) 2862 (2771790) 2960
C 50 (198002) 3064 (2915998) 3114
D 203 (181533) 2652 (2673467) 2855
합계 645 9499 10144
1205682 = 119874 minus 119864 2
119864
=294 minus 77255 2
77255+921 minus 1137745 2
1137745+⋯ = 81641
gt 1198832(095 119889119891 = 3) = 7815df= 119903 minus 1 119888 minus 1 = 4 minus 1 2 minus 1 = 3
Reject (Ho treatment and relapse are independent ) -gt They are not independent
gt datalt-astable(cbind(c(2949850203)c(921286230642652)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(YN))
gt data
re
trt Y N
A 294 921
B 98 2862
C 50 3064
D 203 2652
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 81641 df = 3 p-value lt 22e-16
data reinput trt $ re $ count cardsA Y 294 A N 921 B Y 98 B N 2862C Y 50 C N 3064D Y 203 D N 2652proc freq data=reweight counttables trtremeasures chisqrun
bull 작은 기대도수 (small expected freq)기대치 5미만의 cell수가 전체 20를 넘지 않으며 최소기대치가 1이상이면 무관하다 (If min gt1 and cells lt5 are less than 20 then not a problem)
bull 2Ⅹ2 분할표 (table)nlt20 or 20ltnlt49 그리고 기대도수 5이하 일경우에는 -test를 하지 말라
-test is not valid if nlt20 or (20ltnlt49) and expected freq of one or more cells lt 5
2
2
bull2Ⅹ2 table
1205682 =233(131∙36minus52∙14)2
145∙88∙183∙50= 317391 gtgt 1962
Strong evidence to reject (HoSmoking and drinking are independent)
두번째분류기준
첫번째 분류기준
120783 120784 합계
120783 119886 119887 119886 + 119887
120784 119888 119889 119888 + 119889
합계 119886 + 119888 119887 + 119889 119899
SmokingDrinking
Yes No total
Yes 131 52 183
No 14 36 50
Total 145 88 233
②두 집단의 확률에 대한 비교
(Comparing two probabilities)
1 1 2
0 1 2 1 2
1 1 2
1 2
ˆ100 60 120 040
60 100 40 12004909
100 120
060 040295469 196 significant
4903 5091 4903 5091
100 120
ˆ ( )
(1 ) (1 )
a
n p n
p
Z
H p p H p p
p p pZ
p p p p
n n
e g
2p
2p
Yates adjustment (보정)
bull 120568corrected2 =
119899( 119886119889minus119887119888 minus05119899)2
(119886+119888)(119887+119889)(119886+119887)(119888+119889)
bull 1205682 =233( 131∙36minus52∙14 minus05∙233)2
145∙88∙183∙50= 299118
114 동질성 검정 (homogeneity test)
bull 동질성 검정 각 각의 모집단에서 독립적으로 뽑은 표본들의 분포가 서로 동질의 것인가
bull Homogeneity test Are two samples selected from one population
bull 독립성 검정 한 모집단에서 표본 추출 행과 열의 합계는 조절이 아니고 우연히 나타난다
bull Independent test selected from a population Marginal totals are randomly determined
bull 독립성 검정 vs 동질성 검정
bull Independent test vs homogeneity test
bull예제 1141
bull 가설 Patient groups with on-set age lt=18 and age gt 18 have same distributions of family history
gt datalt-astable(cbind(c(28194153)c(35384460)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(EarlyLater))
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 36216 df = 3 p-value = 03053
-gt Do not reject Ho
0H
Family History lt=18 gt 18 Total
A 28 35 63
B 19 38 57
C 41 44 85
D 53 60 113
합계 141 177 318
gt datare
trt Early LaterA 28 35B 19 38C 41 44D 53 60
동질성 검정과 모비율 검정
1198670 1199011 = 1199012 119907119904 119867119860 ∶ 1199011 ne 1199012 1198991 = 100 1199011 = 060 1198992 = 120 1199012 = 040
119911 = 1199011minus 1199012 minus( 1199011minus 1199012)0 119901(1minus 119901)
1198991+ 119901(1minus 119901)
1198992
119901 =060∙100+040∙120
100+120=108
220= 049091
119911 =060 minus 040
049091 ∙ 050909100 +
049091 ∙ 050909120
= 295468
1205682 =220 ∙ [60 ∙ 72 minus 40 ∙ 48]2
108 ∙ 112 ∙ 100 ∙ 120= 87302
-gt Reject Ho
표본특성
1 2 합계1 60 40 100
2 48 72 120
합계 108 112 220
data severe
input treat $ outcome $ count
cards
Test f 10
Test u 2
Control f 2
Control u 4
proc freq order=data
tables treatoutcome chisq nocol
weight count
run
Fisherrsquos Exact Test
SAS 시스템
FREQ 프로시저
treat outcome 교차표
treat outcome
빈도|백분율|
행 백분율|f |u | 총합-----------+--------+--------+
Test | 10 | 2 | 12
| 5556 | 1111 | 6667
| 8333 | 1667 |
-----------+--------+--------+
Control | 2 | 4 | 6
| 1111 | 2222 | 3333
| 3333 | 6667 |
-----------+--------+--------+
총합 12 6 18
6667 3333 10000
treat outcome 테이블에 대한 통계량
통계량 자유도 값 확률값----------------------------------------------------------카이제곱 1 45000 00339우도비 카이제곱 1 44629 00346연속성 수정 카이제곱 1 25313 01116Mantel-Haenszel 카이제곱 1 42500 00393파이 계수 05000분할 계수 04472크래머의 V 05000
경고 셀들의 75가 5보다 작은 기대도수를 가지고 있습니다카이제곱 검정은 올바르지 않을 수 있습니다
Fisher의 정확 검정----------------------------(11) 셀 빈도(F) 10하단측 p값 Pr lt= F 09961상단측 p값 Pr gt= F 00573
테이블 확률 (P) 00533양측 p값 Pr lt= P 01070
표본 크기 = 18
Exact Test
Table Cell
(11) (12) (21) (22) Prob
12 0 0 6 0001
11 1 1 5 0039
10 2 2 4 0533
9 3 3 3 2370
8 4 4 2 4000
7 5 5 1 2560
6 6 6 0 0498
=12 12 6 6
10 2 2 4 18
Table Probabilities
bull One-tailed p-value
bull Two-tailed p-value
00533 00039 00001 00573p
00533 00039 00001 00498 01071p
H0 두 변수는 서로 독립(동질)이다 vs H1 not H0
gt fishertest(matrix(c(7356)22)alternative=greater)
Fishers Exact Test for Count Data
data matrix(c(7 3 5 6) 2 2)
p-value = 02449
alternative hypothesis true odds ratio is greater than 1
95 percent confidence interval
04512625 Inf
sample estimates
odds ratio
2661251
gt matrix(c(7356)22)
[1] [2]
[1] 7 5
[2] 3 6
McNemar Test Matched pairs
data one
input hus_resp $ wif_resp $ no
datalines
yes yes 20
yes no 5
no yes 10
no no 10
run
proc freq
tables hus_respwif_resp agree
weight no
run
ldquoHo husband and wife 의 approval rates는 같다rdquo를 기각하지 못함
We do not reject ldquoHo approval rates of husband and wife are the samerdquo
신뢰구간이 0을 포함하지 않으므로 K=0 이라는 귀무가설을 95 신뢰수준에서 기각한다
Kappa=1 gtgt perfect agreement Kappa gt 08 gtgt excellent agreement Kappa gt 04 gtgt moderate agreement
CI does not include 0 -gt we reject the null hypo of K=0 by 95 confidence level
116 Relative risk odds ratio and Mantel-Haenszel statistics
bull 관찰연구 (observational study)
bull 전향적 연구 (prospective study)
bull 후향적 연구 (retrospective study)
상대위험도 (Relative risk)Disease
Risk O X
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull 119877119877 =119886
119886+119887119888
119888+119889
bull s e ln 119877119877 =1
119886+1
119888minus
1
119886+119887+
1
119888+119889
bull ln 119877119877 plusmn 1199111minus1205722 ∙ s e ln 119877119877
bull 119890ln 119877119877 plusmn119911
1minus1205722∙se ln 119877119877
= 119877119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119877119877
예제 1161 Relative risk odds ratio and Mantel-Haenszel statistics
119877119877 =26406
1783564=0064
0050= 128
12822 ∙ 119890^ minus196 ∙1
26+
1
178minus
1
406+
1
3564= 0861
12822 ∙ 119890^ 196 ∙1
26+
1
178minus
1
406+
1
3564= 1910
95 CI includes 1
Smoking Disease progress
Yes NoYes 26 380 406
No 178 3386 3564
204 3766 3970
표1162
data preg
input smoke $ preg $ count
cards
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
proc freq order=data
weight count
tables smokepregmeasures chisq
run
오즈비 (Odds Ratio)sample
Risk case control
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull Odds = p(1-p)
bull 환자집단 오즈 [119886(119886 + 119888)][119888(119886 + 119888)] = 119886119888
bull 정상집단 오즈 [119887(119887 + 119889)][119889(119887 + 119889)] = 119887119889
bull 119874119877 =119886
119888119887
119889
=119886119889
119887119888s e ln 119874119877 =
1
119886+1
119887+1
119888+1
119889
CI ln 119874119877 plusmn 1199111minus1205722 ∙ s e ln 119874119877
bull 119890ln 119874119877 plusmn119911
1minus1205722∙se ln OR
=
= 119874119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119874119877
예제 1162
119874119877 =52 ∙ 3486
352 ∙ 78= 660
66023 ∙ 119890^ minus196 ∙1
52+
1
352+1
78+
1
3486= 4571
66023 ∙ 119890^ 196 ∙1
52+
1
352+1
78+
1
3486= 9536
Smoking
Obesity
Yes No
Yes 52 352 404
No 78 3486 3564
130 3838 3968
표1164
data obe
input smoke $ case $ count
cards
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
proc freq data=obe
weight count
tables smokecasemeasures
run
Mantel-Haenszel 통계량
교란변수가 k개의 층
층 i의 기대도수 119890119894=119886119894+119887119894 119886119894+119888119894
119899119894
119907119894 =119886119894 + 119887119894 119888119894 + 119889119894 119886119894 + 119888119894 119887119894 + 119889119894
1198991198942(119899119894 minus 1)
1205941198721198672 =
( 119894=1119896 119886119894 minus 119894=1
119896 119890119894)2
119894=1119896 119907119894
~12059412 119874119877119872119867 =
119894=1119896 (119886119894119889119894119899119894)
119894=1119896 (119887119894119888119894119899119894)
공통오즈비
Risk
Strata =i
case control
Exposed 119886119894 119887119894 119886119894 + 119887119894
Not 119888119894 119889119894 119888119894 + 119889119894
119886119894 + 119888119894 119887119894 + 119889119894 119899119894
예제 1163
Age lt= 55
Risk OCAD patients control Total
Exposed 21 11 32
Unexposed 16 6 22
합계 37 17 54
Age gt= 56
Risk OCAD 환자 정상 합계
Exposed 50 14 64
Unexposed 18 6 24
68 20 88
data one
input age risk case n
cards
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
run
proc freq
tables ageriskcasemeasures CMH
weight n
run
계급구간(interval)
표준화된 계급구간(standardized interval)
상대도수의 기대치(relative frequency)
기대도수(expected frequency)
lt1 001069 0502651~ 59 minus230104 003136 14740
6~109minus172731 008228 38672
11~159minus115357 015667 73637
16~209minus057984 021655 10178
21~259minus000610 021729 10213
26~309056763 015828 74393
31~359 114137 008370 3933736~409 171510 003212 15096
ge41 228884 001104 051909
19766
202877
P(Zltminus230104)
P(-230ltZltminus173)
P(172ltZlt229)
P(Zgt229)
119874119894 minus 1198641198942119864119894 =14762 gt qchisq(0055lowertail=F)= 11071
-gt Reject Ho data ~ normal
constraints ( 119864119894 = 119874119894 120583 = 119883 120590 = 119904)=3 -gt df=8minus3=5
계급구간 관측도수(119926119946) 기대도수(119916119946)119926119946 minus119916119946
120784119916119946
lt 1 0 050265 2760810-4
1-59 2 147406-109 2 38672
090156
11-159 7 73637001796
16-209 19 10178 7646621-259 4 10213 3779426-309 6 74393 02784831-359 3 39337 022162
36-4094 15096 19156
ge 41 0 051909Total 47 47 14762
4
197662
20287
보기 1122 이항분포 (binomial distrsquon)
H0 자료는 이항분포를 따른다 (적합도검정)
각 의사별 신약을 선호하는 환자의 수 의사의 수 환자의 수
0 5 0
1 6 6
2 8 16
3 10 30
4 10 40
5 15 75
6 17 102
7 10 70
8 10 80
9 9 81
10 이상 0 0
합 100 500
이항분포의 가정하에서 기대도수=기대상대도수총합
Expected freq under binomial distrsquon=probtotal
2525
( ) (1 ) 012 25
ˆ 500 2500 02
x xP X x p p xx
p
각 의사별 신약을 선호하는 환자의 수 의사의 수(119926119946) 기대 상대도수 기대도수(119916119946)
0 5 000378 037779
1 6 002361 23612
2 8 007083 70836
3 10 013577 13577
4 10 018668 18668
5 15 019602 19602
6 17 016335 16335
7 10 011084 11084
8 10 006235 62349
9 9 002944 29442
10 이상 0 001733 17332
합계 100 10000 10000
11 27390
1205682 =11minus27390 2
27390+8minus70836 2
70836+⋯+
0minus17332 2
17332= 47678 gt qchisq(00058lowertail=F)= 21955
We reject Ho Data ~ Binomial
df= 10 minus 2 = 8 constraints 119864119894 = 119874119894 119901 = 119901
예제 1123 포아슨분포 (Poisson distrsquon)
포아슨분포의 가정 하에서 상대도수의 기대치
Expected relative freq under Poisson distrsquon
(X ) 012
xeP x x
x
=3 known
H0 병원의 하루 응급환자의 수는 포아송 분포를 따른다
일일 응급환자 수 날짜 수
0 5
1 14
2 15
3 23
4 16
5 9
6 3
7 3
8 1
9 1
10 이상 0
합계 90
응급환자수 날짜 수(119926119946) 기대 상대도수 기대도수(119916119946)119926119946 minus 119916119946
120784
119916119946
0 5 004979 44808 006015
1 14 014936 13443 002312
2 15 022404 20164 132240
3 23 022404 20164 039895
4 16 016803 15123 005088
5 9 010082 90737 000060
6 3 005041 45368 052060
7 3 002160 19444 057313
8 1 000810 072914
0804829 1 000270 024305
10 이상 0 000110 009922
합계 90 1000 9000 3755
107142
1205682 = 119874119894minus119864119894
2
119864119894=5minus44808 2
44808+⋯+
2minus10714 2
10714= 3755 lt 1198832(095 119889119891 = 9 minus 1 = 8) = 15507
We cannot reject Ho Data ~ Poisson
2 22 2
9 1
(5 450) (2 108)3664 15557 (095)
450 108
113 독립성검정Tests of independence
bull 분할표(contingency table)
1205682 =
119894=1
119903
119895=1
119888119874119894119895 minus 119864119894119895
2
119864119894119895~1205942 119889119891 = 119903 minus 1 119888 minus 1 119864119894119895=
119899119894 ∙ 119899119895119899
두 번째 범주형 변수 첫 번째 범주형 변수
120783 120784 120785 ⋯ 119940 합계
120783 11989911 11989912 11989913 ⋯ 1198991119888 1198991
120784 11989921 11989922 11989923 ⋯ 1198992119888 1198992
120785 11989931 11989932 11989933 ⋯ 1198993119888 1198993
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
119955 1198991199031 1198991199032 1198991199033 ⋯ 119899119903119888 119899119903
합계 1198991 1198992 1198993 ⋯ 119899119888 119899
bull예제 1131
치료방법(treatment)
재발여부(relapse)합계(total)
Yes No
A 294 (77255) 921 (1137745) 1215
B 98 (188210) 2862 (2771790) 2960
C 50 (198002) 3064 (2915998) 3114
D 203 (181533) 2652 (2673467) 2855
합계 645 9499 10144
1205682 = 119874 minus 119864 2
119864
=294 minus 77255 2
77255+921 minus 1137745 2
1137745+⋯ = 81641
gt 1198832(095 119889119891 = 3) = 7815df= 119903 minus 1 119888 minus 1 = 4 minus 1 2 minus 1 = 3
Reject (Ho treatment and relapse are independent ) -gt They are not independent
gt datalt-astable(cbind(c(2949850203)c(921286230642652)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(YN))
gt data
re
trt Y N
A 294 921
B 98 2862
C 50 3064
D 203 2652
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 81641 df = 3 p-value lt 22e-16
data reinput trt $ re $ count cardsA Y 294 A N 921 B Y 98 B N 2862C Y 50 C N 3064D Y 203 D N 2652proc freq data=reweight counttables trtremeasures chisqrun
bull 작은 기대도수 (small expected freq)기대치 5미만의 cell수가 전체 20를 넘지 않으며 최소기대치가 1이상이면 무관하다 (If min gt1 and cells lt5 are less than 20 then not a problem)
bull 2Ⅹ2 분할표 (table)nlt20 or 20ltnlt49 그리고 기대도수 5이하 일경우에는 -test를 하지 말라
-test is not valid if nlt20 or (20ltnlt49) and expected freq of one or more cells lt 5
2
2
bull2Ⅹ2 table
1205682 =233(131∙36minus52∙14)2
145∙88∙183∙50= 317391 gtgt 1962
Strong evidence to reject (HoSmoking and drinking are independent)
두번째분류기준
첫번째 분류기준
120783 120784 합계
120783 119886 119887 119886 + 119887
120784 119888 119889 119888 + 119889
합계 119886 + 119888 119887 + 119889 119899
SmokingDrinking
Yes No total
Yes 131 52 183
No 14 36 50
Total 145 88 233
②두 집단의 확률에 대한 비교
(Comparing two probabilities)
1 1 2
0 1 2 1 2
1 1 2
1 2
ˆ100 60 120 040
60 100 40 12004909
100 120
060 040295469 196 significant
4903 5091 4903 5091
100 120
ˆ ( )
(1 ) (1 )
a
n p n
p
Z
H p p H p p
p p pZ
p p p p
n n
e g
2p
2p
Yates adjustment (보정)
bull 120568corrected2 =
119899( 119886119889minus119887119888 minus05119899)2
(119886+119888)(119887+119889)(119886+119887)(119888+119889)
bull 1205682 =233( 131∙36minus52∙14 minus05∙233)2
145∙88∙183∙50= 299118
114 동질성 검정 (homogeneity test)
bull 동질성 검정 각 각의 모집단에서 독립적으로 뽑은 표본들의 분포가 서로 동질의 것인가
bull Homogeneity test Are two samples selected from one population
bull 독립성 검정 한 모집단에서 표본 추출 행과 열의 합계는 조절이 아니고 우연히 나타난다
bull Independent test selected from a population Marginal totals are randomly determined
bull 독립성 검정 vs 동질성 검정
bull Independent test vs homogeneity test
bull예제 1141
bull 가설 Patient groups with on-set age lt=18 and age gt 18 have same distributions of family history
gt datalt-astable(cbind(c(28194153)c(35384460)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(EarlyLater))
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 36216 df = 3 p-value = 03053
-gt Do not reject Ho
0H
Family History lt=18 gt 18 Total
A 28 35 63
B 19 38 57
C 41 44 85
D 53 60 113
합계 141 177 318
gt datare
trt Early LaterA 28 35B 19 38C 41 44D 53 60
동질성 검정과 모비율 검정
1198670 1199011 = 1199012 119907119904 119867119860 ∶ 1199011 ne 1199012 1198991 = 100 1199011 = 060 1198992 = 120 1199012 = 040
119911 = 1199011minus 1199012 minus( 1199011minus 1199012)0 119901(1minus 119901)
1198991+ 119901(1minus 119901)
1198992
119901 =060∙100+040∙120
100+120=108
220= 049091
119911 =060 minus 040
049091 ∙ 050909100 +
049091 ∙ 050909120
= 295468
1205682 =220 ∙ [60 ∙ 72 minus 40 ∙ 48]2
108 ∙ 112 ∙ 100 ∙ 120= 87302
-gt Reject Ho
표본특성
1 2 합계1 60 40 100
2 48 72 120
합계 108 112 220
data severe
input treat $ outcome $ count
cards
Test f 10
Test u 2
Control f 2
Control u 4
proc freq order=data
tables treatoutcome chisq nocol
weight count
run
Fisherrsquos Exact Test
SAS 시스템
FREQ 프로시저
treat outcome 교차표
treat outcome
빈도|백분율|
행 백분율|f |u | 총합-----------+--------+--------+
Test | 10 | 2 | 12
| 5556 | 1111 | 6667
| 8333 | 1667 |
-----------+--------+--------+
Control | 2 | 4 | 6
| 1111 | 2222 | 3333
| 3333 | 6667 |
-----------+--------+--------+
총합 12 6 18
6667 3333 10000
treat outcome 테이블에 대한 통계량
통계량 자유도 값 확률값----------------------------------------------------------카이제곱 1 45000 00339우도비 카이제곱 1 44629 00346연속성 수정 카이제곱 1 25313 01116Mantel-Haenszel 카이제곱 1 42500 00393파이 계수 05000분할 계수 04472크래머의 V 05000
경고 셀들의 75가 5보다 작은 기대도수를 가지고 있습니다카이제곱 검정은 올바르지 않을 수 있습니다
Fisher의 정확 검정----------------------------(11) 셀 빈도(F) 10하단측 p값 Pr lt= F 09961상단측 p값 Pr gt= F 00573
테이블 확률 (P) 00533양측 p값 Pr lt= P 01070
표본 크기 = 18
Exact Test
Table Cell
(11) (12) (21) (22) Prob
12 0 0 6 0001
11 1 1 5 0039
10 2 2 4 0533
9 3 3 3 2370
8 4 4 2 4000
7 5 5 1 2560
6 6 6 0 0498
=12 12 6 6
10 2 2 4 18
Table Probabilities
bull One-tailed p-value
bull Two-tailed p-value
00533 00039 00001 00573p
00533 00039 00001 00498 01071p
H0 두 변수는 서로 독립(동질)이다 vs H1 not H0
gt fishertest(matrix(c(7356)22)alternative=greater)
Fishers Exact Test for Count Data
data matrix(c(7 3 5 6) 2 2)
p-value = 02449
alternative hypothesis true odds ratio is greater than 1
95 percent confidence interval
04512625 Inf
sample estimates
odds ratio
2661251
gt matrix(c(7356)22)
[1] [2]
[1] 7 5
[2] 3 6
McNemar Test Matched pairs
data one
input hus_resp $ wif_resp $ no
datalines
yes yes 20
yes no 5
no yes 10
no no 10
run
proc freq
tables hus_respwif_resp agree
weight no
run
ldquoHo husband and wife 의 approval rates는 같다rdquo를 기각하지 못함
We do not reject ldquoHo approval rates of husband and wife are the samerdquo
신뢰구간이 0을 포함하지 않으므로 K=0 이라는 귀무가설을 95 신뢰수준에서 기각한다
Kappa=1 gtgt perfect agreement Kappa gt 08 gtgt excellent agreement Kappa gt 04 gtgt moderate agreement
CI does not include 0 -gt we reject the null hypo of K=0 by 95 confidence level
116 Relative risk odds ratio and Mantel-Haenszel statistics
bull 관찰연구 (observational study)
bull 전향적 연구 (prospective study)
bull 후향적 연구 (retrospective study)
상대위험도 (Relative risk)Disease
Risk O X
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull 119877119877 =119886
119886+119887119888
119888+119889
bull s e ln 119877119877 =1
119886+1
119888minus
1
119886+119887+
1
119888+119889
bull ln 119877119877 plusmn 1199111minus1205722 ∙ s e ln 119877119877
bull 119890ln 119877119877 plusmn119911
1minus1205722∙se ln 119877119877
= 119877119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119877119877
예제 1161 Relative risk odds ratio and Mantel-Haenszel statistics
119877119877 =26406
1783564=0064
0050= 128
12822 ∙ 119890^ minus196 ∙1
26+
1
178minus
1
406+
1
3564= 0861
12822 ∙ 119890^ 196 ∙1
26+
1
178minus
1
406+
1
3564= 1910
95 CI includes 1
Smoking Disease progress
Yes NoYes 26 380 406
No 178 3386 3564
204 3766 3970
표1162
data preg
input smoke $ preg $ count
cards
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
proc freq order=data
weight count
tables smokepregmeasures chisq
run
오즈비 (Odds Ratio)sample
Risk case control
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull Odds = p(1-p)
bull 환자집단 오즈 [119886(119886 + 119888)][119888(119886 + 119888)] = 119886119888
bull 정상집단 오즈 [119887(119887 + 119889)][119889(119887 + 119889)] = 119887119889
bull 119874119877 =119886
119888119887
119889
=119886119889
119887119888s e ln 119874119877 =
1
119886+1
119887+1
119888+1
119889
CI ln 119874119877 plusmn 1199111minus1205722 ∙ s e ln 119874119877
bull 119890ln 119874119877 plusmn119911
1minus1205722∙se ln OR
=
= 119874119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119874119877
예제 1162
119874119877 =52 ∙ 3486
352 ∙ 78= 660
66023 ∙ 119890^ minus196 ∙1
52+
1
352+1
78+
1
3486= 4571
66023 ∙ 119890^ 196 ∙1
52+
1
352+1
78+
1
3486= 9536
Smoking
Obesity
Yes No
Yes 52 352 404
No 78 3486 3564
130 3838 3968
표1164
data obe
input smoke $ case $ count
cards
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
proc freq data=obe
weight count
tables smokecasemeasures
run
Mantel-Haenszel 통계량
교란변수가 k개의 층
층 i의 기대도수 119890119894=119886119894+119887119894 119886119894+119888119894
119899119894
119907119894 =119886119894 + 119887119894 119888119894 + 119889119894 119886119894 + 119888119894 119887119894 + 119889119894
1198991198942(119899119894 minus 1)
1205941198721198672 =
( 119894=1119896 119886119894 minus 119894=1
119896 119890119894)2
119894=1119896 119907119894
~12059412 119874119877119872119867 =
119894=1119896 (119886119894119889119894119899119894)
119894=1119896 (119887119894119888119894119899119894)
공통오즈비
Risk
Strata =i
case control
Exposed 119886119894 119887119894 119886119894 + 119887119894
Not 119888119894 119889119894 119888119894 + 119889119894
119886119894 + 119888119894 119887119894 + 119889119894 119899119894
예제 1163
Age lt= 55
Risk OCAD patients control Total
Exposed 21 11 32
Unexposed 16 6 22
합계 37 17 54
Age gt= 56
Risk OCAD 환자 정상 합계
Exposed 50 14 64
Unexposed 18 6 24
68 20 88
data one
input age risk case n
cards
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
run
proc freq
tables ageriskcasemeasures CMH
weight n
run
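Example 11.2.2 Binomial distribution
H0: the data follow a binomial distribution (goodness-of-fit test)

Patients preferring the new drug (per doctor)   Number of doctors   Number of patients
0                                               5                   0
1                                               6                   6
2                                               8                   16
3                                               10                  30
4                                               10                  40
5                                               15                  75
6                                               17                  102
7                                               10                  70
8                                               10                  80
9                                               9                   81
10 or more                                      0                   0
Total                                           100                 500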
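Under the binomial assumption, expected frequency = expected relative frequency × total, with n = 25 patients per doctor (100 × 25 = 2500 patients in all):
$$P(X = x) = \binom{25}{x} p^{x} (1-p)^{25-x}, \qquad x = 0, 1, 2, \ldots, 25$$
$$\hat{p} = 500 / 2500 = 0.2$$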
Patients preferring the new drug (per doctor)   Number of doctors (O_i)   Expected relative frequency   Expected frequency (E_i)
0                                               5                         0.00378                       0.37779
1                                               6                         0.02361                       2.3612
2                                               8                         0.07083                       7.0836
3                                               10                        0.13577                       13.577
4                                               10                        0.18668                       18.668
5                                               15                        0.19602                       19.602
6                                               17                        0.16335                       16.335
7                                               10                        0.11084                       11.084
8                                               10                        0.06235                       6.2349
9                                               9                         0.02944                       2.9442
10 or more                                      0                         0.01733                       1.7332
Total                                           100                       1.0000                        100.00

Categories 0 and 1 are pooled: O = 11, E = 2.7390.
$$\chi^2 = \frac{(11-2.7390)^2}{2.7390} + \frac{(8-7.0836)^2}{7.0836} + \cdots + \frac{(0-1.7332)^2}{1.7332} = 47.678$$
47.678 > qchisq(0.005, 8, lower.tail=F) = 21.955
df = 10 − 2 = 8 (constraints: $\sum E_i = \sum O_i$ and $p = \hat{p}$)
We reject H0: the data do not follow a binomial distribution.
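The arithmetic of Example 11.2.2 can be re-traced with a short script. This is a minimal sketch in Python (standard library only), not the slides' own R or SAS code; the variable names are illustrative, and the critical value 21.955 is copied from the slide rather than computed.

```python
from math import comb

# Observed number of doctors for 0, 1, ..., 9 and "10 or more" patients
# preferring the new drug (values from the table in Example 11.2.2).
observed = [5, 6, 8, 10, 10, 15, 17, 10, 10, 9, 0]
n_trials = 25                       # patients per doctor
n_doctors = sum(observed)           # 100 doctors

# Estimate p from the data: 500 preferences out of 2500 patients.
total_prefer = sum(x * o for x, o in enumerate(observed))
p_hat = total_prefer / (n_doctors * n_trials)


def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Expected relative frequencies; the last class is the open-ended tail.
rel_freq = [binom_pmf(x, n_trials, p_hat) for x in range(10)]
rel_freq.append(1.0 - sum(rel_freq))
expected = [n_doctors * r for r in rel_freq]

# Pool classes 0 and 1, as in the worked example.
obs_pooled = [observed[0] + observed[1]] + observed[2:]
exp_pooled = [expected[0] + expected[1]] + expected[2:]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 2            # constraints: equal totals, estimated p
critical = 21.955                   # qchisq(0.005, 8, lower.tail=F), from the slide

print(f"chi-square = {chi_sq:.3f}, df = {df}, reject H0: {chi_sq > critical}")
```

Pooling is done after the expected frequencies are computed, so the pooled class keeps the sum of the two original expectations.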
Example 11.2.3 Poisson distribution
Expected relative frequency under the Poisson assumption:
$$P(X = x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots$$
$\lambda = 3$ (known)
H0: the number of emergency patients per day at the hospital follows a Poisson distribution

Emergency patients per day   Number of days
0                            5
1                            14
2                            15
3                            23
4                            16
5                            9
6                            3
7                            3
8                            1
9                            1
10 or more                   0
Total                        90
Emergency patients   Days (O_i)   Expected relative frequency   Expected frequency (E_i)   (O_i − E_i)²/E_i
0                    5            0.04979                       4.4808                     0.06015
1                    14           0.14936                       13.443                     0.02312
2                    15           0.22404                       20.164                     1.32240
3                    23           0.22404                       20.164                     0.39895
4                    16           0.16803                       15.123                     0.05088
5                    9            0.10082                       9.0737                     0.00060
6                    3            0.05041                       4.5368                     0.52060
7                    3            0.02160                       1.9444                     0.57313
8                    1            0.00810                       0.72914
9                    1            0.00270                       0.24305
10 or more           0            0.00110                       0.09922
Total                90           1.000                         90.00                      3.755

Classes 8, 9 and 10 or more are pooled: O = 2, E = 1.07142, contribution 0.80482.
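$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(5-4.4808)^2}{4.4808} + \cdots + \frac{(2-1.0714)^2}{1.0714} = 3.755 < \chi^2(0.95,\ df = 9-1 = 8) = 15.507$$
We cannot reject H0: the data are consistent with a Poisson distribution.

The same calculation with more coarsely rounded expected frequencies gives
$$\chi^2 = \frac{(5-4.50)^2}{4.50} + \cdots + \frac{(2-1.08)^2}{1.08} = 3.664 < \chi^2_{9-1}(0.95) = 15.507$$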
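The Poisson goodness-of-fit calculation of Example 11.2.3 can be checked the same way. The sketch below is in Python (standard library only) rather than the R or SAS used on the slides; names are illustrative, and the critical value 15.507 is taken from the slide.

```python
from math import exp, factorial

# Days with 0, 1, ..., 9 and "10 or more" emergency patients (Example 11.2.3).
observed = [5, 14, 15, 23, 16, 9, 3, 3, 1, 1, 0]
lam = 3.0                           # Poisson mean, treated as known
n_days = sum(observed)              # 90 days


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)


# Expected relative frequencies; the last class is the open-ended tail.
rel_freq = [poisson_pmf(x, lam) for x in range(10)]
rel_freq.append(1.0 - sum(rel_freq))
expected = [n_days * r for r in rel_freq]

# Pool the classes 8, 9 and "10 or more", as in the worked example.
obs_pooled = observed[:8] + [sum(observed[8:])]
exp_pooled = expected[:8] + [sum(expected[8:])]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 1            # only constraint: equal totals (lambda is known)
critical = 15.507                   # chi-square(0.95, df=8), from the slide

print(f"chi-square = {chi_sq:.3f}, df = {df}, reject H0: {chi_sq > critical}")
```

Because λ is given rather than estimated, only one degree of freedom is lost; estimating λ from the data would remove one more.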
11.3 Tests of Independence
• Contingency table
$$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \sim \chi^2\big(df = (r-1)(c-1)\big), \qquad E_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n}$$

First categorical   Second categorical variable
variable            1      2      3      ...    c      Total
1                   n11    n12    n13    ...    n1c    n1.
2                   n21    n22    n23    ...    n2c    n2.
3                   n31    n32    n33    ...    n3c    n3.
...                 ...    ...    ...    ...    ...    ...
r                   nr1    nr2    nr3    ...    nrc    nr.
Total               n.1    n.2    n.3    ...    n.c    n

Here n_i. is the total of row i and n_.j is the total of column j.
• Example 11.3.1
Observed frequencies, with expected frequencies in parentheses:

Treatment   Relapse: Yes      Relapse: No         Total
A           294 (77.255)      921 (1137.745)      1215
B           98 (188.210)      2862 (2771.790)     2960
C           50 (198.002)      3064 (2915.998)     3114
D           203 (181.533)     2652 (2673.467)     2855
Total       645               9499                10144

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(294 - 77.255)^2}{77.255} + \frac{(921 - 1137.745)^2}{1137.745} + \cdots = 816.41 > \chi^2(0.95,\ df = 3) = 7.815$$
df = (r − 1)(c − 1) = (4 − 1)(2 − 1) = 3
Reject H0 (treatment and relapse are independent) -> treatment and relapse are not independent.
> data <- as.table(cbind(c(294, 98, 50, 203), c(921, 2862, 3064, 2652)))
> dimnames(data) <- list(trt = c("A", "B", "C", "D"), re = c("Y", "N"))
> data
   re
trt   Y    N
  A 294  921
  B  98 2862
  C  50 3064
  D 203 2652
> chisq.test(data)

	Pearson's Chi-squared test

data:  data
X-squared = 816.41, df = 3, p-value < 2.2e-16
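/* Same test in SAS: one record per treatment-by-relapse cell */
data re;
  input trt $ re $ count;
  cards;
A Y 294
A N 921
B Y 98
B N 2862
C Y 50
C N 3064
D Y 203
D N 2652
;
proc freq data=re;
  weight count;                      /* count holds the cell frequency */
  tables trt*re / measures chisq;
run;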
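To make the expected-frequency formula E_ij = n_i. n_.j / n concrete, the sketch below recomputes Example 11.3.1 by hand. It is written in Python (standard library only) as an illustration alongside the R and SAS runs; the names are illustrative, and the critical value 7.815 is taken from the slide.

```python
# Observed counts from Example 11.3.1: rows are treatments A-D,
# columns are relapse Yes / No.
observed = [
    [294, 921],
    [98, 2862],
    [50, 3064],
    [203, 2652],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# E_ij = (total of row i) * (total of column j) / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)
critical = 7.815                    # chi-square(0.95, df=3), from the slide

print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H0: {chi_sq > critical}")
```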
• Small expected frequencies: the χ² approximation is acceptable if no more than 20% of the cells have an expected frequency below 5 and the smallest expected frequency is at least 1.
• 2×2 table: the χ²-test is not valid if n < 20, or if 20 < n < 40 and one or more cells have an expected frequency below 5.
• 2×2 table

First classification   Second classification criterion
criterion              1       2       Total
1                      a       b       a + b
2                      c       d       c + d
Total                  a + c   b + d   n

$$\chi^2 = \frac{n(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$$

Smoking   Drinking: Yes   Drinking: No   Total
Yes       131             52             183
No        14              36             50
Total     145             88             233

$$\chi^2 = \frac{233\,(131\cdot 36 - 52\cdot 14)^2}{145\cdot 88\cdot 183\cdot 50} = 31.7391 \gg 1.96^2$$
Strong evidence to reject H0 (smoking and drinking are independent).
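② Comparing two probabilities
$$H_0: p_1 = p_2 \quad \text{vs} \quad H_a: p_1 \neq p_2$$
$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n_1} + \dfrac{\hat{p}(1-\hat{p})}{n_2}}}, \qquad \hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$
e.g. $n_1 = 100$, $\hat{p}_1 = 0.60$, $n_2 = 120$, $\hat{p}_2 = 0.40$:
$$\hat{p} = \frac{0.60\cdot 100 + 0.40\cdot 120}{100 + 120} = 0.4909$$
$$Z = \frac{0.60 - 0.40}{\sqrt{\dfrac{0.4909\cdot 0.5091}{100} + \dfrac{0.4909\cdot 0.5091}{120}}} = 2.95469 > 1.96 \quad \text{(significant)}$$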
Yates adjustment (continuity correction)
• $$\chi^2_{corrected} = \frac{n\,(|ad - bc| - 0.5n)^2}{(a+c)(b+d)(a+b)(c+d)}$$
• $$\chi^2_{corrected} = \frac{233\,(|131\cdot 36 - 52\cdot 14| - 0.5\cdot 233)^2}{145\cdot 88\cdot 183\cdot 50} = 29.9118$$
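The shortcut formula for a 2×2 table and its Yates-corrected version differ only in the numerator, which a small helper makes explicit. This is a sketch in Python (standard library only); the function name `chi_sq_2x2` is hypothetical, and clamping the corrected difference at zero is an added safeguard that the slide's formula does not mention.

```python
def chi_sq_2x2(a, b, c, d, yates=False):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction; never let the corrected difference go negative
        # (assumption added here, not stated on the slide).
        diff = max(diff - 0.5 * n, 0.0)
    return n * diff**2 / ((a + c) * (b + d) * (a + b) * (c + d))


# Smoking (rows) by drinking (columns) table from the slide.
plain = chi_sq_2x2(131, 52, 14, 36)
corrected = chi_sq_2x2(131, 52, 14, 36, yates=True)

print(f"uncorrected = {plain:.4f}, Yates-corrected = {corrected:.4f}")
```

The correction always pulls the statistic toward zero, so it makes the test more conservative.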
11.4 Homogeneity Test
• Homogeneity test: samples are drawn independently from separate populations. Are their distributions the same, as if the samples came from one population?
• Independence test: a single sample is drawn from one population. The row and column totals are not fixed in advance; they arise by chance.
• Independence test vs homogeneity test: the χ² statistic is computed in the same way; the tests differ in how the samples were drawn.
• Example 11.4.1
• H0: patient groups with onset age <= 18 and onset age > 18 have the same distribution of family history.

Family history   Onset <= 18   Onset > 18   Total
A                28            35           63
B                19            38           57
C                41            44           85
D                53            60           113
Total            141           177          318

> data <- as.table(cbind(c(28, 19, 41, 53), c(35, 38, 44, 60)))
> dimnames(data) <- list(trt = c("A", "B", "C", "D"), re = c("Early", "Later"))
> data
   re
trt Early Later
  A    28    35
  B    19    38
  C    41    44
  D    53    60
> chisq.test(data)

	Pearson's Chi-squared test

data:  data
X-squared = 3.6216, df = 3, p-value = 0.3053

-> Do not reject H0
Homogeneity test and test of two proportions

$H_0: p_1 = p_2$ vs $H_A: p_1 \neq p_2$, with $n_1 = 100$, $\hat{p}_1 = 0.60$, $n_2 = 120$, $\hat{p}_2 = 0.40$

Sample   Characteristic 1   Characteristic 2   Total
1        60                 40                 100
2        48                 72                 120
Total    108                112                220

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n_1} + \dfrac{\hat{p}(1-\hat{p})}{n_2}}}$$
$$\hat{p} = \frac{0.60\cdot 100 + 0.40\cdot 120}{100 + 120} = \frac{108}{220} = 0.49091$$
$$z = \frac{0.60 - 0.40}{\sqrt{\dfrac{0.49091\cdot 0.50909}{100} + \dfrac{0.49091\cdot 0.50909}{120}}} = 2.95468$$
$$\chi^2 = \frac{220\cdot[60\cdot 72 - 40\cdot 48]^2}{108\cdot 112\cdot 100\cdot 120} = 8.7302$$
-> Reject H0. The χ² statistic equals z² (2.95468² ≈ 8.730), so the two tests agree.
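The agreement between the pooled two-proportion z-test and the 2×2 χ² test can be verified numerically. The following Python sketch (standard library only, illustrative names) computes both statistics from the same counts.

```python
from math import sqrt

# Counts from the slide: sample 1 has 60 of 100, sample 2 has 48 of 120.
x1, n1 = 60, 100
x2, n2 = 48, 120

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)      # pooled estimate under H0: p1 = p2

# z statistic with the pooled standard error
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

# Chi-square shortcut formula for the corresponding 2x2 table
a, b, c, d = x1, n1 - x1, x2, n2 - x2
n = n1 + n2
chi_sq = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

print(f"z = {z:.5f}, z^2 = {z * z:.4f}, chi-square = {chi_sq:.4f}")
```

The identity holds only for the uncorrected χ² and the pooled standard error; a Yates correction or an unpooled standard error breaks the exact equality.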
Fisher's Exact Test

data severe;
  input treat $ outcome $ count;
  cards;
Test f 10
Test u 2
Control f 2
Control u 4
;
proc freq order=data;
  tables treat*outcome / chisq nocol;   /* chisq also prints Fisher's exact test for a 2x2 table */
  weight count;
run;
The SAS System
The FREQ Procedure

Table of treat by outcome

treat      outcome
Frequency |
Percent   |
Row Pct   | f      | u      | Total
----------+--------+--------+
Test      |     10 |      2 |     12
          |  55.56 |  11.11 |  66.67
          |  83.33 |  16.67 |
----------+--------+--------+
Control   |      2 |      4 |      6
          |  11.11 |  22.22 |  33.33
          |  33.33 |  66.67 |
----------+--------+--------+
Total           12        6       18
             66.67    33.33   100.00

Statistics for Table of treat by outcome

Statistic                     DF     Value     Prob
---------------------------------------------------
Chi-Square                     1    4.5000   0.0339
Likelihood Ratio Chi-Square    1    4.4629   0.0346
Continuity Adj. Chi-Square     1    2.5313   0.1116
Mantel-Haenszel Chi-Square     1    4.2500   0.0393
Phi Coefficient                     0.5000
Contingency Coefficient             0.4472
Cramer's V                          0.5000

WARNING: 75% of the cells have expected counts less than 5. Chi-Square may not be a valid test.

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)        10
Left-sided Pr <= F          0.9961
Right-sided Pr >= F         0.0573

Table Probability (P)       0.0533
Two-sided Pr <= P           0.1070

Sample Size = 18
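Exact Test: Table Probabilities

With all marginal totals fixed, the probability of the observed table is
$$P = \frac{12!\,12!\,6!\,6!}{10!\,2!\,2!\,4!\,18!} = 0.0533$$

Cell (1,1)   Cell (1,2)   Cell (2,1)   Cell (2,2)   Prob
12           0            0            6            0.0001
11           1            1            5            0.0039
10           2            2            4            0.0533
9            3            3            3            0.2370
8            4            4            2            0.4000
7            5            5            1            0.2560
6            6            6            0            0.0498

• One-tailed p-value: p = 0.0533 + 0.0039 + 0.0001 = 0.0573
• Two-tailed p-value: p = 0.0533 + 0.0039 + 0.0001 + 0.0498 = 0.1071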
H0: the two variables are independent (homogeneous) vs H1: not H0

> fisher.test(matrix(c(7, 3, 5, 6), 2, 2), alternative = "greater")

	Fisher's Exact Test for Count Data

data:  matrix(c(7, 3, 5, 6), 2, 2)
p-value = 0.2449
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 0.4512625       Inf
sample estimates:
odds ratio
  2.661251

> matrix(c(7, 3, 5, 6), 2, 2)
     [,1] [,2]
[1,]    7    5
[2,]    3    6
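The table probabilities behind Fisher's exact test are hypergeometric, so both p-values can be rebuilt from binomial coefficients. The sketch below is in Python (standard library only); the function names are hypothetical, and the two-tailed rule used here (sum every table no more probable than the observed one) is the rule implied by the SAS output "Two-sided Pr <= P".

```python
from math import comb


def table_prob(a, row1, row2, col1):
    """Probability that cell (1,1) equals a when all margins are fixed."""
    return comb(row1, a) * comb(row2, col1 - a) / comb(row1 + row2, col1)


def fisher_exact(a, b, c, d):
    """Return (upper one-tailed p, two-tailed p) for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {k: table_prob(k, row1, row2, col1) for k in range(lo, hi + 1)}
    p_obs = probs[a]
    upper = sum(p for k, p in probs.items() if k >= a)
    # Two-tailed: add up every table no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return upper, two_sided


upper, two_sided = fisher_exact(10, 2, 2, 4)    # Test/Control table from the SAS run
upper_r, _ = fisher_exact(7, 5, 3, 6)           # table used in the R call

print(f"SAS table: one-tailed = {upper:.4f}, two-tailed = {two_sided:.4f}")
print(f"R table:   one-tailed = {upper_r:.4f}")
```

Summing the rounded slide probabilities gives 0.1071 for the two-tailed value, while the unrounded sum agrees with the 0.1070 reported by SAS.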
McNemar Test: Matched pairs

data one;
  input hus_resp $ wif_resp $ no;
  datalines;
yes yes 20
yes no 5
no yes 10
no no 10
;
run;
proc freq;
  tables hus_resp*wif_resp / agree;   /* agree requests McNemar's test and kappa */
  weight no;
run;

We do not reject H0: the approval rates of husband and wife are the same.
Kappa: the confidence interval does not include 0, so the null hypothesis K = 0 is rejected at the 95% confidence level.
Kappa = 1: perfect agreement; Kappa > 0.8: excellent agreement; Kappa > 0.4: moderate agreement.
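The slides report only the conclusions of the AGREE output, so the sketch below fills in the point calculations. It is written in Python (standard library only) and assumes the standard definitions, which the slides do not spell out: McNemar's statistic (b − c)²/(b + c) on the discordant pairs with 1 degree of freedom, and Cohen's kappa (p_o − p_e)/(1 − p_e). Variable names are illustrative; the confidence interval for kappa is not reproduced.

```python
from math import erfc, sqrt

# Paired responses (husband, wife) from the example: counts of couples.
yes_yes, yes_no, no_yes, no_no = 20, 5, 10, 10
n = yes_yes + yes_no + no_yes + no_no

# McNemar statistic uses only the discordant pairs.
mcnemar = (yes_no - no_yes) ** 2 / (yes_no + no_yes)
# Upper-tail probability of a chi-square variable with 1 df:
# P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = erfc(sqrt(mcnemar / 2))

# Cohen's kappa: observed agreement corrected for chance agreement.
p_observed = (yes_yes + no_no) / n
husband_yes = (yes_yes + yes_no) / n
wife_yes = (yes_yes + no_yes) / n
p_chance = husband_yes * wife_yes + (1 - husband_yes) * (1 - wife_yes)
kappa = (p_observed - p_chance) / (1 - p_chance)

print(f"McNemar = {mcnemar:.4f}, p = {p_value:.4f}, kappa = {kappa:.4f}")
```

The p-value above 0.05 is consistent with not rejecting equal approval rates, while kappa measures agreement between spouses, which is a different question from equality of the marginal rates.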
11.6 Relative risk, odds ratio, and Mantel-Haenszel statistics
• Observational study
• Prospective study
• Retrospective study

Relative risk

              Disease
Risk factor   Present   Absent   Total
Present       a         b        a + b
Absent        c         d        c + d
Total         a + c     b + d    n

• $$RR = \frac{a/(a+b)}{c/(c+d)}$$
• $$s.e.(\ln RR) = \sqrt{\frac{1}{a} + \frac{1}{c} - \frac{1}{a+b} - \frac{1}{c+d}}$$
• CI for ln RR: $\ln RR \pm z_{1-\alpha/2}\cdot s.e.(\ln RR)$
• CI for RR: $$e^{\ln RR \,\pm\, z_{1-\alpha/2}\cdot s.e.(\ln RR)} = RR\cdot e^{\pm z_{1-\alpha/2}\cdot s.e.(\ln RR)}$$
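Example 11.6.1

Table 11.6.2
Smoking   Disease progress: Yes   Disease progress: No   Total
Yes       26                      380                    406
No        178                     3386                   3564
Total     204                     3766                   3970

$$RR = \frac{26/406}{178/3564} = \frac{0.064}{0.050} = 1.28$$
Lower limit: $$1.2822\cdot \exp\left(-1.96\sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}\right) = 0.861$$
Upper limit: $$1.2822\cdot \exp\left(+1.96\sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}\right) = 1.910$$
The 95% CI (0.861, 1.910) includes 1, so RR = 1 cannot be rejected at the 5% level.

data preg;
  input smoke $ preg $ count;
  cards;
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
;
proc freq order=data;
  weight count;
  tables smoke*preg / measures chisq;   /* measures prints the relative risk with its CI */
run;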
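The relative-risk interval of Example 11.6.1 is built on the log scale and then exponentiated. A minimal Python sketch of that calculation follows (standard library only; the function name `relative_risk_ci` is hypothetical, and z = 1.96 is the 95% normal quantile used on the slide).

```python
from math import exp, sqrt


def relative_risk_ci(a, b, c, d, z=1.96):
    """Relative risk and its CI for the table [[a, b], [c, d]] (rows = exposure)."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log = sqrt(1 / a + 1 / c - 1 / (a + b) - 1 / (c + d))
    return rr, rr * exp(-z * se_log), rr * exp(z * se_log)


# Smoking (rows) by disease progress (columns), from the example's table.
rr, lower, upper = relative_risk_ci(26, 380, 178, 3386)

print(f"RR = {rr:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print("CI includes 1:", lower < 1 < upper)
```

Working on the log scale keeps the interval positive and makes it symmetric around RR in the multiplicative sense.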
Odds ratio

              Sample
Risk factor   Case    Control   Total
Present       a       b         a + b
Absent        c       d         c + d
Total         a + c   b + d     n

• Odds = p/(1 − p)
• Odds in the case group: [a/(a + c)] / [c/(a + c)] = a/c
• Odds in the control group: [b/(b + d)] / [d/(b + d)] = b/d
• $$OR = \frac{a/c}{b/d} = \frac{ad}{bc}, \qquad s.e.(\ln OR) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$
• CI for ln OR: $\ln OR \pm z_{1-\alpha/2}\cdot s.e.(\ln OR)$
• CI for OR: $$e^{\ln OR \,\pm\, z_{1-\alpha/2}\cdot s.e.(\ln OR)} = OR\cdot e^{\pm z_{1-\alpha/2}\cdot s.e.(\ln OR)}$$
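Example 11.6.2

Table 11.6.4
Smoking   Obesity: Yes   Obesity: No   Total
Yes       52             352           404
No        78             3486          3564
Total     130            3838          3968

$$OR = \frac{52\cdot 3486}{352\cdot 78} = 6.60$$
Lower limit: $$6.6023\cdot \exp\left(-1.96\sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}\right) = 4.571$$
Upper limit: $$6.6023\cdot \exp\left(+1.96\sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}\right) = 9.536$$

data obe;
  input smoke $ case $ count;
  cards;
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
;
proc freq data=obe;
  weight count;
  tables smoke*case / measures;   /* measures prints the odds ratio with its CI */
run;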
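The odds-ratio interval of Example 11.6.2 follows the same log-scale recipe as the relative risk, with a different standard error. The sketch below is in Python (standard library only); `odds_ratio_ci` is a hypothetical helper name and z = 1.96 is the slide's 95% quantile.

```python
from math import exp, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc and its CI for the table [[a, b], [c, d]]."""
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, odds_ratio * exp(-z * se_log), odds_ratio * exp(z * se_log)


# Smoking (rows) by obesity (columns), from the example's table.
odds_ratio, lower, upper = odds_ratio_ci(52, 352, 78, 3486)

print(f"OR = {odds_ratio:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print("CI excludes 1:", lower > 1 or upper < 1)
```

Any zero cell makes the standard error undefined, so a table with an empty cell would need separate handling that this sketch does not attempt.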
Mantel-Haenszel statistic
The confounding variable defines k strata.

Stratum i
Risk factor   Case        Control     Total
Exposed       a_i         b_i         a_i + b_i
Not exposed   c_i         d_i         c_i + d_i
Total         a_i + c_i   b_i + d_i   n_i

Expected frequency in stratum i:
$$e_i = \frac{(a_i + b_i)(a_i + c_i)}{n_i}$$
$$v_i = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{n_i^2 (n_i - 1)}$$
$$\chi^2_{MH} = \frac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^2}{\sum_{i=1}^{k} v_i} \sim \chi^2_1$$
Common odds ratio:
$$OR_{MH} = \frac{\sum_{i=1}^{k} (a_i d_i / n_i)}{\sum_{i=1}^{k} (b_i c_i / n_i)}$$
Example 11.6.3

Age <= 55
Risk factor   OCAD patients   Control   Total
Exposed       21              11        32
Unexposed     16              6         22
Total         37              17        54

Age >= 56
Risk factor   OCAD patients   Control   Total
Exposed       50              14        64
Unexposed     18              6         24
Total         68              20        88

data one;
  input age risk case n;
  cards;
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
;
run;
proc freq;
  tables age*risk*case / measures CMH;   /* age is the stratification variable */
  weight n;
run;
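The Mantel-Haenszel formulas can be applied directly to the two age strata of Example 11.6.3. The sketch below is in Python (standard library only); `mantel_haenszel` is a hypothetical helper, and it implements the statistic exactly as written on the slide, without a continuity correction.

```python
def mantel_haenszel(strata):
    """Return (chi-square_MH, common odds ratio) for a list of (a, b, c, d) tables."""
    sum_a = sum_e = sum_v = numerator = denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        sum_e += (a + b) * (a + c) / n                               # expected a_i
        sum_v += (a + b) * (c + d) * (a + c) * (b + d) / (n**2 * (n - 1))
        numerator += a * d / n
        denominator += b * c / n
    chi_sq = (sum_a - sum_e) ** 2 / sum_v
    return chi_sq, numerator / denominator


# (exposed cases, exposed controls, unexposed cases, unexposed controls)
strata = [
    (21, 11, 16, 6),    # age <= 55
    (50, 14, 18, 6),    # age >= 56
]
chi_sq_mh, or_mh = mantel_haenszel(strata)

print(f"chi-square_MH = {chi_sq_mh:.4f} (1 df), common OR = {or_mh:.4f}")
```

Each stratum contributes its own expected value and variance, so the confounder is controlled by comparing exposure only within age groups.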
이항분포의 가정하에서 기대도수=기대상대도수총합
Expected freq under binomial distrsquon=probtotal
2525
( ) (1 ) 012 25
ˆ 500 2500 02
x xP X x p p xx
p
각 의사별 신약을 선호하는 환자의 수 의사의 수(119926119946) 기대 상대도수 기대도수(119916119946)
0 5 000378 037779
1 6 002361 23612
2 8 007083 70836
3 10 013577 13577
4 10 018668 18668
5 15 019602 19602
6 17 016335 16335
7 10 011084 11084
8 10 006235 62349
9 9 002944 29442
10 이상 0 001733 17332
합계 100 10000 10000
11 27390
1205682 =11minus27390 2
27390+8minus70836 2
70836+⋯+
0minus17332 2
17332= 47678 gt qchisq(00058lowertail=F)= 21955
We reject Ho Data ~ Binomial
df= 10 minus 2 = 8 constraints 119864119894 = 119874119894 119901 = 119901
예제 1123 포아슨분포 (Poisson distrsquon)
포아슨분포의 가정 하에서 상대도수의 기대치
Expected relative freq under Poisson distrsquon
(X ) 012
xeP x x
x
=3 known
H0 병원의 하루 응급환자의 수는 포아송 분포를 따른다
일일 응급환자 수 날짜 수
0 5
1 14
2 15
3 23
4 16
5 9
6 3
7 3
8 1
9 1
10 이상 0
합계 90
응급환자수 날짜 수(119926119946) 기대 상대도수 기대도수(119916119946)119926119946 minus 119916119946
120784
119916119946
0 5 004979 44808 006015
1 14 014936 13443 002312
2 15 022404 20164 132240
3 23 022404 20164 039895
4 16 016803 15123 005088
5 9 010082 90737 000060
6 3 005041 45368 052060
7 3 002160 19444 057313
8 1 000810 072914
0804829 1 000270 024305
10 이상 0 000110 009922
합계 90 1000 9000 3755
107142
1205682 = 119874119894minus119864119894
2
119864119894=5minus44808 2
44808+⋯+
2minus10714 2
10714= 3755 lt 1198832(095 119889119891 = 9 minus 1 = 8) = 15507
We cannot reject Ho Data ~ Poisson
2 22 2
9 1
(5 450) (2 108)3664 15557 (095)
450 108
113 독립성검정Tests of independence
bull 분할표(contingency table)
1205682 =
119894=1
119903
119895=1
119888119874119894119895 minus 119864119894119895
2
119864119894119895~1205942 119889119891 = 119903 minus 1 119888 minus 1 119864119894119895=
119899119894 ∙ 119899119895119899
두 번째 범주형 변수 첫 번째 범주형 변수
120783 120784 120785 ⋯ 119940 합계
120783 11989911 11989912 11989913 ⋯ 1198991119888 1198991
120784 11989921 11989922 11989923 ⋯ 1198992119888 1198992
120785 11989931 11989932 11989933 ⋯ 1198993119888 1198993
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
119955 1198991199031 1198991199032 1198991199033 ⋯ 119899119903119888 119899119903
합계 1198991 1198992 1198993 ⋯ 119899119888 119899
bull예제 1131
치료방법(treatment)
재발여부(relapse)합계(total)
Yes No
A 294 (77255) 921 (1137745) 1215
B 98 (188210) 2862 (2771790) 2960
C 50 (198002) 3064 (2915998) 3114
D 203 (181533) 2652 (2673467) 2855
합계 645 9499 10144
1205682 = 119874 minus 119864 2
119864
=294 minus 77255 2
77255+921 minus 1137745 2
1137745+⋯ = 81641
gt 1198832(095 119889119891 = 3) = 7815df= 119903 minus 1 119888 minus 1 = 4 minus 1 2 minus 1 = 3
Reject (Ho treatment and relapse are independent ) -gt They are not independent
gt datalt-astable(cbind(c(2949850203)c(921286230642652)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(YN))
gt data
re
trt Y N
A 294 921
B 98 2862
C 50 3064
D 203 2652
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 81641 df = 3 p-value lt 22e-16
data reinput trt $ re $ count cardsA Y 294 A N 921 B Y 98 B N 2862C Y 50 C N 3064D Y 203 D N 2652proc freq data=reweight counttables trtremeasures chisqrun
bull 작은 기대도수 (small expected freq)기대치 5미만의 cell수가 전체 20를 넘지 않으며 최소기대치가 1이상이면 무관하다 (If min gt1 and cells lt5 are less than 20 then not a problem)
bull 2Ⅹ2 분할표 (table)nlt20 or 20ltnlt49 그리고 기대도수 5이하 일경우에는 -test를 하지 말라
-test is not valid if nlt20 or (20ltnlt49) and expected freq of one or more cells lt 5
2
2
bull2Ⅹ2 table
1205682 =233(131∙36minus52∙14)2
145∙88∙183∙50= 317391 gtgt 1962
Strong evidence to reject (HoSmoking and drinking are independent)
두번째분류기준
첫번째 분류기준
120783 120784 합계
120783 119886 119887 119886 + 119887
120784 119888 119889 119888 + 119889
합계 119886 + 119888 119887 + 119889 119899
SmokingDrinking
Yes No total
Yes 131 52 183
No 14 36 50
Total 145 88 233
②두 집단의 확률에 대한 비교
(Comparing two probabilities)
1 1 2
0 1 2 1 2
1 1 2
1 2
ˆ100 60 120 040
60 100 40 12004909
100 120
060 040295469 196 significant
4903 5091 4903 5091
100 120
ˆ ( )
(1 ) (1 )
a
n p n
p
Z
H p p H p p
p p pZ
p p p p
n n
e g
2p
2p
Yates adjustment (보정)
bull 120568corrected2 =
119899( 119886119889minus119887119888 minus05119899)2
(119886+119888)(119887+119889)(119886+119887)(119888+119889)
bull 1205682 =233( 131∙36minus52∙14 minus05∙233)2
145∙88∙183∙50= 299118
114 동질성 검정 (homogeneity test)
bull 동질성 검정 각 각의 모집단에서 독립적으로 뽑은 표본들의 분포가 서로 동질의 것인가
bull Homogeneity test Are two samples selected from one population
bull 독립성 검정 한 모집단에서 표본 추출 행과 열의 합계는 조절이 아니고 우연히 나타난다
bull Independent test selected from a population Marginal totals are randomly determined
bull 독립성 검정 vs 동질성 검정
bull Independent test vs homogeneity test
bull예제 1141
bull 가설 Patient groups with on-set age lt=18 and age gt 18 have same distributions of family history
gt datalt-astable(cbind(c(28194153)c(35384460)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(EarlyLater))
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 36216 df = 3 p-value = 03053
-gt Do not reject Ho
0H
Family History lt=18 gt 18 Total
A 28 35 63
B 19 38 57
C 41 44 85
D 53 60 113
합계 141 177 318
gt datare
trt Early LaterA 28 35B 19 38C 41 44D 53 60
동질성 검정과 모비율 검정
1198670 1199011 = 1199012 119907119904 119867119860 ∶ 1199011 ne 1199012 1198991 = 100 1199011 = 060 1198992 = 120 1199012 = 040
119911 = 1199011minus 1199012 minus( 1199011minus 1199012)0 119901(1minus 119901)
1198991+ 119901(1minus 119901)
1198992
119901 =060∙100+040∙120
100+120=108
220= 049091
119911 =060 minus 040
049091 ∙ 050909100 +
049091 ∙ 050909120
= 295468
1205682 =220 ∙ [60 ∙ 72 minus 40 ∙ 48]2
108 ∙ 112 ∙ 100 ∙ 120= 87302
-gt Reject Ho
표본특성
1 2 합계1 60 40 100
2 48 72 120
합계 108 112 220
data severe
input treat $ outcome $ count
cards
Test f 10
Test u 2
Control f 2
Control u 4
proc freq order=data
tables treatoutcome chisq nocol
weight count
run
Fisherrsquos Exact Test
SAS 시스템
FREQ 프로시저
treat outcome 교차표
treat outcome
빈도|백분율|
행 백분율|f |u | 총합-----------+--------+--------+
Test | 10 | 2 | 12
| 5556 | 1111 | 6667
| 8333 | 1667 |
-----------+--------+--------+
Control | 2 | 4 | 6
| 1111 | 2222 | 3333
| 3333 | 6667 |
-----------+--------+--------+
총합 12 6 18
6667 3333 10000
treat outcome 테이블에 대한 통계량
통계량 자유도 값 확률값----------------------------------------------------------카이제곱 1 45000 00339우도비 카이제곱 1 44629 00346연속성 수정 카이제곱 1 25313 01116Mantel-Haenszel 카이제곱 1 42500 00393파이 계수 05000분할 계수 04472크래머의 V 05000
경고 셀들의 75가 5보다 작은 기대도수를 가지고 있습니다카이제곱 검정은 올바르지 않을 수 있습니다
Fisher의 정확 검정----------------------------(11) 셀 빈도(F) 10하단측 p값 Pr lt= F 09961상단측 p값 Pr gt= F 00573
테이블 확률 (P) 00533양측 p값 Pr lt= P 01070
표본 크기 = 18
Exact Test
Table Cell
(11) (12) (21) (22) Prob
12 0 0 6 0001
11 1 1 5 0039
10 2 2 4 0533
9 3 3 3 2370
8 4 4 2 4000
7 5 5 1 2560
6 6 6 0 0498
=12 12 6 6
10 2 2 4 18
Table Probabilities
bull One-tailed p-value
bull Two-tailed p-value
00533 00039 00001 00573p
00533 00039 00001 00498 01071p
H0 두 변수는 서로 독립(동질)이다 vs H1 not H0
gt fishertest(matrix(c(7356)22)alternative=greater)
Fishers Exact Test for Count Data
data matrix(c(7 3 5 6) 2 2)
p-value = 02449
alternative hypothesis true odds ratio is greater than 1
95 percent confidence interval
04512625 Inf
sample estimates
odds ratio
2661251
gt matrix(c(7356)22)
[1] [2]
[1] 7 5
[2] 3 6
McNemar Test Matched pairs
data one
input hus_resp $ wif_resp $ no
datalines
yes yes 20
yes no 5
no yes 10
no no 10
run
proc freq
tables hus_respwif_resp agree
weight no
run
ldquoHo husband and wife 의 approval rates는 같다rdquo를 기각하지 못함
We do not reject ldquoHo approval rates of husband and wife are the samerdquo
신뢰구간이 0을 포함하지 않으므로 K=0 이라는 귀무가설을 95 신뢰수준에서 기각한다
Kappa=1 gtgt perfect agreement Kappa gt 08 gtgt excellent agreement Kappa gt 04 gtgt moderate agreement
CI does not include 0 -gt we reject the null hypo of K=0 by 95 confidence level
116 Relative risk odds ratio and Mantel-Haenszel statistics
bull 관찰연구 (observational study)
bull 전향적 연구 (prospective study)
bull 후향적 연구 (retrospective study)
상대위험도 (Relative risk)Disease
Risk O X
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull 119877119877 =119886
119886+119887119888
119888+119889
bull s e ln 119877119877 =1
119886+1
119888minus
1
119886+119887+
1
119888+119889
bull ln 119877119877 plusmn 1199111minus1205722 ∙ s e ln 119877119877
bull 119890ln 119877119877 plusmn119911
1minus1205722∙se ln 119877119877
= 119877119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119877119877
예제 1161 Relative risk odds ratio and Mantel-Haenszel statistics
119877119877 =26406
1783564=0064
0050= 128
12822 ∙ 119890^ minus196 ∙1
26+
1
178minus
1
406+
1
3564= 0861
12822 ∙ 119890^ 196 ∙1
26+
1
178minus
1
406+
1
3564= 1910
95 CI includes 1
Smoking Disease progress
Yes NoYes 26 380 406
No 178 3386 3564
204 3766 3970
표1162
data preg
input smoke $ preg $ count
cards
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
proc freq order=data
weight count
tables smokepregmeasures chisq
run
Odds ratio

                  Sample
Risk factor    Case     Control   Total
Yes            a        b         a + b
No             c        d         c + d
Total          a + c    b + d     n

• Odds = $p/(1-p)$
• Odds of exposure in the case group: $\dfrac{a/(a+c)}{c/(a+c)} = \dfrac{a}{c}$
• Odds of exposure in the control group: $\dfrac{b/(b+d)}{d/(b+d)} = \dfrac{b}{d}$
• $OR = \dfrac{a/c}{b/d} = \dfrac{ad}{bc}$
• $\mathrm{s.e.}(\ln OR) = \sqrt{\dfrac{1}{a} + \dfrac{1}{b} + \dfrac{1}{c} + \dfrac{1}{d}}$
• CI for $\ln OR$: $\ln OR \pm z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln OR)$
• CI for $OR$: $e^{\ln OR \pm z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln OR)} = OR \cdot e^{\pm z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln OR)}$
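Example 11.6.2

Table 11.6.4
               Obesity
Smoking     Yes      No       Total
Yes         52       352      404
No          78       3486     3564
Total       130      3838     3968

$OR = \dfrac{52 \cdot 3486}{352 \cdot 78} = 6.60$
Lower limit: $6.6023 \cdot \exp\left(-1.96 \sqrt{\dfrac{1}{52} + \dfrac{1}{352} + \dfrac{1}{78} + \dfrac{1}{3486}}\right) = 4.571$
Upper limit: $6.6023 \cdot \exp\left(+1.96 \sqrt{\dfrac{1}{52} + \dfrac{1}{352} + \dfrac{1}{78} + \dfrac{1}{3486}}\right) = 9.536$

data obe;
  input smoke $ case $ count;
  cards;
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
;
proc freq data=obe;
  weight count;
  tables smoke*case / measures;
run;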
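As a cross-check of Example 11.6.2, the odds ratio and its interval can be computed directly from the four cell counts. This is a minimal Python sketch, not taken from the slides; `odds_ratio_ci` is a hypothetical helper name.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc and large-sample CI from a 2x2 table.

    a, b: exposed cases / exposed controls
    c, d: unexposed cases / unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = odds_ratio * math.exp(-z * se)
    upper = odds_ratio * math.exp(z * se)
    return odds_ratio, lower, upper

# Smoking by obesity counts from Table 11.6.4
odds_ratio, lower, upper = odds_ratio_ci(52, 352, 78, 3486)
print(f"OR = {odds_ratio:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the whole interval lies above 1, the association between smoking and obesity in this table is significant at the 5% level.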
Mantel-Haenszel statistics
The confounding variable defines k strata. The table for stratum i is:

                  Sample
Risk factor    Case          Control       Total
Exposed        a_i           b_i           a_i + b_i
Not exposed    c_i           d_i           c_i + d_i
Total          a_i + c_i     b_i + d_i     n_i

Expected frequency in stratum i: $e_i = \dfrac{(a_i+b_i)(a_i+c_i)}{n_i}$
$v_i = \dfrac{(a_i+b_i)(c_i+d_i)(a_i+c_i)(b_i+d_i)}{n_i^2 (n_i - 1)}$
$\chi^2_{MH} = \dfrac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^2}{\sum_{i=1}^{k} v_i} \sim \chi^2_1$
Common odds ratio: $OR_{MH} = \dfrac{\sum_{i=1}^{k} (a_i d_i / n_i)}{\sum_{i=1}^{k} (b_i c_i / n_i)}$
Example 11.6.3

Age ≤ 55
Risk         OCAD patients   Controls   Total
Exposed      21              11         32
Unexposed    16              6          22
Total        37              17         54

Age ≥ 56
Risk         OCAD patients   Controls   Total
Exposed      50              14         64
Unexposed    18              6          24
Total        68              20         88

data one;
  input age risk case n;
  cards;
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
;
run;
proc freq;
  tables age*risk*case / measures cmh;
  weight n;
run;
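The Mantel-Haenszel formulas above can be applied to the two age strata of Example 11.6.3 by hand. The sketch below is an illustrative Python version, not the slides' own code; the function name `mantel_haenszel` and the tuple layout `(a, b, c, d)` per stratum are assumptions, and no continuity correction is applied.

```python
def mantel_haenszel(strata):
    """Mantel-Haenszel chi-square and common odds ratio.

    strata: list of (a, b, c, d) tuples, one 2x2 table per stratum, with
            a = exposed cases, b = exposed controls,
            c = unexposed cases, d = unexposed controls.
    """
    sum_a = sum_e = sum_v = or_num = or_den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        sum_e += (a + b) * (a + c) / n                      # e_i
        sum_v += ((a + b) * (c + d) * (a + c) * (b + d)
                  / (n ** 2 * (n - 1)))                     # v_i
        or_num += a * d / n
        or_den += b * c / n
    chi2_mh = (sum_a - sum_e) ** 2 / sum_v
    or_mh = or_num / or_den
    return chi2_mh, or_mh

# Stratum 1: age <= 55; stratum 2: age >= 56
strata = [(21, 11, 16, 6), (50, 14, 18, 6)]
chi2_mh, or_mh = mantel_haenszel(strata)
print(f"chi2_MH = {chi2_mh:.4f}, OR_MH = {or_mh:.4f}")
```

The statistic is compared with the chi-square distribution with 1 degree of freedom (0.95 quantile 3.841); a value below that gives no evidence of association after adjusting for age.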
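Example 11.2.3 Poisson distribution
Expected relative frequencies under the Poisson distribution:
$P(X = x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots$, with $\lambda = 3$ known
H0: the number of emergency patients per day at the hospital follows a Poisson distribution

Emergency patients per day   Number of days
0                            5
1                            14
2                            15
3                            23
4                            16
5                            9
6                            3
7                            3
8                            1
9                            1
10 or more                   0
Total                        90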
Patients     Days (O_i)   Expected rel. freq.   Expected freq. (E_i)   (O_i − E_i)²/E_i
0            5            0.04979               4.4808                 0.06015
1            14           0.14936               13.443                 0.02312
2            15           0.22404               20.164                 1.32240
3            23           0.22404               20.164                 0.39895
4            16           0.16803               15.123                 0.05088
5            9            0.10082               9.0737                 0.00060
6            3            0.05041               4.5368                 0.52060
7            3            0.02160               1.9444                 0.57313
8            1            0.00810               0.72914
9            1            0.00270               0.24305                0.80482 (classes 8, 9, 10+ pooled: O = 2, E = 1.0714)
10 or more   0            0.00110               0.09922
Total        90           1.000                 90.00                  3.755

$\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i} = \dfrac{(5 - 4.4808)^2}{4.4808} + \cdots + \dfrac{(2 - 1.0714)^2}{1.0714} = 3.755 < \chi^2_{0.95}(df = 9 - 1 = 8) = 15.507$
We cannot reject H0: the data follow a Poisson distribution.
With rounded expected frequencies: $\chi^2 = \dfrac{(5 - 4.50)^2}{4.50} + \cdots + \dfrac{(2 - 1.08)^2}{1.08} = 3.664 < \chi^2_{9-1}(0.95) = 15.507$
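The goodness-of-fit computation for the Poisson example can be reproduced step by step. The following Python sketch is illustrative only; the variable names are assumptions, the last three classes are pooled as in the table, and the critical value 15.507 is copied from the text rather than computed.

```python
import math

# Days observed with 0, 1, ..., 9 and "10 or more" emergency patients
observed = [5, 14, 15, 23, 16, 9, 3, 3, 1, 1, 0]
lam = 3.0                      # Poisson mean, treated as known
n_days = sum(observed)

# Poisson probabilities for 0..9; the remainder goes to "10 or more"
probs = [math.exp(-lam) * lam ** x / math.factorial(x) for x in range(10)]
probs.append(1.0 - sum(probs))
expected = [n_days * p for p in probs]

# Pool classes 8, 9 and 10+ into one class because their expected counts are small
obs_pooled = observed[:8] + [sum(observed[8:])]
exp_pooled = expected[:8] + [sum(expected[8:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 1       # no parameter estimated, only the total is fixed
critical = 15.507              # 0.95 quantile of chi-square with 8 df (from the text)
reject = chi2 > critical
print(f"chi2 = {chi2:.3f}, df = {df}, reject H0: {reject}")
```

Had the mean been estimated from the data instead of being known, one more degree of freedom would be lost, in line with the rule r = constraints + estimated parameters used earlier in the chapter.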
11.3 Tests of independence
• Contingency table

$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}} \sim \chi^2\left(df = (r-1)(c-1)\right), \qquad E_{ij} = \dfrac{n_{i\cdot}\, n_{\cdot j}}{n}$

Rows: first categorical variable; columns: second categorical variable

          1        2        3        ⋯    c        Total
1         n_11     n_12     n_13     ⋯    n_1c     n_1·
2         n_21     n_22     n_23     ⋯    n_2c     n_2·
3         n_31     n_32     n_33     ⋯    n_3c     n_3·
⋮         ⋮        ⋮        ⋮        ⋮    ⋮        ⋮
r         n_r1     n_r2     n_r3     ⋯    n_rc     n_r·
Total     n_·1     n_·2     n_·3     ⋯    n_·c     n
• Example 11.3.1
Observed frequencies, with expected frequencies in parentheses:

              Relapse
Treatment   Yes              No                  Total
A           294 (77.255)     921 (1137.745)      1215
B           98 (188.210)     2862 (2771.790)     2960
C           50 (198.002)     3064 (2915.998)     3114
D           203 (181.533)    2652 (2673.467)     2855
Total       645              9499                10144

$\chi^2 = \sum \dfrac{(O - E)^2}{E} = \dfrac{(294 - 77.255)^2}{77.255} + \dfrac{(921 - 1137.745)^2}{1137.745} + \cdots = 816.41 > \chi^2_{0.95}(df = 3) = 7.815$
$df = (r-1)(c-1) = (4-1)(2-1) = 3$
Reject H0 (treatment and relapse are independent) -> they are not independent.

> data <- as.table(cbind(c(294, 98, 50, 203), c(921, 2862, 3064, 2652)))
> dimnames(data) <- list(trt = c("A", "B", "C", "D"), re = c("Y", "N"))
> data
   re
trt   Y    N
  A 294  921
  B  98 2862
  C  50 3064
  D 203 2652
> chisq.test(data)

        Pearson's Chi-squared test

data:  data
X-squared = 816.41, df = 3, p-value < 2.2e-16

data re;
  input trt $ re $ count;
  cards;
A Y 294
A N 921
B Y 98
B N 2862
C Y 50
C N 3064
D Y 203
D N 2652
;
proc freq data=re;
  weight count;
  tables trt*re / measures chisq;
run;
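To make the expected-frequency step explicit, the statistic of Example 11.3.1 can also be computed without a statistics package. This is a minimal Python sketch under the formula $E_{ij} = n_{i\cdot} n_{\cdot j} / n$; the function name `chi_square_independence` is hypothetical.

```python
def chi_square_independence(table):
    """Pearson chi-square test statistic for an r x c contingency table.

    table: list of rows of observed counts.
    Returns (chi2, df, expected), where expected has the same shape as table.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    expected = []
    for i, row in enumerate(table):
        exp_row = []
        for j, obs in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # E_ij = n_i. * n_.j / n
            exp_row.append(e)
            chi2 += (obs - e) ** 2 / e
        expected.append(exp_row)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Rows: treatments A-D; columns: relapse yes / no
relapse = [[294, 921], [98, 2862], [50, 3064], [203, 2652]]
chi2, df, expected = chi_square_independence(relapse)
print(f"chi2 = {chi2:.2f}, df = {df}")
```

The returned `expected` matrix can be compared cell by cell with the values in parentheses in the table, which is a quick way to locate the cells that drive a large statistic (here treatment A).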
• Small expected frequencies: the test remains acceptable if no more than 20% of the cells have an expected frequency below 5 and the smallest expected frequency is at least 1.
• 2×2 tables: the $\chi^2$ test is not valid if n < 20, or if 20 < n < 49 and one or more cells have an expected frequency below 5.
• 2×2 table

Rows: first classification criterion; columns: second classification criterion

          1        2        Total
1         a        b        a + b
2         c        d        c + d
Total     a + c    b + d    n

For a 2×2 table the statistic reduces to $\chi^2 = \dfrac{n(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$.

               Drinking
Smoking     Yes      No       Total
Yes         131      52       183
No          14       36       50
Total       145      88       233

$\chi^2 = \dfrac{233\,(131 \cdot 36 - 52 \cdot 14)^2}{145 \cdot 88 \cdot 183 \cdot 50} = 31.7391 \gg 1.96^2$
Strong evidence to reject H0: smoking and drinking are independent.
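② Comparing two probabilities
$H_0: p_1 = p_2$ vs $H_a: p_1 \ne p_2$
$Z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n_1} + \dfrac{\hat{p}(1-\hat{p})}{n_2}}}, \qquad \hat{p} = \dfrac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$
e.g. $n_1 = 100$, $\hat{p}_1 = 0.60$, $n_2 = 120$, $\hat{p}_2 = 0.40$:
$\hat{p} = \dfrac{0.60 \cdot 100 + 0.40 \cdot 120}{100 + 120} = 0.4909$
$Z = \dfrac{0.60 - 0.40}{\sqrt{\dfrac{0.4909 \cdot 0.5091}{100} + \dfrac{0.4909 \cdot 0.5091}{120}}} = 2.95468 > 1.96$: significant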
Yates adjustment (continuity correction)
• $\chi^2_{\text{corrected}} = \dfrac{n\left(|ad - bc| - 0.5n\right)^2}{(a+c)(b+d)(a+b)(c+d)}$
• $\chi^2_{\text{corrected}} = \dfrac{233\left(|131 \cdot 36 - 52 \cdot 14| - 0.5 \cdot 233\right)^2}{145 \cdot 88 \cdot 183 \cdot 50} = 29.9118$
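The 2×2 shortcut formula and its Yates-corrected version differ only in the numerator, which a small function makes easy to see. The sketch below is illustrative Python, not from the slides; `chi_square_2x2` and its `yates` flag are assumed names, and the formula is implemented exactly as written above.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic for a 2x2 table via the shortcut formula.

    With yates=True the numerator uses |ad - bc| - 0.5 * n (continuity correction).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff -= 0.5 * n
    return n * diff ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

# Smoking by drinking table: a=131, b=52, c=14, d=36
chi2_plain = chi_square_2x2(131, 52, 14, 36)
chi2_yates = chi_square_2x2(131, 52, 14, 36, yates=True)
print(f"uncorrected = {chi2_plain:.4f}, Yates-corrected = {chi2_yates:.4f}")
```

The correction always shrinks the statistic, so it makes the test more conservative; with n = 233 the conclusion is unchanged.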
11.4 Homogeneity test
• Homogeneity test: samples are drawn independently from separate populations, and we ask whether their distributions are the same, that is, whether the samples could have come from one population.
• Independence test: one sample is drawn from a single population; the row and column (marginal) totals are not fixed in advance but arise by chance.
• Independence test vs homogeneity test
• Example 11.4.1
• H0: patient groups with onset age ≤ 18 and onset age > 18 have the same distribution of family history

Family history   ≤ 18     > 18     Total
A                28       35       63
B                19       38       57
C                41       44       85
D                53       60       113
Total            141      177      318

> data <- as.table(cbind(c(28, 19, 41, 53), c(35, 38, 44, 60)))
> dimnames(data) <- list(trt = c("A", "B", "C", "D"), re = c("Early", "Later"))
> data
   re
trt Early Later
  A    28    35
  B    19    38
  C    41    44
  D    53    60
> chisq.test(data)

        Pearson's Chi-squared test

data:  data
X-squared = 3.6216, df = 3, p-value = 0.3053
-> Do not reject H0
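The p-value reported for Example 11.4.1 can be reproduced with the standard library alone, because the chi-square upper tail has a closed form when df = 3. The following Python sketch is an illustration under that assumption; `chi_square_stat` and `chi2_upper_tail_df3` are hypothetical helper names, and the tail function must not be used for other degrees of freedom.

```python
import math

def chi_square_stat(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            chi2 += (obs - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

def chi2_upper_tail_df3(x):
    """P(X > x) for a chi-square variable with exactly 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))

# Rows: family history A-D; columns: onset age <= 18 / > 18
family_history = [[28, 35], [19, 38], [41, 44], [53, 60]]
chi2, df = chi_square_stat(family_history)
p_value = chi2_upper_tail_df3(chi2)   # valid here only because df == 3
print(f"chi2 = {chi2:.4f}, df = {df}, p = {p_value:.4f}")
```

The arithmetic is identical to the independence test; only the sampling design, and therefore the wording of H0, differs.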
Homogeneity test and the test of two population proportions
$H_0: p_1 = p_2$ vs $H_A: p_1 \ne p_2$; $n_1 = 100$, $\hat{p}_1 = 0.60$, $n_2 = 120$, $\hat{p}_2 = 0.40$
$z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n_1} + \dfrac{\bar{p}(1-\bar{p})}{n_2}}}$
$\bar{p} = \dfrac{0.60 \cdot 100 + 0.40 \cdot 120}{100 + 120} = \dfrac{108}{220} = 0.49091$
$z = \dfrac{0.60 - 0.40}{\sqrt{\dfrac{0.49091 \cdot 0.50909}{100} + \dfrac{0.49091 \cdot 0.50909}{120}}} = 2.95468$

             Characteristic
Sample    1        2        Total
1         60       40       100
2         48       72       120
Total     108      112      220

$\chi^2 = \dfrac{220\,[60 \cdot 72 - 40 \cdot 48]^2}{108 \cdot 112 \cdot 100 \cdot 120} = 8.7302 \; (= z^2)$
-> Reject H0
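The link between the two-proportion z test and the 2×2 chi-square test can be verified numerically: the squared z statistic equals the uncorrected chi-square statistic. The snippet below is a small Python check written for this purpose; all variable names are illustrative.

```python
import math

# Sample 1: 60 of 100 have characteristic 1; sample 2: 48 of 120
n1, x1 = 100, 60
n2, x2 = 120, 48

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)

# Two-proportion z statistic with the pooled variance estimate
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se

# Uncorrected chi-square statistic for the same data as a 2x2 table
a, b, c, d = x1, n1 - x1, x2, n2 - x2
n = n1 + n2
chi2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

print(f"z = {z:.5f}, z^2 = {z * z:.4f}, chi2 = {chi2:.4f}")
```

Since 1.96² = 3.8416 is the 0.95 quantile of the chi-square distribution with 1 degree of freedom, the two tests always reach the same two-sided decision.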
Fisher's Exact Test

data severe;
  input treat $ outcome $ count;
  cards;
Test f 10
Test u 2
Control f 2
Control u 4
;
proc freq order=data;
  tables treat*outcome / chisq nocol;
  weight count;
run;

The SAS System
The FREQ Procedure

Table of treat by outcome

treat       outcome
Frequency |
Percent   |
Row Pct   | f      | u      | Total
----------+--------+--------+
Test      |     10 |      2 |     12
          |  55.56 |  11.11 |  66.67
          |  83.33 |  16.67 |
----------+--------+--------+
Control   |      2 |      4 |      6
          |  11.11 |  22.22 |  33.33
          |  33.33 |  66.67 |
----------+--------+--------+
Total           12        6       18
             66.67    33.33   100.00

Statistics for Table of treat by outcome

Statistic                      DF    Value     Prob
---------------------------------------------------
Chi-Square                      1    4.5000    0.0339
Likelihood Ratio Chi-Square     1    4.4629    0.0346
Continuity Adj. Chi-Square      1    2.5313    0.1116
Mantel-Haenszel Chi-Square      1    4.2500    0.0393
Phi Coefficient                      0.5000
Contingency Coefficient              0.4472
Cramer's V                           0.5000

WARNING: 75% of the cells have expected counts less than 5. Chi-Square may not be a valid test.

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)        10
Left-sided Pr <= F          0.9961
Right-sided Pr >= F         0.0573
Table Probability (P)       0.0533
Two-sided Pr <= P           0.1070

Sample Size = 18
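Exact Test
Table probabilities (all tables with the observed margins):

Cell (1,1)   (1,2)   (2,1)   (2,2)   Prob
12           0       0       6       0.0001
11           1       1       5       0.0039
10           2       2       4       0.0533
9            3       3       3       0.2370
8            4       4       2       0.4000
7            5       5       1       0.2560
6            6       6       0       0.0498

Probability of the observed table: $P = \dfrac{12!\,12!\,6!\,6!}{10!\,2!\,2!\,4!\,18!} = 0.0533$
• One-tailed p-value: $p = 0.0533 + 0.0039 + 0.0001 = 0.0573$
• Two-tailed p-value: $p = 0.0533 + 0.0039 + 0.0001 + 0.0498 = 0.1071$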
H0: the two variables are independent (homogeneous) vs H1: not H0

> matrix(c(7, 3, 5, 6), 2, 2)
     [,1] [,2]
[1,]    7    5
[2,]    3    6
> fisher.test(matrix(c(7, 3, 5, 6), 2, 2), alternative = "greater")

        Fisher's Exact Test for Count Data

data:  matrix(c(7, 3, 5, 6), 2, 2)
p-value = 0.2449
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 0.4512625       Inf
sample estimates:
odds ratio
  2.661251
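The table probabilities listed for the exact test come from the hypergeometric distribution with all margins fixed. The following Python sketch enumerates them for the Test/Control example (margins 12, 6 and 12, 6, observed cell (1,1) = 10); it is an illustration only, and the function name `table_prob` and the small tolerance used for the two-sided sum are assumptions.

```python
from math import comb

def table_prob(a, row1, row2, col1):
    """Hypergeometric probability of a 2x2 table with cell (1,1) = a
    and fixed margins row1, row2 (row totals) and col1 (first column total)."""
    n = row1 + row2
    return comb(row1, a) * comb(row2, col1 - a) / comb(n, col1)

row1, row2, col1 = 12, 6, 12      # Test row, Control row, outcome "f" column
observed_a = 10

# Range of cell (1,1) values compatible with the margins
a_min = max(0, col1 - row2)
a_max = min(row1, col1)
probs = {a: table_prob(a, row1, row2, col1) for a in range(a_min, a_max + 1)}

p_obs = probs[observed_a]
# One-tailed: tables at least as extreme in the observed direction
p_right = sum(p for a, p in probs.items() if a >= observed_a)
# Two-tailed: all tables no more probable than the observed one
p_two = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-7))

print(f"P(observed) = {p_obs:.4f}, one-tailed = {p_right:.4f}, two-tailed = {p_two:.4f}")
```

Summing every table whose probability does not exceed that of the observed table is the two-sided rule shown in the SAS output ("Pr <= P"); other two-sided definitions exist and can give slightly different values.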
McNemar Test: Matched pairs

data one;
  input hus_resp $ wif_resp $ no;
  datalines;
yes yes 20
yes no 5
no yes 10
no no 10
;
run;
proc freq;
  tables hus_resp*wif_resp / agree;
  weight no;
run;

We do not reject H0: the approval rates of husband and wife are the same.
Kappa = 1: perfect agreement; Kappa > 0.8: excellent agreement; Kappa > 0.4: moderate agreement.
The confidence interval for kappa does not include 0, so we reject the null hypothesis Kappa = 0 at the 95% confidence level.
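The two quantities behind these conclusions can be computed by hand from the husband/wife table. The slides do not spell out the formulas, so the Python sketch below assumes the standard uncorrected McNemar statistic $(b - c)^2 / (b + c)$ on the discordant pairs and Cohen's simple kappa; variable names are illustrative.

```python
import math

# Rows: husband yes / no; columns: wife yes / no
table = [[20, 5],
         [10, 10]]

# McNemar's test uses only the discordant pairs
b = table[0][1]   # husband yes, wife no
c = table[1][0]   # husband no, wife yes
mcnemar = (b - c) ** 2 / (b + c)
# Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(mcnemar / 2))

# Cohen's kappa: observed agreement versus agreement expected by chance
n = sum(sum(row) for row in table)
p_observed = (table[0][0] + table[1][1]) / n
row_props = [sum(row) / n for row in table]
col_props = [sum(col) / n for col in zip(*table)]
p_chance = sum(r * k for r, k in zip(row_props, col_props))
kappa = (p_observed - p_chance) / (1 - p_chance)

print(f"McNemar chi2 = {mcnemar:.4f}, p = {p_value:.4f}, kappa = {kappa:.4f}")
```

A p-value above 0.05 corresponds to not rejecting equal approval rates, while the kappa estimate falls below the 0.4 threshold for moderate agreement quoted above.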
• Example 11.3.1
Observed counts, with expected counts in parentheses:

Treatment    Relapse: Yes      Relapse: No         Total
A            294 (77.255)       921 (1137.745)      1215
B             98 (188.210)     2862 (2771.790)      2960
C             50 (198.002)     3064 (2915.998)      3114
D            203 (181.533)     2652 (2673.467)      2855
Total        645               9499                10144

$\chi^2 = \sum \frac{(O-E)^2}{E} = \frac{(294-77.255)^2}{77.255} + \frac{(921-1137.745)^2}{1137.745} + \cdots = 816.41 > \chi^2_{0.95,\,df=3} = 7.815$
$df = (r-1)(c-1) = (4-1)(2-1) = 3$
Reject $H_0$ (treatment and relapse are independent) -> treatment and relapse are not independent.
> data <- as.table(cbind(c(294, 98, 50, 203), c(921, 2862, 3064, 2652)))
> dimnames(data) <- list(trt = c("A", "B", "C", "D"), re = c("Y", "N"))
> data
   re
trt   Y    N
  A 294  921
  B  98 2862
  C  50 3064
  D 203 2652
> chisq.test(data)

        Pearson's Chi-squared test

data:  data
X-squared = 816.41, df = 3, p-value < 2.2e-16
data re;
  input trt $ re $ count;
  cards;
A Y 294
A N 921
B Y 98
B N 2862
C Y 50
C N 3064
D Y 203
D N 2652
;
run;
proc freq data=re;
  weight count;
  tables trt*re / measures chisq;
run;
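The arithmetic behind Example 11.3.1 can be checked by hand. The following is a minimal Python sketch, not part of the original slides (which use R and SAS); the function name `pearson_chi_square` and the variable names are illustrative assumptions. It computes the expected counts from the margins and sums $(O-E)^2/E$ over all cells.

```python
def pearson_chi_square(table):
    """Return (statistic, df, expected counts) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Under independence, E = (row total) * (column total) / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


# Rows: treatments A-D; columns: relapse yes, relapse no.
relapse = [[294, 921], [98, 2862], [50, 3064], [203, 2652]]
stat, df, expected = pearson_chi_square(relapse)
print(round(stat, 2), df, round(expected[0][0], 3))
```

The statistic is then compared with the critical value 7.815 quoted in the example.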
• Small expected frequencies: the $\chi^2$ test is still acceptable if fewer than 20% of the cells have an expected count below 5 and the smallest expected count is at least 1.
• 2×2 table: do not use the $\chi^2$ test if $n < 20$, or if $20 < n < 49$ and one or more cells have an expected frequency below 5.
• 2×2 table

                          Second classification
First classification      1         2         Total
1                         a         b         a + b
2                         c         d         c + d
Total                     a + c     b + d     n

$\chi^2 = \frac{n(ad-bc)^2}{(a+c)(b+d)(a+b)(c+d)}$

Smoking    Drinking: Yes    Drinking: No    Total
Yes        131               52             183
No          14               36              50
Total      145               88             233

$\chi^2 = \frac{233\,(131\cdot 36 - 52\cdot 14)^2}{145\cdot 88\cdot 183\cdot 50} = 31.7391 \gg 1.96^2$
Strong evidence to reject $H_0$ (smoking and drinking are independent).
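The 2×2 shortcut formula is easy to verify numerically. This is a small Python sketch under stated assumptions: the helper names `chi_square_2x2` and `p_value_df1` are hypothetical, and the p-value uses the identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$, which holds for 1 degree of freedom only.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Shortcut Pearson statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))


def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Smoking (rows) by drinking (columns).
stat = chi_square_2x2(131, 52, 14, 36)
print(round(stat, 4), p_value_df1(stat) < 0.001)
```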
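② Comparing two probabilities
$H_0: p_1 = p_2$ vs. $H_a: p_1 \ne p_2$
$\hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$
$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n_1} + \frac{\hat{p}(1-\hat{p})}{n_2}}}$
e.g. $n_1 = 100$, $\hat{p}_1 = 0.60$, $n_2 = 120$, $\hat{p}_2 = 0.40$:
$\hat{p} = \frac{0.60\cdot 100 + 0.40\cdot 120}{100 + 120} = 0.4909$
$Z = \frac{0.60 - 0.40}{\sqrt{\frac{0.4909\cdot 0.5091}{100} + \frac{0.4909\cdot 0.5091}{120}}} = 2.95469 > 1.96$: significant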
Yates' continuity correction
• $\chi^2_{\text{corrected}} = \frac{n\left(|ad-bc| - 0.5n\right)^2}{(a+c)(b+d)(a+b)(c+d)}$
• $\chi^2_{\text{corrected}} = \frac{233\left(|131\cdot 36 - 52\cdot 14| - 0.5\cdot 233\right)^2}{145\cdot 88\cdot 183\cdot 50} = 29.9118$
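A short Python sketch of the continuity-corrected statistic, assuming the same cell layout $a, b, c, d$ as the 2×2 table; the function name `yates_chi_square` is an illustrative choice, not something from the slides.

```python
def yates_chi_square(a, b, c, d):
    """Continuity-corrected chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shrink |ad - bc| by n/2 before squaring.
    numerator = n * (abs(a * d - b * c) - 0.5 * n) ** 2
    return numerator / ((a + c) * (b + d) * (a + b) * (c + d))


corrected = yates_chi_square(131, 52, 14, 36)
print(round(corrected, 4))
```

The correction always pulls the statistic toward zero, which matters most for small tables; for the smoking and drinking data the conclusion is unchanged.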
11.4 Test of homogeneity
• Homogeneity test: samples are drawn independently from separate populations, and we ask whether their distributions are the same (that is, whether the samples could have come from one population).
• Independence test: a single sample is drawn from one population; the row and column totals are not fixed in advance but arise by chance.
• Independence test vs. homogeneity test
• Example 11.4.1
• Hypothesis $H_0$: patient groups with onset age ≤ 18 and onset age > 18 have the same distribution of family history.
Family history    Onset ≤ 18    Onset > 18    Total
A                  28            35             63
B                  19            38             57
C                  41            44             85
D                  53            60            113
Total             141           177            318

> data <- as.table(cbind(c(28, 19, 41, 53), c(35, 38, 44, 60)))
> dimnames(data) <- list(trt = c("A", "B", "C", "D"), re = c("Early", "Later"))
> data
   re
trt Early Later
  A    28    35
  B    19    38
  C    41    44
  D    53    60
> chisq.test(data)

        Pearson's Chi-squared test

data:  data
X-squared = 3.6216, df = 3, p-value = 0.3053

-> Do not reject $H_0$.
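To see where the reported p-value comes from, here is a hedged Python sketch for Example 11.4.1. The variable names are illustrative, and the p-value line relies on a closed form that is valid for 3 degrees of freedom only: $P(\chi^2_3 > x) = \mathrm{erfc}(\sqrt{x/2}) + \sqrt{2x/\pi}\,e^{-x/2}$. A general implementation would need the incomplete gamma function instead.

```python
import math

# Rows: family history A-D; columns: onset age <= 18, onset age > 18.
family_history = [[28, 35], [19, 38], [41, 44], [53, 60]]

row_totals = [sum(row) for row in family_history]
col_totals = [sum(col) for col in zip(*family_history)]
n = sum(row_totals)

# Under H0 every row shares the pooled column proportions.
chi2 = 0.0
for row, r_tot in zip(family_history, row_totals):
    for observed, c_tot in zip(row, col_totals):
        expected = r_tot * c_tot / n
        chi2 += (observed - expected) ** 2 / expected

# Closed-form upper tail, valid for df = 3 only.
p_value = (math.erfc(math.sqrt(chi2 / 2))
           + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))
print(round(chi2, 4), round(p_value, 4))
```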
Homogeneity test and the test of two proportions
$H_0: p_1 = p_2$ vs. $H_A: p_1 \ne p_2$, with $n_1 = 100$, $\hat{p}_1 = 0.60$, $n_2 = 120$, $\hat{p}_2 = 0.40$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n_1} + \frac{\hat{p}(1-\hat{p})}{n_2}}}$
$\hat{p} = \frac{0.60\cdot 100 + 0.40\cdot 120}{100 + 120} = \frac{108}{220} = 0.49091$
$z = \frac{0.60 - 0.40}{\sqrt{\frac{0.49091\cdot 0.50909}{100} + \frac{0.49091\cdot 0.50909}{120}}} = 2.95468$

Sample    Characteristic: 1    Characteristic: 2    Total
1          60                   40                   100
2          48                   72                   120
Total     108                  112                   220

$\chi^2 = \frac{220\,(60\cdot 72 - 40\cdot 48)^2}{108\cdot 112\cdot 100\cdot 120} = 8.7302$
-> Reject $H_0$.
Note that $z^2 = 2.95468^2 = 8.7302 = \chi^2$, so the two tests agree.
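The agreement between the two-proportion z test and the 2×2 chi-square test can be confirmed numerically. A minimal Python sketch, with illustrative variable names that do not come from the slides:

```python
import math

n1, p1_hat = 100, 0.60
n2, p2_hat = 120, 0.40

# Pooled proportion under H0: p1 = p2.
pooled = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
z = (p1_hat - p2_hat) / math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

# The same data written as a 2x2 table of counts.
a, b, c, d = 60, 40, 48, 72
n = a + b + c + d
chi2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

print(round(z, 4), round(z * z, 4), round(chi2, 4))
```

Squaring the z statistic reproduces the chi-square statistic, which is why the two tests give the same two-sided decision.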
Fisher's Exact Test

data severe;
  input treat $ outcome $ count;
  cards;
Test f 10
Test u 2
Control f 2
Control u 4
;
run;
proc freq order=data;
  tables treat*outcome / chisq nocol;
  weight count;
run;

The SAS System
The FREQ Procedure

Table of treat by outcome

treat       outcome
Frequency |
Percent   |
Row Pct   | f      | u      | Total
----------+--------+--------+
Test      |     10 |      2 |     12
          |  55.56 |  11.11 |  66.67
          |  83.33 |  16.67 |
----------+--------+--------+
Control   |      2 |      4 |      6
          |  11.11 |  22.22 |  33.33
          |  33.33 |  66.67 |
----------+--------+--------+
Total           12        6       18
             66.67    33.33   100.00

Statistics for Table of treat by outcome

Statistic                       DF     Value     Prob
------------------------------------------------------
Chi-Square                       1    4.5000   0.0339
Likelihood Ratio Chi-Square      1    4.4629   0.0346
Continuity Adj. Chi-Square       1    2.5313   0.1116
Mantel-Haenszel Chi-Square       1    4.2500   0.0393
Phi Coefficient                       0.5000
Contingency Coefficient               0.4472
Cramer's V                            0.5000

WARNING: 75% of the cells have expected counts less than 5. Chi-Square may not be a valid test.

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)        10
Left-sided Pr <= F          0.9961
Right-sided Pr >= F         0.0573

Table Probability (P)       0.0533
Two-sided Pr <= P           0.1070

Sample Size = 18
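Exact test: table probabilities for every table with the observed margins

Cell (1,1)    Cell (1,2)    Cell (2,1)    Cell (2,2)    Prob
12            0             0             6             0.0001
11            1             1             5             0.0039
10            2             2             4             0.0533
 9            3             3             3             0.2370
 8            4             4             2             0.4000
 7            5             5             1             0.2560
 6            6             6             0             0.0498

Probability of the observed table: $P = \frac{12!\,12!\,6!\,6!}{10!\,2!\,2!\,4!\,18!} = 0.0533$
• One-tailed p-value: $p = 0.0533 + 0.0039 + 0.0001 = 0.0573$
• Two-tailed p-value: $p = 0.0533 + 0.0039 + 0.0001 + 0.0498 = 0.1071$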
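The table probabilities above are hypergeometric probabilities with the margins held fixed. The following Python sketch (an illustration with a hypothetical helper `table_probabilities`, requiring Python 3.8+ for `math.comb`) enumerates them and forms the one- and two-tailed p-values the same way the slide does.

```python
from math import comb


def table_probabilities(a, b, c, d):
    """Probability of each 2x2 table sharing the margins of [[a, b], [c, d]],
    keyed by the count in cell (1,1)."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return {x: comb(row1, x) * comb(row2, col1 - x) / total
            for x in range(lowest, highest + 1)}


probs = table_probabilities(10, 2, 2, 4)
p_obs = probs[10]
# One-tailed: the observed table and all tables more extreme in the same direction.
upper_tail = sum(p for x, p in probs.items() if x >= 10)
# Two-tailed: every table no more probable than the observed one
# (the small tolerance guards against floating-point ties).
two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
print(round(p_obs, 4), round(upper_tail, 4), round(two_sided, 4))
```

The two-tailed value computed from unrounded probabilities is 0.1070, as in the SAS output; summing the rounded table entries gives 0.1071.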
$H_0$: the two variables are independent (homogeneous) vs. $H_1$: not $H_0$

> fisher.test(matrix(c(7, 3, 5, 6), 2, 2), alternative = "greater")

        Fisher's Exact Test for Count Data

data:  matrix(c(7, 3, 5, 6), 2, 2)
p-value = 0.2449
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 0.4512625       Inf
sample estimates:
odds ratio
  2.661251

> matrix(c(7, 3, 5, 6), 2, 2)
     [,1] [,2]
[1,]    7    5
[2,]    3    6
McNemar test: matched pairs

data one;
  input hus_resp $ wif_resp $ no;
  datalines;
yes yes 20
yes no 5
no yes 10
no no 10
;
run;
proc freq;
  tables hus_resp*wif_resp / agree;
  weight no;
run;

McNemar test: we do not reject $H_0$ (the approval rates of husband and wife are the same).
Kappa: the 95% confidence interval does not include 0, so we reject the null hypothesis $\kappa = 0$ at the 95% confidence level.
Interpreting kappa: $\kappa = 1$ is perfect agreement, $\kappa > 0.8$ is excellent agreement, $\kappa > 0.4$ is moderate agreement.
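The two quantities requested by the `agree` option can be sketched by hand. This Python example is an illustration only: it assumes the uncorrected McNemar statistic $(b-c)^2/(b+c)$ on the discordant pairs and Cohen's simple kappa, and it does not reproduce the kappa confidence interval that SAS reports.

```python
import math

# Paired responses: rows = husband (yes, no), columns = wife (yes, no).
pairs = [[20, 5], [10, 10]]
n = sum(map(sum, pairs))

# McNemar uses only the discordant pairs.
b, c = pairs[0][1], pairs[1][0]
mcnemar = (b - c) ** 2 / (b + c)
p_value = math.erfc(math.sqrt(mcnemar / 2))  # chi-square upper tail, 1 df

# Cohen's kappa: observed agreement versus agreement expected by chance.
p_observed = (pairs[0][0] + pairs[1][1]) / n
row_props = [sum(row) / n for row in pairs]
col_props = [sum(col) / n for col in zip(*pairs)]
p_chance = sum(r * k for r, k in zip(row_props, col_props))
kappa = (p_observed - p_chance) / (1 - p_chance)

print(round(mcnemar, 4), round(p_value, 4), round(kappa, 4))
```

The McNemar p-value is well above 0.05, consistent with not rejecting equal approval rates.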
11.6 Relative risk, odds ratio, and Mantel-Haenszel statistics
• Observational study
• Prospective study
• Retrospective study

Relative risk

Risk factor    Disease: present    Disease: absent    Total
Present        a                   b                  a + b
Absent         c                   d                  c + d
Total          a + c               b + d              n

• $RR = \frac{a/(a+b)}{c/(c+d)}$
• $\mathrm{s.e.}(\ln RR) = \sqrt{\frac{1}{a} + \frac{1}{c} - \frac{1}{a+b} - \frac{1}{c+d}}$
• CI for $\ln RR$: $\ln RR \pm z_{1-\alpha/2}\cdot \mathrm{s.e.}(\ln RR)$
• CI for $RR$: $e^{\ln RR \pm z_{1-\alpha/2}\cdot \mathrm{s.e.}(\ln RR)} = RR\cdot e^{\pm z_{1-\alpha/2}\cdot \mathrm{s.e.}(\ln RR)}$
Example 11.6.1

Table 11.6.2
Smoking    Disease progress: Yes    Disease progress: No    Total
Yes         26                       380                      406
No         178                      3386                     3564
Total      204                      3766                     3970

$RR = \frac{26/406}{178/3564} = \frac{0.064}{0.050} = 1.28$
Lower limit: $1.2822\cdot e^{-1.96\sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}} = 0.861$
Upper limit: $1.2822\cdot e^{+1.96\sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}} = 1.910$
The 95% CI includes 1, so the relative risk does not differ significantly from 1.

data preg;
  input smoke $ preg $ count;
  cards;
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
;
run;
proc freq order=data;
  weight count;
  tables smoke*preg / measures chisq;
run;
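The relative risk and its interval in Example 11.6.1 can be reproduced with a few lines of Python. This is a sketch under the assumption of the usual log-scale interval with $z = 1.96$; the function name `relative_risk_ci` is illustrative.

```python
import math


def relative_risk_ci(a, b, c, d, z=1.96):
    """Relative risk and log-scale confidence limits for the table [[a, b], [c, d]],
    where rows are exposed / unexposed and the first column is the event."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)


rr, lower, upper = relative_risk_ci(26, 380, 178, 3386)
print(round(rr, 4), round(lower, 3), round(upper, 3))
```

The limits 0.861 and 1.910 are obtained only when both $1/(a+b)$ and $1/(c+d)$ enter the variance with a minus sign.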
Odds ratio

               Sample
Risk factor    Case     Control    Total
Present        a        b          a + b
Absent         c        d          c + d
Total          a + c    b + d      n

• Odds $= p/(1-p)$
• Odds in the case (patient) group: $\frac{a/(a+c)}{c/(a+c)} = \frac{a}{c}$
• Odds in the control (normal) group: $\frac{b/(b+d)}{d/(b+d)} = \frac{b}{d}$
• $OR = \frac{a/c}{b/d} = \frac{ad}{bc}$
• $\mathrm{s.e.}(\ln OR) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$
• CI for $\ln OR$: $\ln OR \pm z_{1-\alpha/2}\cdot \mathrm{s.e.}(\ln OR)$
• CI for $OR$: $e^{\ln OR \pm z_{1-\alpha/2}\cdot \mathrm{s.e.}(\ln OR)} = OR\cdot e^{\pm z_{1-\alpha/2}\cdot \mathrm{s.e.}(\ln OR)}$
Example 11.6.2

Table 11.6.4
Smoking    Obesity: Yes    Obesity: No    Total
Yes         52              352            404
No          78             3486           3564
Total      130             3838           3968

$OR = \frac{52\cdot 3486}{352\cdot 78} = 6.60$
Lower limit: $6.6023\cdot e^{-1.96\sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}} = 4.571$
Upper limit: $6.6023\cdot e^{+1.96\sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}} = 9.536$

data obe;
  input smoke $ case $ count;
  cards;
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
;
run;
proc freq data=obe;
  weight count;
  tables smoke*case / measures;
run;
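A matching Python sketch for the odds ratio in Example 11.6.2, again assuming the log-scale interval with $z = 1.96$; `odds_ratio_ci` is an illustrative name.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc and log-scale confidence limits for the table [[a, b], [c, d]]."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, odds_ratio * math.exp(-z * se_log), odds_ratio * math.exp(z * se_log)


or_hat, lower, upper = odds_ratio_ci(52, 352, 78, 3486)
print(round(or_hat, 4), round(lower, 3), round(upper, 3))
```

Here the interval lies entirely above 1, unlike the relative-risk interval in Example 11.6.1.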
Mantel-Haenszel statistic
The confounding variable defines $k$ strata. For stratum $i$:

Risk factor    Case           Control        Total
Exposed        $a_i$          $b_i$          $a_i + b_i$
Not exposed    $c_i$          $d_i$          $c_i + d_i$
Total          $a_i + c_i$    $b_i + d_i$    $n_i$

Expected count in stratum $i$: $e_i = \frac{(a_i+b_i)(a_i+c_i)}{n_i}$
$v_i = \frac{(a_i+b_i)(c_i+d_i)(a_i+c_i)(b_i+d_i)}{n_i^2(n_i-1)}$
$\chi^2_{MH} = \frac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^2}{\sum_{i=1}^{k} v_i} \sim \chi^2_1$
Common odds ratio: $OR_{MH} = \frac{\sum_{i=1}^{k} (a_i d_i/n_i)}{\sum_{i=1}^{k} (b_i c_i/n_i)}$

Example 11.6.3

Age ≤ 55
Risk factor    OCAD patients    Controls    Total
Exposed        21               11          32
Unexposed      16                6          22
Total          37               17          54

Age ≥ 56
Risk factor    OCAD patients    Controls    Total
Exposed        50               14          64
Unexposed      18                6          24
Total          68               20          88

data one;
  input age risk case n;
  cards;
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
;
run;
proc freq;
  tables age*risk*case / measures cmh;
  weight n;
run;
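The Mantel-Haenszel formulas can be applied directly to the two age strata of Example 11.6.3. This Python sketch is an illustration of the formulas as written above (no continuity correction); the function name `mantel_haenszel` is hypothetical, and the slides themselves do not state the numerical results for this example.

```python
def mantel_haenszel(strata):
    """Return (chi-square, common odds ratio) for 2x2 strata given as (a, b, c, d)."""
    sum_a = sum_e = sum_v = numerator = denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        sum_e += (a + b) * (a + c) / n                                  # e_i
        sum_v += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))  # v_i
        numerator += a * d / n
        denominator += b * c / n
    return (sum_a - sum_e) ** 2 / sum_v, numerator / denominator


# Strata: age <= 55, then age >= 56; each tuple is (a, b, c, d).
ocad = [(21, 11, 16, 6), (50, 14, 18, 6)]
chi2_mh, or_mh = mantel_haenszel(ocad)
print(round(chi2_mh, 4), round(or_mh, 4))
```

The statistic is compared with a chi-square distribution with 1 degree of freedom (critical value $1.96^2 = 3.84$ at the 5% level).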
gt datalt-astable(cbind(c(2949850203)c(921286230642652)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(YN))
gt data
re
trt Y N
A 294 921
B 98 2862
C 50 3064
D 203 2652
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 81641 df = 3 p-value lt 22e-16
data reinput trt $ re $ count cardsA Y 294 A N 921 B Y 98 B N 2862C Y 50 C N 3064D Y 203 D N 2652proc freq data=reweight counttables trtremeasures chisqrun
bull 작은 기대도수 (small expected freq)기대치 5미만의 cell수가 전체 20를 넘지 않으며 최소기대치가 1이상이면 무관하다 (If min gt1 and cells lt5 are less than 20 then not a problem)
bull 2Ⅹ2 분할표 (table)nlt20 or 20ltnlt49 그리고 기대도수 5이하 일경우에는 -test를 하지 말라
-test is not valid if nlt20 or (20ltnlt49) and expected freq of one or more cells lt 5
2
2
bull2Ⅹ2 table
1205682 =233(131∙36minus52∙14)2
145∙88∙183∙50= 317391 gtgt 1962
Strong evidence to reject (HoSmoking and drinking are independent)
두번째분류기준
첫번째 분류기준
120783 120784 합계
120783 119886 119887 119886 + 119887
120784 119888 119889 119888 + 119889
합계 119886 + 119888 119887 + 119889 119899
SmokingDrinking
Yes No total
Yes 131 52 183
No 14 36 50
Total 145 88 233
②두 집단의 확률에 대한 비교
(Comparing two probabilities)
1 1 2
0 1 2 1 2
1 1 2
1 2
ˆ100 60 120 040
60 100 40 12004909
100 120
060 040295469 196 significant
4903 5091 4903 5091
100 120
ˆ ( )
(1 ) (1 )
a
n p n
p
Z
H p p H p p
p p pZ
p p p p
n n
e g
2p
2p
Yates adjustment (보정)
bull 120568corrected2 =
119899( 119886119889minus119887119888 minus05119899)2
(119886+119888)(119887+119889)(119886+119887)(119888+119889)
bull 1205682 =233( 131∙36minus52∙14 minus05∙233)2
145∙88∙183∙50= 299118
114 동질성 검정 (homogeneity test)
bull 동질성 검정 각 각의 모집단에서 독립적으로 뽑은 표본들의 분포가 서로 동질의 것인가
bull Homogeneity test Are two samples selected from one population
bull 독립성 검정 한 모집단에서 표본 추출 행과 열의 합계는 조절이 아니고 우연히 나타난다
bull Independent test selected from a population Marginal totals are randomly determined
bull 독립성 검정 vs 동질성 검정
bull Independent test vs homogeneity test
bull예제 1141
bull 가설 Patient groups with on-set age lt=18 and age gt 18 have same distributions of family history
gt datalt-astable(cbind(c(28194153)c(35384460)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(EarlyLater))
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 36216 df = 3 p-value = 03053
-gt Do not reject Ho
0H
Family History lt=18 gt 18 Total
A 28 35 63
B 19 38 57
C 41 44 85
D 53 60 113
합계 141 177 318
gt datare
trt Early LaterA 28 35B 19 38C 41 44D 53 60
동질성 검정과 모비율 검정
1198670 1199011 = 1199012 119907119904 119867119860 ∶ 1199011 ne 1199012 1198991 = 100 1199011 = 060 1198992 = 120 1199012 = 040
119911 = 1199011minus 1199012 minus( 1199011minus 1199012)0 119901(1minus 119901)
1198991+ 119901(1minus 119901)
1198992
119901 =060∙100+040∙120
100+120=108
220= 049091
119911 =060 minus 040
049091 ∙ 050909100 +
049091 ∙ 050909120
= 295468
1205682 =220 ∙ [60 ∙ 72 minus 40 ∙ 48]2
108 ∙ 112 ∙ 100 ∙ 120= 87302
-gt Reject Ho
표본특성
1 2 합계1 60 40 100
2 48 72 120
합계 108 112 220
data severe
input treat $ outcome $ count
cards
Test f 10
Test u 2
Control f 2
Control u 4
proc freq order=data
tables treatoutcome chisq nocol
weight count
run
Fisherrsquos Exact Test
SAS 시스템
FREQ 프로시저
treat outcome 교차표
treat outcome
빈도|백분율|
행 백분율|f |u | 총합-----------+--------+--------+
Test | 10 | 2 | 12
| 5556 | 1111 | 6667
| 8333 | 1667 |
-----------+--------+--------+
Control | 2 | 4 | 6
| 1111 | 2222 | 3333
| 3333 | 6667 |
-----------+--------+--------+
총합 12 6 18
6667 3333 10000
treat outcome 테이블에 대한 통계량
통계량 자유도 값 확률값----------------------------------------------------------카이제곱 1 45000 00339우도비 카이제곱 1 44629 00346연속성 수정 카이제곱 1 25313 01116Mantel-Haenszel 카이제곱 1 42500 00393파이 계수 05000분할 계수 04472크래머의 V 05000
경고 셀들의 75가 5보다 작은 기대도수를 가지고 있습니다카이제곱 검정은 올바르지 않을 수 있습니다
Fisher의 정확 검정----------------------------(11) 셀 빈도(F) 10하단측 p값 Pr lt= F 09961상단측 p값 Pr gt= F 00573
테이블 확률 (P) 00533양측 p값 Pr lt= P 01070
표본 크기 = 18
Exact Test
Table Cell
(11) (12) (21) (22) Prob
12 0 0 6 0001
11 1 1 5 0039
10 2 2 4 0533
9 3 3 3 2370
8 4 4 2 4000
7 5 5 1 2560
6 6 6 0 0498
=12 12 6 6
10 2 2 4 18
Table Probabilities
bull One-tailed p-value
bull Two-tailed p-value
00533 00039 00001 00573p
00533 00039 00001 00498 01071p
H0 두 변수는 서로 독립(동질)이다 vs H1 not H0
gt fishertest(matrix(c(7356)22)alternative=greater)
Fishers Exact Test for Count Data
data matrix(c(7 3 5 6) 2 2)
p-value = 02449
alternative hypothesis true odds ratio is greater than 1
95 percent confidence interval
04512625 Inf
sample estimates
odds ratio
2661251
gt matrix(c(7356)22)
[1] [2]
[1] 7 5
[2] 3 6
McNemar Test Matched pairs
data one
input hus_resp $ wif_resp $ no
datalines
yes yes 20
yes no 5
no yes 10
no no 10
run
proc freq
tables hus_respwif_resp agree
weight no
run
ldquoHo husband and wife 의 approval rates는 같다rdquo를 기각하지 못함
We do not reject ldquoHo approval rates of husband and wife are the samerdquo
신뢰구간이 0을 포함하지 않으므로 K=0 이라는 귀무가설을 95 신뢰수준에서 기각한다
Kappa=1 gtgt perfect agreement Kappa gt 08 gtgt excellent agreement Kappa gt 04 gtgt moderate agreement
CI does not include 0 -gt we reject the null hypo of K=0 by 95 confidence level
116 Relative risk odds ratio and Mantel-Haenszel statistics
bull 관찰연구 (observational study)
bull 전향적 연구 (prospective study)
bull 후향적 연구 (retrospective study)
상대위험도 (Relative risk)Disease
Risk O X
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull 119877119877 =119886
119886+119887119888
119888+119889
bull s e ln 119877119877 =1
119886+1
119888minus
1
119886+119887+
1
119888+119889
bull ln 119877119877 plusmn 1199111minus1205722 ∙ s e ln 119877119877
bull 119890ln 119877119877 plusmn119911
1minus1205722∙se ln 119877119877
= 119877119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119877119877
예제 1161 Relative risk odds ratio and Mantel-Haenszel statistics
119877119877 =26406
1783564=0064
0050= 128
12822 ∙ 119890^ minus196 ∙1
26+
1
178minus
1
406+
1
3564= 0861
12822 ∙ 119890^ 196 ∙1
26+
1
178minus
1
406+
1
3564= 1910
95 CI includes 1
Smoking Disease progress
Yes NoYes 26 380 406
No 178 3386 3564
204 3766 3970
표1162
data preg
input smoke $ preg $ count
cards
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
proc freq order=data
weight count
tables smokepregmeasures chisq
run
오즈비 (Odds Ratio)sample
Risk case control
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull Odds = p(1-p)
bull 환자집단 오즈 [119886(119886 + 119888)][119888(119886 + 119888)] = 119886119888
bull 정상집단 오즈 [119887(119887 + 119889)][119889(119887 + 119889)] = 119887119889
bull 119874119877 =119886
119888119887
119889
=119886119889
119887119888s e ln 119874119877 =
1
119886+1
119887+1
119888+1
119889
CI ln 119874119877 plusmn 1199111minus1205722 ∙ s e ln 119874119877
bull 119890ln 119874119877 plusmn119911
1minus1205722∙se ln OR
=
= 119874119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119874119877
예제 1162
119874119877 =52 ∙ 3486
352 ∙ 78= 660
66023 ∙ 119890^ minus196 ∙1
52+
1
352+1
78+
1
3486= 4571
66023 ∙ 119890^ 196 ∙1
52+
1
352+1
78+
1
3486= 9536
Smoking
Obesity
Yes No
Yes 52 352 404
No 78 3486 3564
130 3838 3968
표1164
data obe
input smoke $ case $ count
cards
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
proc freq data=obe
weight count
tables smokecasemeasures
run
Mantel-Haenszel 통계량
교란변수가 k개의 층
층 i의 기대도수 119890119894=119886119894+119887119894 119886119894+119888119894
119899119894
119907119894 =119886119894 + 119887119894 119888119894 + 119889119894 119886119894 + 119888119894 119887119894 + 119889119894
1198991198942(119899119894 minus 1)
1205941198721198672 =
( 119894=1119896 119886119894 minus 119894=1
119896 119890119894)2
119894=1119896 119907119894
~12059412 119874119877119872119867 =
119894=1119896 (119886119894119889119894119899119894)
119894=1119896 (119887119894119888119894119899119894)
공통오즈비
Risk
Strata =i
case control
Exposed 119886119894 119887119894 119886119894 + 119887119894
Not 119888119894 119889119894 119888119894 + 119889119894
119886119894 + 119888119894 119887119894 + 119889119894 119899119894
예제 1163
Age lt= 55
Risk OCAD patients control Total
Exposed 21 11 32
Unexposed 16 6 22
합계 37 17 54
Age gt= 56
Risk OCAD 환자 정상 합계
Exposed 50 14 64
Unexposed 18 6 24
68 20 88
data one
input age risk case n
cards
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
run
proc freq
tables ageriskcasemeasures CMH
weight n
run
bull 작은 기대도수 (small expected freq)기대치 5미만의 cell수가 전체 20를 넘지 않으며 최소기대치가 1이상이면 무관하다 (If min gt1 and cells lt5 are less than 20 then not a problem)
bull 2Ⅹ2 분할표 (table)nlt20 or 20ltnlt49 그리고 기대도수 5이하 일경우에는 -test를 하지 말라
-test is not valid if nlt20 or (20ltnlt49) and expected freq of one or more cells lt 5
2
2
bull2Ⅹ2 table
1205682 =233(131∙36minus52∙14)2
145∙88∙183∙50= 317391 gtgt 1962
Strong evidence to reject (HoSmoking and drinking are independent)
두번째분류기준
첫번째 분류기준
120783 120784 합계
120783 119886 119887 119886 + 119887
120784 119888 119889 119888 + 119889
합계 119886 + 119888 119887 + 119889 119899
SmokingDrinking
Yes No total
Yes 131 52 183
No 14 36 50
Total 145 88 233
②두 집단의 확률에 대한 비교
(Comparing two probabilities)
1 1 2
0 1 2 1 2
1 1 2
1 2
ˆ100 60 120 040
60 100 40 12004909
100 120
060 040295469 196 significant
4903 5091 4903 5091
100 120
ˆ ( )
(1 ) (1 )
a
n p n
p
Z
H p p H p p
p p pZ
p p p p
n n
e g
2p
2p
Yates adjustment (보정)
bull 120568corrected2 =
119899( 119886119889minus119887119888 minus05119899)2
(119886+119888)(119887+119889)(119886+119887)(119888+119889)
bull 1205682 =233( 131∙36minus52∙14 minus05∙233)2
145∙88∙183∙50= 299118
114 동질성 검정 (homogeneity test)
bull 동질성 검정 각 각의 모집단에서 독립적으로 뽑은 표본들의 분포가 서로 동질의 것인가
bull Homogeneity test Are two samples selected from one population
bull 독립성 검정 한 모집단에서 표본 추출 행과 열의 합계는 조절이 아니고 우연히 나타난다
bull Independent test selected from a population Marginal totals are randomly determined
bull 독립성 검정 vs 동질성 검정
bull Independent test vs homogeneity test
bull예제 1141
bull 가설 Patient groups with on-set age lt=18 and age gt 18 have same distributions of family history
gt datalt-astable(cbind(c(28194153)c(35384460)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(EarlyLater))
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 36216 df = 3 p-value = 03053
-gt Do not reject Ho
0H
Family History lt=18 gt 18 Total
A 28 35 63
B 19 38 57
C 41 44 85
D 53 60 113
합계 141 177 318
gt datare
trt Early LaterA 28 35B 19 38C 41 44D 53 60
동질성 검정과 모비율 검정
1198670 1199011 = 1199012 119907119904 119867119860 ∶ 1199011 ne 1199012 1198991 = 100 1199011 = 060 1198992 = 120 1199012 = 040
119911 = 1199011minus 1199012 minus( 1199011minus 1199012)0 119901(1minus 119901)
1198991+ 119901(1minus 119901)
1198992
119901 =060∙100+040∙120
100+120=108
220= 049091
119911 =060 minus 040
049091 ∙ 050909100 +
049091 ∙ 050909120
= 295468
1205682 =220 ∙ [60 ∙ 72 minus 40 ∙ 48]2
108 ∙ 112 ∙ 100 ∙ 120= 87302
-gt Reject Ho
표본특성
1 2 합계1 60 40 100
2 48 72 120
합계 108 112 220
data severe
input treat $ outcome $ count
cards
Test f 10
Test u 2
Control f 2
Control u 4
proc freq order=data
tables treatoutcome chisq nocol
weight count
run
Fisherrsquos Exact Test
SAS 시스템
FREQ 프로시저
treat outcome 교차표
treat outcome
빈도|백분율|
행 백분율|f |u | 총합-----------+--------+--------+
Test | 10 | 2 | 12
| 5556 | 1111 | 6667
| 8333 | 1667 |
-----------+--------+--------+
Control | 2 | 4 | 6
| 1111 | 2222 | 3333
| 3333 | 6667 |
-----------+--------+--------+
총합 12 6 18
6667 3333 10000
treat outcome 테이블에 대한 통계량
통계량 자유도 값 확률값----------------------------------------------------------카이제곱 1 45000 00339우도비 카이제곱 1 44629 00346연속성 수정 카이제곱 1 25313 01116Mantel-Haenszel 카이제곱 1 42500 00393파이 계수 05000분할 계수 04472크래머의 V 05000
경고 셀들의 75가 5보다 작은 기대도수를 가지고 있습니다카이제곱 검정은 올바르지 않을 수 있습니다
Fisher의 정확 검정----------------------------(11) 셀 빈도(F) 10하단측 p값 Pr lt= F 09961상단측 p값 Pr gt= F 00573
테이블 확률 (P) 00533양측 p값 Pr lt= P 01070
표본 크기 = 18
Exact Test
Table Cell
(11) (12) (21) (22) Prob
12 0 0 6 0001
11 1 1 5 0039
10 2 2 4 0533
9 3 3 3 2370
8 4 4 2 4000
7 5 5 1 2560
6 6 6 0 0498
=12 12 6 6
10 2 2 4 18
Table Probabilities
bull One-tailed p-value
bull Two-tailed p-value
00533 00039 00001 00573p
00533 00039 00001 00498 01071p
H0 두 변수는 서로 독립(동질)이다 vs H1 not H0
gt fishertest(matrix(c(7356)22)alternative=greater)
Fishers Exact Test for Count Data
data matrix(c(7 3 5 6) 2 2)
p-value = 02449
alternative hypothesis true odds ratio is greater than 1
95 percent confidence interval
04512625 Inf
sample estimates
odds ratio
2661251
gt matrix(c(7356)22)
[1] [2]
[1] 7 5
[2] 3 6
McNemar Test Matched pairs
data one
input hus_resp $ wif_resp $ no
datalines
yes yes 20
yes no 5
no yes 10
no no 10
run
proc freq
tables hus_respwif_resp agree
weight no
run
ldquoHo husband and wife 의 approval rates는 같다rdquo를 기각하지 못함
We do not reject ldquoHo approval rates of husband and wife are the samerdquo
신뢰구간이 0을 포함하지 않으므로 K=0 이라는 귀무가설을 95 신뢰수준에서 기각한다
Kappa=1 gtgt perfect agreement Kappa gt 08 gtgt excellent agreement Kappa gt 04 gtgt moderate agreement
CI does not include 0 -gt we reject the null hypo of K=0 by 95 confidence level
116 Relative risk odds ratio and Mantel-Haenszel statistics
bull 관찰연구 (observational study)
bull 전향적 연구 (prospective study)
bull 후향적 연구 (retrospective study)
상대위험도 (Relative risk)Disease
Risk O X
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull 119877119877 =119886
119886+119887119888
119888+119889
bull s e ln 119877119877 =1
119886+1
119888minus
1
119886+119887+
1
119888+119889
bull ln 119877119877 plusmn 1199111minus1205722 ∙ s e ln 119877119877
bull 119890ln 119877119877 plusmn119911
1minus1205722∙se ln 119877119877
= 119877119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119877119877
예제 1161 Relative risk odds ratio and Mantel-Haenszel statistics
119877119877 =26406
1783564=0064
0050= 128
12822 ∙ 119890^ minus196 ∙1
26+
1
178minus
1
406+
1
3564= 0861
12822 ∙ 119890^ 196 ∙1
26+
1
178minus
1
406+
1
3564= 1910
95 CI includes 1
Smoking Disease progress
Yes NoYes 26 380 406
No 178 3386 3564
204 3766 3970
표1162
data preg
input smoke $ preg $ count
cards
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
proc freq order=data
weight count
tables smokepregmeasures chisq
run
오즈비 (Odds Ratio)sample
Risk case control
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull Odds = p(1-p)
bull 환자집단 오즈 [119886(119886 + 119888)][119888(119886 + 119888)] = 119886119888
bull 정상집단 오즈 [119887(119887 + 119889)][119889(119887 + 119889)] = 119887119889
bull 119874119877 =119886
119888119887
119889
=119886119889
119887119888s e ln 119874119877 =
1
119886+1
119887+1
119888+1
119889
CI ln 119874119877 plusmn 1199111minus1205722 ∙ s e ln 119874119877
bull 119890ln 119874119877 plusmn119911
1minus1205722∙se ln OR
=
= 119874119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119874119877
예제 1162
119874119877 =52 ∙ 3486
352 ∙ 78= 660
66023 ∙ 119890^ minus196 ∙1
52+
1
352+1
78+
1
3486= 4571
66023 ∙ 119890^ 196 ∙1
52+
1
352+1
78+
1
3486= 9536
Smoking
Obesity
Yes No
Yes 52 352 404
No 78 3486 3564
130 3838 3968
표1164
data obe
input smoke $ case $ count
cards
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
proc freq data=obe
weight count
tables smokecasemeasures
run
Mantel-Haenszel 통계량
교란변수가 k개의 층
층 i의 기대도수 119890119894=119886119894+119887119894 119886119894+119888119894
119899119894
119907119894 =119886119894 + 119887119894 119888119894 + 119889119894 119886119894 + 119888119894 119887119894 + 119889119894
1198991198942(119899119894 minus 1)
1205941198721198672 =
( 119894=1119896 119886119894 minus 119894=1
119896 119890119894)2
119894=1119896 119907119894
~12059412 119874119877119872119867 =
119894=1119896 (119886119894119889119894119899119894)
119894=1119896 (119887119894119888119894119899119894)
공통오즈비
Risk
Strata =i
case control
Exposed 119886119894 119887119894 119886119894 + 119887119894
Not 119888119894 119889119894 119888119894 + 119889119894
119886119894 + 119888119894 119887119894 + 119889119894 119899119894
예제 1163
Age lt= 55
Risk OCAD patients control Total
Exposed 21 11 32
Unexposed 16 6 22
합계 37 17 54
Age gt= 56
Risk OCAD 환자 정상 합계
Exposed 50 14 64
Unexposed 18 6 24
68 20 88
data one
input age risk case n
cards
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
run
proc freq
tables ageriskcasemeasures CMH
weight n
run
2
bull2Ⅹ2 table
1205682 =233(131∙36minus52∙14)2
145∙88∙183∙50= 317391 gtgt 1962
Strong evidence to reject (HoSmoking and drinking are independent)
두번째분류기준
첫번째 분류기준
120783 120784 합계
120783 119886 119887 119886 + 119887
120784 119888 119889 119888 + 119889
합계 119886 + 119888 119887 + 119889 119899
SmokingDrinking
Yes No total
Yes 131 52 183
No 14 36 50
Total 145 88 233
②두 집단의 확률에 대한 비교
(Comparing two probabilities)
1 1 2
0 1 2 1 2
1 1 2
1 2
ˆ100 60 120 040
60 100 40 12004909
100 120
060 040295469 196 significant
4903 5091 4903 5091
100 120
ˆ ( )
(1 ) (1 )
a
n p n
p
Z
H p p H p p
p p pZ
p p p p
n n
e g
2p
2p
Yates adjustment (보정)
bull 120568corrected2 =
119899( 119886119889minus119887119888 minus05119899)2
(119886+119888)(119887+119889)(119886+119887)(119888+119889)
bull 1205682 =233( 131∙36minus52∙14 minus05∙233)2
145∙88∙183∙50= 299118
114 동질성 검정 (homogeneity test)
bull 동질성 검정 각 각의 모집단에서 독립적으로 뽑은 표본들의 분포가 서로 동질의 것인가
bull Homogeneity test Are two samples selected from one population
bull 독립성 검정 한 모집단에서 표본 추출 행과 열의 합계는 조절이 아니고 우연히 나타난다
bull Independent test selected from a population Marginal totals are randomly determined
bull 독립성 검정 vs 동질성 검정
bull Independent test vs homogeneity test
bull예제 1141
bull 가설 Patient groups with on-set age lt=18 and age gt 18 have same distributions of family history
gt datalt-astable(cbind(c(28194153)c(35384460)))
gt dimnames(data)lt-list(trt=c(ABCD)re=c(EarlyLater))
gt chisqtest(data)
Pearsons Chi-squared test
data data
X-squared = 36216 df = 3 p-value = 03053
-gt Do not reject Ho
0H
Family History lt=18 gt 18 Total
A 28 35 63
B 19 38 57
C 41 44 85
D 53 60 113
합계 141 177 318
gt datare
trt Early LaterA 28 35B 19 38C 41 44D 53 60
동질성 검정과 모비율 검정
1198670 1199011 = 1199012 119907119904 119867119860 ∶ 1199011 ne 1199012 1198991 = 100 1199011 = 060 1198992 = 120 1199012 = 040
119911 = 1199011minus 1199012 minus( 1199011minus 1199012)0 119901(1minus 119901)
1198991+ 119901(1minus 119901)
1198992
119901 =060∙100+040∙120
100+120=108
220= 049091
119911 =060 minus 040
049091 ∙ 050909100 +
049091 ∙ 050909120
= 295468
1205682 =220 ∙ [60 ∙ 72 minus 40 ∙ 48]2
108 ∙ 112 ∙ 100 ∙ 120= 87302
-gt Reject Ho
표본특성
1 2 합계1 60 40 100
2 48 72 120
합계 108 112 220
data severe
input treat $ outcome $ count
cards
Test f 10
Test u 2
Control f 2
Control u 4
proc freq order=data
tables treatoutcome chisq nocol
weight count
run
Fisherrsquos Exact Test
SAS 시스템
FREQ 프로시저
treat outcome 교차표
treat outcome
빈도|백분율|
행 백분율|f |u | 총합-----------+--------+--------+
Test | 10 | 2 | 12
| 5556 | 1111 | 6667
| 8333 | 1667 |
-----------+--------+--------+
Control | 2 | 4 | 6
| 1111 | 2222 | 3333
| 3333 | 6667 |
-----------+--------+--------+
총합 12 6 18
6667 3333 10000
treat outcome 테이블에 대한 통계량
통계량 자유도 값 확률값----------------------------------------------------------카이제곱 1 45000 00339우도비 카이제곱 1 44629 00346연속성 수정 카이제곱 1 25313 01116Mantel-Haenszel 카이제곱 1 42500 00393파이 계수 05000분할 계수 04472크래머의 V 05000
경고 셀들의 75가 5보다 작은 기대도수를 가지고 있습니다카이제곱 검정은 올바르지 않을 수 있습니다
Fisher의 정확 검정----------------------------(11) 셀 빈도(F) 10하단측 p값 Pr lt= F 09961상단측 p값 Pr gt= F 00573
테이블 확률 (P) 00533양측 p값 Pr lt= P 01070
표본 크기 = 18
Exact Test
Table Cell
(11) (12) (21) (22) Prob
12 0 0 6 0001
11 1 1 5 0039
10 2 2 4 0533
9 3 3 3 2370
8 4 4 2 4000
7 5 5 1 2560
6 6 6 0 0498
=12 12 6 6
10 2 2 4 18
Table Probabilities
bull One-tailed p-value
bull Two-tailed p-value
00533 00039 00001 00573p
00533 00039 00001 00498 01071p
H0 두 변수는 서로 독립(동질)이다 vs H1 not H0
gt fishertest(matrix(c(7356)22)alternative=greater)
Fishers Exact Test for Count Data
data matrix(c(7 3 5 6) 2 2)
p-value = 02449
alternative hypothesis true odds ratio is greater than 1
95 percent confidence interval
04512625 Inf
sample estimates
odds ratio
2661251
gt matrix(c(7356)22)
[1] [2]
[1] 7 5
[2] 3 6
McNemar Test Matched pairs
data one
input hus_resp $ wif_resp $ no
datalines
yes yes 20
yes no 5
no yes 10
no no 10
run
proc freq
tables hus_respwif_resp agree
weight no
run
ldquoHo husband and wife 의 approval rates는 같다rdquo를 기각하지 못함
We do not reject ldquoHo approval rates of husband and wife are the samerdquo
신뢰구간이 0을 포함하지 않으므로 K=0 이라는 귀무가설을 95 신뢰수준에서 기각한다
Kappa=1 gtgt perfect agreement Kappa gt 08 gtgt excellent agreement Kappa gt 04 gtgt moderate agreement
CI does not include 0 -gt we reject the null hypo of K=0 by 95 confidence level
116 Relative risk odds ratio and Mantel-Haenszel statistics
bull 관찰연구 (observational study)
bull 전향적 연구 (prospective study)
bull 후향적 연구 (retrospective study)
상대위험도 (Relative risk)Disease
Risk O X
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull 119877119877 =119886
119886+119887119888
119888+119889
bull s e ln 119877119877 =1
119886+1
119888minus
1
119886+119887+
1
119888+119889
bull ln 119877119877 plusmn 1199111minus1205722 ∙ s e ln 119877119877
bull 119890ln 119877119877 plusmn119911
1minus1205722∙se ln 119877119877
= 119877119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119877119877
예제 1161 Relative risk odds ratio and Mantel-Haenszel statistics
119877119877 =26406
1783564=0064
0050= 128
12822 ∙ 119890^ minus196 ∙1
26+
1
178minus
1
406+
1
3564= 0861
12822 ∙ 119890^ 196 ∙1
26+
1
178minus
1
406+
1
3564= 1910
95 CI includes 1
Smoking Disease progress
Yes NoYes 26 380 406
No 178 3386 3564
204 3766 3970
표1162
data preg
input smoke $ preg $ count
cards
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
proc freq order=data
weight count
tables smokepregmeasures chisq
run
오즈비 (Odds Ratio)sample
Risk case control
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull Odds = p(1-p)
bull 환자집단 오즈 [119886(119886 + 119888)][119888(119886 + 119888)] = 119886119888
bull 정상집단 오즈 [119887(119887 + 119889)][119889(119887 + 119889)] = 119887119889
bull 119874119877 =119886
119888119887
119889
=119886119889
119887119888s e ln 119874119877 =
1
119886+1
119887+1
119888+1
119889
CI ln 119874119877 plusmn 1199111minus1205722 ∙ s e ln 119874119877
bull 119890ln 119874119877 plusmn119911
1minus1205722∙se ln OR
=
= 119874119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119874119877
예제 1162
119874119877 =52 ∙ 3486
352 ∙ 78= 660
66023 ∙ 119890^ minus196 ∙1
52+
1
352+1
78+
1
3486= 4571
66023 ∙ 119890^ 196 ∙1
52+
1
352+1
78+
1
3486= 9536
Smoking
Obesity
Yes No
Yes 52 352 404
No 78 3486 3564
130 3838 3968
표1164
data obe
input smoke $ case $ count
cards
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
proc freq data=obe
weight count
tables smokecasemeasures
run
Mantel-Haenszel 통계량
교란변수가 k개의 층
층 i의 기대도수 119890119894=119886119894+119887119894 119886119894+119888119894
119899119894
119907119894 =119886119894 + 119887119894 119888119894 + 119889119894 119886119894 + 119888119894 119887119894 + 119889119894
1198991198942(119899119894 minus 1)
1205941198721198672 =
( 119894=1119896 119886119894 minus 119894=1
119896 119890119894)2
119894=1119896 119907119894
~12059412 119874119877119872119867 =
119894=1119896 (119886119894119889119894119899119894)
119894=1119896 (119887119894119888119894119899119894)
공통오즈비
Risk
Strata =i
case control
Exposed 119886119894 119887119894 119886119894 + 119887119894
Not 119888119894 119889119894 119888119894 + 119889119894
119886119894 + 119888119894 119887119894 + 119889119894 119899119894
Example 11.6.3
Age <= 55
Risk         OCAD patients   Control   Total
Exposed      21              11        32
Unexposed    16              6         22
Total        37              17        54
Age >= 56
Risk         OCAD patients   Control   Total
Exposed      50              14        64
Unexposed    18              6         24
Total        68              20        88
data one;
input age risk case n;
cards;
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
;
run;
proc freq;
tables age*risk*case / measures cmh;
weight n;
run;
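The Mantel-Haenszel quantities for Example 11.6.3 can also be computed directly from the per-stratum formulas for e_i, v_i and the common odds ratio. This is a hedged sketch in Python; the function name `mantel_haenszel` and the tuple layout are assumptions made for illustration, and no continuity correction is applied.

```python
def mantel_haenszel(strata):
    """strata: list of (a, b, c, d) tuples, one 2x2 table per stratum.

    Returns (chi2_mh, or_mh): the Mantel-Haenszel chi-square statistic
    (1 degree of freedom, no continuity correction) and the common odds ratio.
    """
    sum_a = sum_e = sum_v = 0.0
    numerator = denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        # Expected count e_i of the (exposed, case) cell under no association
        sum_e += (a + b) * (a + c) / n
        # Hypergeometric variance v_i of that cell
        sum_v += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
        numerator += a * d / n
        denominator += b * c / n
    chi2_mh = (sum_a - sum_e) ** 2 / sum_v
    return chi2_mh, numerator / denominator

# Example 11.6.3: (exposed cases, exposed controls, unexposed cases, unexposed controls)
strata = [(21, 11, 16, 6),   # age <= 55
          (50, 14, 18, 6)]   # age >= 56
chi2_mh, or_mh = mantel_haenszel(strata)
print(f"chi2_MH = {chi2_mh:.4f}, OR_MH = {or_mh:.4f}")
```

The statistic is compared with a chi-square distribution with 1 degree of freedom, whose 5% critical value is 3.841.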
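② Comparing two probabilities
$H_0: p_1 = p_2$ vs $H_a: p_1 \ne p_2$
$\hat{p} = \dfrac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$
$Z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n_1} + \dfrac{\hat{p}(1 - \hat{p})}{n_2}}}$
e.g. $n_1 = 100$, $\hat{p}_1 = 0.60$, $n_2 = 120$, $\hat{p}_2 = 0.40$:
$\hat{p} = \dfrac{0.60 \cdot 100 + 0.40 \cdot 120}{100 + 120} = 0.4909$
$Z = \dfrac{0.60 - 0.40}{\sqrt{\dfrac{0.4909 \cdot 0.5091}{100} + \dfrac{0.4909 \cdot 0.5091}{120}}} = 2.95469 > 1.96$, significant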
Yates adjustment (continuity correction)
• $\chi^2_{corrected} = \dfrac{n \left( |ad - bc| - 0.5n \right)^2}{(a+c)(b+d)(a+b)(c+d)}$
• $\chi^2 = \dfrac{233 \left( |131 \cdot 36 - 52 \cdot 14| - 0.5 \cdot 233 \right)^2}{145 \cdot 88 \cdot 183 \cdot 50} = 29.9118$
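The effect of the Yates correction is easy to see numerically. The sketch below, in Python, evaluates the corrected and uncorrected 2x2 statistics; the function names are illustrative assumptions, and the cell layout a = 131, b = 52, c = 14, d = 36 is inferred from the products and marginal totals (145, 88, 183, 50) in the worked formula.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

def yates_chi2_2x2(a, b, c, d):
    """Chi-square with the Yates continuity correction (subtract n/2 from |ad - bc|)."""
    n = a + b + c + d
    return n * (abs(a * d - b * c) - 0.5 * n) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

# Cell counts inferred from the worked formula (an assumption about the layout)
chi2_plain = pearson_chi2_2x2(131, 52, 14, 36)
chi2_yates = yates_chi2_2x2(131, 52, 14, 36)
print(f"uncorrected = {chi2_plain:.4f}, Yates-corrected = {chi2_yates:.4f}")
```

The corrected statistic is always the smaller of the two, which makes the test more conservative.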
11.4 Homogeneity test
• Homogeneity test: are samples drawn independently from separate populations distributed alike, as if they had been selected from one population?
• Independence test: a single sample is drawn from one population; the row and column (marginal) totals are not fixed in advance but are randomly determined.
• Independence test vs homogeneity test
• Example 11.4.1
• Hypothesis H0: patient groups with onset age <= 18 and onset age > 18 have the same distribution of family history

Family History   <=18   >18    Total
A                28     35     63
B                19     38     57
C                41     44     85
D                53     60     113
Total            141    177    318

> data <- as.table(cbind(c(28,19,41,53), c(35,38,44,60)))
> dimnames(data) <- list(trt=c("A","B","C","D"), re=c("Early","Later"))
> data
   re
trt Early Later
  A    28    35
  B    19    38
  C    41    44
  D    53    60
> chisq.test(data)
        Pearson's Chi-squared test
data:  data
X-squared = 3.6216, df = 3, p-value = 0.3053
-> Do not reject H0
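To show what `chisq.test` does for Example 11.4.1, the statistic can be rebuilt from expected counts E_ij = (row total)(column total)/n. This is a minimal sketch in Python using only the standard library; the function names are assumptions, and the p-value helper uses a closed form that holds only for 3 degrees of freedom.

```python
import math

def chi2_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Family history (A-D) by onset age (<=18, >18)
table = [[28, 35], [19, 38], [41, 44], [53, 60]]
stat, df = chi2_statistic(table)
p_value = chi2_sf_df3(stat)  # valid here only because df == 3
print(f"X-squared = {stat:.4f}, df = {df}, p-value = {p_value:.4f}")
```

The result should match the R output above (X-squared = 3.6216, df = 3, p-value = 0.3053) up to rounding.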
Homogeneity test and test of population proportions
$H_0: p_1 = p_2$ vs $H_A: p_1 \ne p_2$, with $n_1 = 100$, $\hat{p}_1 = 0.60$, $n_2 = 120$, $\hat{p}_2 = 0.40$
$z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n_1} + \dfrac{\hat{p}(1 - \hat{p})}{n_2}}}$
$\hat{p} = \dfrac{0.60 \cdot 100 + 0.40 \cdot 120}{100 + 120} = \dfrac{108}{220} = 0.49091$
$z = \dfrac{0.60 - 0.40}{\sqrt{\dfrac{0.49091 \cdot 0.50909}{100} + \dfrac{0.49091 \cdot 0.50909}{120}}} = 2.95468$
$\chi^2 = \dfrac{220 \cdot [60 \cdot 72 - 40 \cdot 48]^2}{108 \cdot 112 \cdot 100 \cdot 120} = 8.7302 \; (= z^2)$
-> Reject H0

             Characteristic
Sample       1       2       Total
1            60      40      100
2            48      72      120
Total        108     112     220
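The link between the two tests is that, for a 2x2 table, the pooled two-proportion z statistic squared equals the uncorrected chi-square statistic. The following Python sketch checks this on the example data; the function names are illustrative assumptions.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2, given successes x and sample sizes n."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def chi2_2x2(a, b, c, d):
    """Uncorrected chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

# Sample 1: 60 of 100; sample 2: 48 of 120
z = two_proportion_z(60, 100, 48, 120)
chi2 = chi2_2x2(60, 40, 48, 72)
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.5f}, z^2 = {z * z:.4f}, chi2 = {chi2:.4f}, p = {p_two_sided:.4f}")
```

Because z squared and chi-square coincide, the two-sided z test and the 1-df chi-square test give the same p-value.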
data severe;
input treat $ outcome $ count;
cards;
Test f 10
Test u 2
Control f 2
Control u 4
;
proc freq order=data;
tables treat*outcome / chisq nocol;
weight count;
run;
Fisher's Exact Test
The SAS System
The FREQ Procedure
Table of treat by outcome
treat      outcome
Frequency|
Percent  |
Row Pct  |f       |u       |  Total
---------+--------+--------+
Test     |     10 |      2 |     12
         |  55.56 |  11.11 |  66.67
         |  83.33 |  16.67 |
---------+--------+--------+
Control  |      2 |      4 |      6
         |  11.11 |  22.22 |  33.33
         |  33.33 |  66.67 |
---------+--------+--------+
Total          12        6       18
            66.67    33.33   100.00

Statistics for Table of treat by outcome
Statistic                      DF      Value     Prob
------------------------------------------------------
Chi-Square                      1     4.5000   0.0339
Likelihood Ratio Chi-Square     1     4.4629   0.0346
Continuity Adj. Chi-Square      1     2.5313   0.1116
Mantel-Haenszel Chi-Square      1     4.2500   0.0393
Phi Coefficient                       0.5000
Contingency Coefficient               0.4472
Cramer's V                            0.5000
WARNING: 75% of the cells have expected counts less than 5. Chi-Square may not be a valid test.

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)        10
Left-sided Pr <= F          0.9961
Right-sided Pr >= F         0.0573

Table Probability (P)       0.0533
Two-sided Pr <= P           0.1070

Sample Size = 18
Exact Test
            Table Cell
(1,1)   (1,2)   (2,1)   (2,2)   Prob
12      0       0       6       .0001
11      1       1       5       .0039
10      2       2       4       .0533
9       3       3       3       .2370
8       4       4       2       .4000
7       5       5       1       .2560
6       6       6       0       .0498
Probability of the observed table: $P = \dfrac{12!\,12!\,6!\,6!}{10!\,2!\,2!\,4!\,18!} = 0.0533$
Table Probabilities
• One-tailed p-value: $p = 0.0533 + 0.0039 + 0.0001 = 0.0573$
• Two-tailed p-value: $p = 0.0533 + 0.0039 + 0.0001 + 0.0498 = 0.1071$
H0: the two variables are independent (homogeneous) vs H1: not H0
> fisher.test(matrix(c(7,3,5,6),2,2), alternative="greater")
        Fisher's Exact Test for Count Data
data:  matrix(c(7, 3, 5, 6), 2, 2)
p-value = 0.2449
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 0.4512625       Inf
sample estimates:
odds ratio
  2.661251
> matrix(c(7,3,5,6),2,2)
     [,1] [,2]
[1,]    7    5
[2,]    3    6
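The exact p-values come from the hypergeometric probabilities of all 2x2 tables that share the observed margins. As a hedged illustration in Python (the function name `fisher_exact_2x2` is an assumption, and the two-sided rule shown is the "sum all tables no more likely than the observed one" convention), the SAS and R results can be reproduced as follows.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Exact p-values for the 2x2 table [[a, b], [c, d]] with all margins fixed.

    Returns (p_observed, p_greater, p_two_sided). The two-sided value sums the
    probabilities of every table that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Number of arrangements for each possible value x of the (1,1) cell
    ways = {x: comb(row1, x) * comb(row2, col1 - x) for x in range(lowest, highest + 1)}
    observed = ways[a]
    p_observed = observed / total
    p_greater = sum(w for x, w in ways.items() if x >= a) / total
    p_two_sided = sum(w for w in ways.values() if w <= observed) / total
    return p_observed, p_greater, p_two_sided

# SAS example: Test (10 f, 2 u), Control (2 f, 4 u)
p_obs, p_right, p_two = fisher_exact_2x2(10, 2, 2, 4)
print(f"table prob = {p_obs:.4f}, right-sided = {p_right:.4f}, two-sided = {p_two:.4f}")

# R example: matrix(c(7,3,5,6),2,2) is filled by column, giving rows (7, 5) and (3, 6)
_, p_r_greater, _ = fisher_exact_2x2(7, 5, 3, 6)
print(f"one-sided (greater) = {p_r_greater:.4f}")
```

Comparing integer counts rather than floating-point probabilities avoids tie-breaking problems when two tables are equally likely.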
McNemar Test: Matched pairs
data one;
input hus_resp $ wif_resp $ no;
datalines;
yes yes 20
yes no 5
no yes 10
no no 10
;
run;
proc freq;
tables hus_resp*wif_resp / agree;
weight no;
run;
We do not reject "H0: the approval rates of husband and wife are the same".
Kappa = 1: perfect agreement; Kappa > 0.8: excellent agreement; Kappa > 0.4: moderate agreement.
The 95% confidence interval for kappa does not include 0, so we reject the null hypothesis K = 0 at the 95% confidence level.
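The two quantities reported by the AGREE option can be recomputed from the 2x2 table of paired responses. The sketch below is Python with assumed function names; McNemar's statistic uses only the discordant pairs (without continuity correction), and kappa compares observed with chance-expected agreement. The confidence interval for kappa is not computed here.

```python
import math

def mcnemar(b, c):
    """McNemar chi-square (1 df, no continuity correction) from the discordant counts."""
    stat = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p_value

def cohen_kappa(table):
    """Cohen's kappa for a square agreement table (rows: rater 1, columns: rater 2)."""
    n = sum(sum(row) for row in table)
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Rows: husband yes/no; columns: wife yes/no
table = [[20, 5], [10, 10]]
stat, p_value = mcnemar(table[0][1], table[1][0])
kappa = cohen_kappa(table)
print(f"McNemar = {stat:.4f}, p = {p_value:.4f}, kappa = {kappa:.4f}")
```

A p-value above 0.05 for McNemar's statistic is consistent with not rejecting equal approval rates.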
11.6 Relative risk, odds ratio and Mantel-Haenszel statistics
• Observational study
• Prospective study
• Retrospective study
Relative risk

                  Disease
Risk           O        X        Total
O (present)    a        b        a + b
X (absent)     c        d        c + d
Total          a + c    b + d    n

• $RR = \dfrac{a/(a+b)}{c/(c+d)}$
• $\mathrm{s.e.}(\ln RR) = \sqrt{\dfrac{1}{a} + \dfrac{1}{c} - \dfrac{1}{a+b} - \dfrac{1}{c+d}}$
• CI for $\ln RR$: $\ln RR \pm z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln RR)$
• CI for $RR$: $e^{\ln RR \pm z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln RR)} = RR \cdot e^{\pm z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln RR)}$
Example 11.6.1
$RR = \dfrac{26/406}{178/3564} = \dfrac{0.064}{0.050} = 1.28$
Lower 95% limit: $1.2822 \cdot e^{-1.96 \cdot \sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}} = 0.861$
Upper 95% limit: $1.2822 \cdot e^{1.96 \cdot \sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}} = 1.910$
The 95% CI includes 1.

Table 11.6.2
             Disease progress
Smoking      Yes     No       Total
Yes          26      380      406
No           178     3386     3564
Total        204     3766     3970
data preg;
input smoke $ preg $ count;
cards;
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
;
proc freq order=data;
weight count;
tables smoke*preg / measures chisq;
run;
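The relative risk and its interval for Example 11.6.1 can be checked by hand in the same way as the odds ratio. This is a minimal Python sketch with an assumed function name; it uses s.e.(ln RR) = sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d)), the form that reproduces the limits 0.861 and 1.910, and z = 1.96 from the example.

```python
import math

def relative_risk_ci(a, b, c, d, z=1.96):
    """Return (RR, lower, upper) for the 2x2 table [[a, b], [c, d]].

    Rows are exposed / unexposed, columns are diseased / not diseased.
    The interval is built on the log scale, as for the odds ratio.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr, rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr)

# Table 11.6.2: smokers (26, 380), non-smokers (178, 3386)
rr, rr_lower, rr_upper = relative_risk_ci(26, 380, 178, 3386)
print(f"RR = {rr:.4f}, 95% CI = ({rr_lower:.3f}, {rr_upper:.3f})")
```

Since the interval contains 1, the data do not show a significant difference in risk between the two groups at the 5% level.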
Odds ratio
           Sample
Risk       Case     Control  Total
O          a        b        a + b
X          c        d        c + d
Total      a + c    b + d    n
• Odds $= p/(1-p)$
• Odds in the case group: $\dfrac{a/(a+c)}{c/(a+c)} = \dfrac{a}{c}$
• Odds in the control group: $\dfrac{b/(b+d)}{d/(b+d)} = \dfrac{b}{d}$
• $OR = \dfrac{a/c}{b/d} = \dfrac{ad}{bc}$
• $\mathrm{s.e.}(\ln OR) = \sqrt{\dfrac{1}{a} + \dfrac{1}{b} + \dfrac{1}{c} + \dfrac{1}{d}}$
• CI for $\ln OR$: $\ln OR \pm z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln OR)$
• CI for $OR$: $e^{\ln OR \,\pm\, z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln OR)} = OR \cdot e^{\pm z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln OR)}$
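Example 11.6.2
$OR = \dfrac{52 \cdot 3486}{352 \cdot 78} = 6.60$
Lower limit: $6.6023 \cdot e^{-1.96\sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}} = 4.571$
Upper limit: $6.6023 \cdot e^{1.96\sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}} = 9.536$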
           Obesity
Smoking    Yes      No       Total
Yes        52       352      404
No         78       3486     3564
Total      130      3838     3968
Table 11.6.4
data obe;
  input smoke $ case $ count;
  cards;
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
;
proc freq data=obe;
  weight count;
  tables smoke*case / measures;
run;
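The odds ratio and its interval in Example 11.6.2 can be verified the same way. This is a minimal Python sketch, again assuming z = 1.96. The name `odds_ratio_ci` is illustrative only.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/(bc) and its log-scale interval for a 2x2 table."""
    odds_ratio = (a * d) / (b * c)
    # s.e. of ln(OR): sqrt(1/a + 1/b + 1/c + 1/d)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = odds_ratio * math.exp(-z * se_log)
    upper = odds_ratio * math.exp(z * se_log)
    return odds_ratio, lower, upper

# Table 11.6.4: smokers (52, 352), non-smokers (78, 3486)
or_hat, lower, upper = odds_ratio_ci(52, 352, 78, 3486)
print(f"OR = {or_hat:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the interval lies entirely above 1, the association in this table is significant at the 5% level, unlike the relative-risk example.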
Mantel-Haenszel statistics
The confounding variable defines k strata.
Expected frequency in stratum i: $e_i = \dfrac{(a_i+b_i)(a_i+c_i)}{n_i}$
$v_i = \dfrac{(a_i+b_i)(c_i+d_i)(a_i+c_i)(b_i+d_i)}{n_i^2 (n_i-1)}$
$\chi^2_{MH} = \dfrac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^2}{\sum_{i=1}^{k} v_i} \sim \chi^2_1$
Common odds ratio: $OR_{MH} = \dfrac{\sum_{i=1}^{k} (a_i d_i / n_i)}{\sum_{i=1}^{k} (b_i c_i / n_i)}$
Stratum i
Risk       Case       Control    Total
Exposed    a_i        b_i        a_i + b_i
Not        c_i        d_i        c_i + d_i
Total      a_i + c_i  b_i + d_i  n_i
Example 11.6.3
Age <= 55
Risk       OCAD patients  Control  Total
Exposed    21             11       32
Unexposed  16             6        22
Total      37             17       54
Age >= 56
Risk       OCAD patients  Control  Total
Exposed    50             14       64
Unexposed  18             6        24
Total      68             20       88
data one;
  input age risk case n;
  cards;
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
;
run;
proc freq;
  tables age*risk*case / measures cmh;
  weight n;
run;
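The Mantel-Haenszel formulas can be applied directly to the two age strata of Example 11.6.3. The Python sketch below follows the definitions of $e_i$, $v_i$, $\chi^2_{MH}$ and $OR_{MH}$ given above, with no continuity correction. The function name and the tuple layout `(a, b, c, d)` are assumptions made for illustration.

```python
def mantel_haenszel(strata):
    """Return (chi-square_MH, common odds ratio) for a list of 2x2 strata.

    Each stratum is (a, b, c, d): exposed cases, exposed controls,
    unexposed cases, unexposed controls.
    """
    sum_a = sum_e = sum_v = num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        sum_e += (a + b) * (a + c) / n                       # e_i
        sum_v += ((a + b) * (c + d) * (a + c) * (b + d)
                  / (n ** 2 * (n - 1)))                      # v_i
        num += a * d / n
        den += b * c / n
    chi2 = (sum_a - sum_e) ** 2 / sum_v
    return chi2, num / den

# Example 11.6.3: Age <= 55 and Age >= 56
strata = [(21, 11, 16, 6), (50, 14, 18, 6)]
chi2_mh, or_mh = mantel_haenszel(strata)
print(f"chi-square_MH = {chi2_mh:.4f}, OR_MH = {or_mh:.4f}")
```

The statistic is compared with the chi-square distribution with 1 degree of freedom (5% critical value 3.841).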
data severe;
  input treat $ outcome $ count;
  cards;
Test f 10
Test u 2
Control f 2
Control u 4
;
proc freq order=data;
  tables treat*outcome / chisq nocol;
  weight count;
run;
Fisher's Exact Test
The SAS System
The FREQ Procedure
Table of treat by outcome
treat      outcome
Frequency |
Percent   |
Row Pct   |f       |u       |  Total
----------+--------+--------+
Test      |     10 |      2 |     12
          |  55.56 |  11.11 |  66.67
          |  83.33 |  16.67 |
----------+--------+--------+
Control   |      2 |      4 |      6
          |  11.11 |  22.22 |  33.33
          |  33.33 |  66.67 |
----------+--------+--------+
Total           12        6       18
             66.67    33.33   100.00
Statistics for Table of treat by outcome
Statistic                     DF    Value     Prob
----------------------------------------------------
Chi-Square                     1    4.5000    0.0339
Likelihood Ratio Chi-Square    1    4.4629    0.0346
Continuity Adj. Chi-Square     1    2.5313    0.1116
Mantel-Haenszel Chi-Square     1    4.2500    0.0393
Phi Coefficient                     0.5000
Contingency Coefficient             0.4472
Cramer's V                          0.5000
WARNING: 75% of the cells have expected counts less than 5. Chi-Square may not be a valid test.
Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)        10
Left-sided Pr <= F          0.9961
Right-sided Pr >= F         0.0573
Table Probability (P)       0.0533
Two-sided Pr <= P           0.1070
Sample Size = 18
Exact Test
Table cell
(1,1)  (1,2)  (2,1)  (2,2)  Prob
12     0      0      6      0.0001
11     1      1      5      0.0039
10     2      2      4      0.0533
9      3      3      3      0.2370
8      4      4      2      0.4000
7      5      5      1      0.2560
6      6      6      0      0.0498
Table probabilities
Observed table (10, 2, 2, 4): $P = \dfrac{12!\,12!\,6!\,6!}{10!\,2!\,2!\,4!\,18!} = 0.0533$
• One-tailed p-value: $p = 0.0533 + 0.0039 + 0.0001 = 0.0573$
• Two-tailed p-value: $p = 0.0533 + 0.0039 + 0.0001 + 0.0498 = 0.1071$
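The table probabilities come from the hypergeometric distribution with all margins fixed, so they can be enumerated directly. The Python sketch below rebuilds the probability column and the two p-values for the 12/6 by 12/6 table. The helper name `table_prob` is illustrative. The two-tailed value sums every table whose probability does not exceed that of the observed table, which is assumed here to be the rule behind the SAS "Two-sided Pr <= P" line.

```python
from math import comb

def table_prob(a, row1, row2, col1):
    """Hypergeometric probability that cell (1,1) equals a, margins fixed."""
    n = row1 + row2
    return comb(row1, a) * comb(row2, col1 - a) / comb(n, col1)

row1, row2, col1 = 12, 6, 12      # Test row, Control row, column f
observed = 10                     # observed count in cell (1,1)

a_min = max(0, col1 - row2)       # smallest feasible cell (1,1) count
a_max = min(row1, col1)           # largest feasible cell (1,1) count
probs = {a: table_prob(a, row1, row2, col1) for a in range(a_min, a_max + 1)}

p_obs = probs[observed]
one_tailed = sum(p for a, p in probs.items() if a >= observed)
# Small relative tolerance so tables tied with the observed one are included
two_tailed = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))

for a in sorted(probs, reverse=True):
    print(f"a = {a:2d}  prob = {probs[a]:.4f}")
print(f"one-tailed p = {one_tailed:.4f}, two-tailed p = {two_tailed:.4f}")
```

Summing unrounded probabilities gives a two-tailed value of 0.1070, as in the SAS output; the 0.1071 on the slide comes from adding the four rounded terms.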
H0: the two variables are independent (homogeneous) vs. H1: not H0
> fisher.test(matrix(c(7,3,5,6),2,2), alternative="greater")
Fisher's Exact Test for Count Data
data:  matrix(c(7, 3, 5, 6), 2, 2)
p-value = 0.2449
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 0.4512625       Inf
sample estimates:
odds ratio
  2.661251
> matrix(c(7,3,5,6),2,2)
     [,1] [,2]
[1,]    7    5
[2,]    3    6
McNemar Test Matched pairs
data one
input hus_resp $ wif_resp $ no
datalines
yes yes 20
yes no 5
no yes 10
no no 10
run
proc freq
tables hus_respwif_resp agree
weight no
run
ldquoHo husband and wife 의 approval rates는 같다rdquo를 기각하지 못함
We do not reject ldquoHo approval rates of husband and wife are the samerdquo
신뢰구간이 0을 포함하지 않으므로 K=0 이라는 귀무가설을 95 신뢰수준에서 기각한다
Kappa=1 gtgt perfect agreement Kappa gt 08 gtgt excellent agreement Kappa gt 04 gtgt moderate agreement
CI does not include 0 -gt we reject the null hypo of K=0 by 95 confidence level
116 Relative risk odds ratio and Mantel-Haenszel statistics
bull 관찰연구 (observational study)
bull 전향적 연구 (prospective study)
bull 후향적 연구 (retrospective study)
상대위험도 (Relative risk)Disease
Risk O X
O 119886 119887 119886 + 119887
X 119888 119889 119888 + 119889
119886 + 119888 119887 + 119889 119899
bull 119877119877 =119886
119886+119887119888
119888+119889
bull s e ln 119877119877 =1
119886+1
119888minus
1
119886+119887+
1
119888+119889
bull ln 119877119877 plusmn 1199111minus1205722 ∙ s e ln 119877119877
bull 119890ln 119877119877 plusmn119911
1minus1205722∙se ln 119877119877
= 119877119877 ∙ 119890^ plusmn1199111minus1205722 ∙ s e ln 119877119877
Example 11.6.1
$RR = \dfrac{26/406}{178/3564} = \dfrac{0.064}{0.050} = 1.28$
Lower limit: $1.2822 \cdot e^{-1.96 \sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}} = 0.861$
Upper limit: $1.2822 \cdot e^{1.96 \sqrt{\frac{1}{26} + \frac{1}{178} - \frac{1}{406} - \frac{1}{3564}}} = 1.910$
The 95% CI (0.861, 1.910) includes 1.
Table 11.6.2
                 Disease progress
Smoking      Yes      No       Total
Yes          26       380      406
No           178      3386     3564
Total        204      3766     3970
data preg;
  input smoke $ preg $ count;
  cards;
smoke early 26
smoke abnormal 380
nonsmoke early 178
nonsmoke abnormal 3386
;
run;
proc freq data=preg order=data;
  weight count;
  /* MEASURES reports the relative risk estimates with confidence limits */
  tables smoke*preg / measures chisq;
run;
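The relative risk and its confidence limits in Example 11.6.1 can also be checked directly from the formulas, without PROC FREQ. The following DATA step is a minimal sketch; the variable names (rr, se_ln_rr, lower, upper) are chosen here for illustration, and probit(0.975) is used in place of the rounded value 1.96.

```sas
/* Relative risk and 95% CI for Example 11.6.1, computed from the formulas */
data _null_;
  a = 26;  b = 380;     /* smokers:     disease yes / no */
  c = 178; d = 3386;    /* non-smokers: disease yes / no */

  rr       = (a / (a + b)) / (c / (c + d));
  se_ln_rr = sqrt(1/a + 1/c - 1/(a + b) - 1/(c + d));
  z        = probit(0.975);                 /* about 1.96 */

  lower = rr * exp(-z * se_ln_rr);
  upper = rr * exp( z * se_ln_rr);

  put rr=8.4 lower=8.4 upper=8.4;
run;
```

The log should show values close to 1.28, 0.861, and 1.910, the figures given in the example.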
Odds ratio (O = present, X = absent)
                 Sample
Risk         Case     Control  Total
O            a        b        a + b
X            c        d        c + d
Total        a + c    b + d    n
• Odds $= p/(1-p)$
• Odds in the case group: $\dfrac{a/(a+c)}{c/(a+c)} = \dfrac{a}{c}$
• Odds in the control group: $\dfrac{b/(b+d)}{d/(b+d)} = \dfrac{b}{d}$
• $OR = \dfrac{a/c}{b/d} = \dfrac{ad}{bc}$
• $\mathrm{s.e.}(\ln OR) = \sqrt{\dfrac{1}{a} + \dfrac{1}{b} + \dfrac{1}{c} + \dfrac{1}{d}}$
• CI for $\ln OR$: $\ln OR \pm z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln OR)$
• CI for $OR$: $e^{\ln OR \pm z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln OR)} = OR \cdot e^{\pm z_{1-\alpha/2} \cdot \mathrm{s.e.}(\ln OR)}$
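Example 11.6.2
$OR = \dfrac{52 \cdot 3486}{352 \cdot 78} = 6.60$
Lower limit: $6.6023 \cdot e^{-1.96 \sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}} = 4.571$
Upper limit: $6.6023 \cdot e^{1.96 \sqrt{\frac{1}{52} + \frac{1}{352} + \frac{1}{78} + \frac{1}{3486}}} = 9.536$
Table 11.6.4
                 Obesity
Smoking      Yes      No       Total
Yes          52       352      404
No           78       3486     3564
Total        130      3838     3968
data obe;
  input smoke $ case $ count;
  cards;
smoke case 52
smoke control 352
nonsmoke case 78
nonsmoke control 3486
;
run;
proc freq data=obe;
  weight count;
  /* MEASURES reports the odds ratio with confidence limits */
  tables smoke*case / measures;
run;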
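As a cross-check on Example 11.6.2, the odds ratio and its confidence limits can be computed straight from the four cell counts. This is a sketch with illustrative variable names (odds_ratio, se_ln_or, lower, upper) that are not part of the slides; probit(0.975) stands in for 1.96.

```sas
/* Odds ratio and 95% CI for Example 11.6.2, computed from the formulas */
data _null_;
  a = 52; b = 352;      /* smokers:     obese yes / no */
  c = 78; d = 3486;     /* non-smokers: obese yes / no */

  odds_ratio = (a * d) / (b * c);
  se_ln_or   = sqrt(1/a + 1/b + 1/c + 1/d);
  z          = probit(0.975);               /* about 1.96 */

  lower = odds_ratio * exp(-z * se_ln_or);
  upper = odds_ratio * exp( z * se_ln_or);

  put odds_ratio=8.4 lower=8.4 upper=8.4;
run;
```

The log should show values close to 6.60, 4.571, and 9.536, the figures given in the example. Because this interval lies entirely above 1, it points to a positive association, unlike the relative-risk interval in Example 11.6.1.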
Mantel-Haenszel statistic
• The confounding variable has k strata.
• Expected frequency in stratum i: $e_i = \dfrac{(a_i + b_i)(a_i + c_i)}{n_i}$
• $v_i = \dfrac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{n_i^2 (n_i - 1)}$
• $\chi^2_{MH} = \dfrac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^2}{\sum_{i=1}^{k} v_i} \sim \chi^2_1$
• Common odds ratio: $OR_{MH} = \dfrac{\sum_{i=1}^{k} (a_i d_i / n_i)}{\sum_{i=1}^{k} (b_i c_i / n_i)}$
Stratum i
Risk         Case        Control     Total
Exposed      a_i         b_i         a_i + b_i
Not exposed  c_i         d_i         c_i + d_i
Total        a_i + c_i   b_i + d_i   n_i
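Example 11.6.3
Age <= 55
Risk         OCAD patients   Control   Total
Exposed      21              11        32
Unexposed    16              6         22
Total        37              17        54
Age >= 56
Risk         OCAD patients   Control   Total
Exposed      50              14        64
Unexposed    18              6         24
Total        68              20        88
data one;
  /* age: 1 = <=55, 2 = >=56; risk: 1 = exposed, 2 = unexposed; */
  /* case: 1 = OCAD patient, 2 = control; n: cell count         */
  input age risk case n;
  cards;
1 1 1 21
1 1 2 11
1 2 1 16
1 2 2 6
2 1 1 50
2 1 2 14
2 2 1 18
2 2 2 6
;
run;
proc freq data=one;
  /* age is the stratification variable; CMH requests Mantel-Haenszel statistics */
  tables age*risk*case / measures cmh;
  weight n;
run;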
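The Mantel-Haenszel formulas can be applied to the two age strata of Example 11.6.3 by hand, which shows what the CMH option is summing. The steps below are a sketch under stated assumptions: the data set name strata and the variables e, v, ad, bc, and the running sums are illustrative, and the statistic is computed without a continuity correction, as in the formula on the slide.

```sas
/* Per-stratum quantities for the Mantel-Haenszel statistic (illustrative sketch) */
data strata;
  input stratum a b c d;
  n  = a + b + c + d;
  e  = (a + b)*(a + c) / n;                                  /* expected count of cell a */
  v  = (a + b)*(c + d)*(a + c)*(b + d) / (n**2 * (n - 1));   /* variance of cell a       */
  ad = a*d / n;                                              /* numerator term of OR_MH   */
  bc = b*c / n;                                              /* denominator term of OR_MH */
  datalines;
1 21 11 16 6
2 50 14 18 6
;
run;

/* Accumulate over strata and report the test statistic and common odds ratio */
data _null_;
  set strata end=last;
  sum_a  + a;       /* sum statements retain their totals across observations */
  sum_e  + e;
  sum_v  + v;
  sum_ad + ad;
  sum_bc + bc;
  if last then do;
    chisq_mh = (sum_a - sum_e)**2 / sum_v;
    p_value  = 1 - probchi(chisq_mh, 1);      /* chi-square with 1 df */
    or_mh    = sum_ad / sum_bc;
    put chisq_mh=8.4 p_value=8.4 or_mh=8.4;
  end;
run;
```

Keeping the per-stratum terms in their own data set makes it easy to print them and see how much each age group contributes to the pooled statistic.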