logistic multiple 2558u - @@ home - kku web hosting · 2016. 3. 22. · iteration 0: log likelihood...


Post on 25-Nov-2020


Multiple Logistic Regression

Assistant Professor Nikom Thanomsieng

Department of Biostatistics and Demography

Faculty of Public Health, Khon Kaen University

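Logistic function

$$f(z) = \frac{1}{1+e^{-z}}, \qquad -\infty < z < \infty$$

$$f(-\infty) = \frac{1}{1+e^{\infty}} = 0, \qquad f(0) = \frac{1}{1+e^{0}} = \frac{1}{2}, \qquad f(\infty) = \frac{1}{1+e^{-\infty}} = 1$$

[Figure: logistic curve; horizontal axis Z, vertical axis f(z) running from 0 through 1/2 to 1]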
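The limiting values of the logistic function can be checked numerically. The following is a minimal sketch in Python (the slides themselves use Stata); the function names `logistic` and `logit` are chosen here for illustration only.

```python
import math

def logistic(z: float) -> float:
    """Logistic function f(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p: float) -> float:
    """Inverse of the logistic function: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# f(z) approaches 0 for very negative z, equals 1/2 at z = 0, approaches 1 for large z
for z in (-20.0, 0.0, 20.0):
    print(z, logistic(z))
```

Because `logit` undoes `logistic`, any fitted linear predictor can be mapped to a probability and back.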

Fitting Multiple Logistic Regression

Analyzes the relationship between two or more independent variables and a dependent variable.

Dependent variable (outcome, response) = discrete (two possible values)

Independent variables (predictors, explanatory variables) = continuous or categorical (categorical --> dummy variables)

[Diagram: predictor, predictor, predictor, ... --> Outcome]

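Multiple Logistic Regression

Example: analysis of the relationship between age, race, weight gain, smoking, etc. and low birth weight.

LBW: 0 = birth weight >= 2500 g, 1 = birth weight < 2500 g
Age: years
Race: 1 = white, 2 = black, 3 = other
Lwt: mother's weight at last menstrual period
Smk: 1 = yes, 0 = no
...
FTV: number of physician visits during the first trimester

Logistic regression analysis

The relationship is written in logit form as

$$\hat{y} = \ln\!\left(\frac{\hat{p}(x)}{1-\hat{p}(x)}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$$

$$\hat{p} = \frac{e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p}}{1 + e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p}}$$

where $\hat{p}(x_i)$ is the probability that the event occurs and $x_1, \dots, x_p$ are the independent variables in the model.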

Logit model with a polychotomous variable

Convert the polychotomous variable into dummy variables:

$$\hat{y} = \beta_0 + \beta_1 x_1 + \dots + \sum_{l=1}^{k_j-1} \beta_{jl} D_{jl} + \dots + \beta_p x_p$$

Example: a variable with k levels yields k-1 dummy variables (k = number of levels or groups).

variable   D1   D2
code=1      0    0
code=2      1    0
code=3      0    1

Example: the race variable (white, black, other) is converted to dummy variables as follows.

Race     D1   D2
White     0    0
Black     1    0
Other     0    1

$$\hat{y} = \beta_0 + \beta_1(age) + \beta_2(lwt) + \beta_3(race_{B}) + \beta_4(race_{others}) + \beta_5(ftv)$$

In Stata, specify: xi: logit low age lwt i.race ftv
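The k-1 dummy coding shown in the tables can be sketched in a few lines. This is an illustrative Python sketch, not what Stata's `xi:` does internally; the helper name `dummy_code` is hypothetical, and the first level is assumed to be the reference group.

```python
def dummy_code(value, levels):
    """Return k-1 indicators for `value`; the first entry of `levels` is the
    reference group and is coded as all zeros."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels[1:]]

RACE_LEVELS = [1, 2, 3]  # 1 = white (reference), 2 = black, 3 = other

for code in RACE_LEVELS:
    d1, d2 = dummy_code(code, RACE_LEVELS)
    print(f"code={code}  D1={d1}  D2={d2}")
```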


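Multiple logistic regression of low birth weight on age, lwt, race, and ftv

$$\hat{y} = \ln\!\left(\frac{\hat{p}}{1-\hat{p}}\right) = \beta_0 + \beta_1\,age + \beta_2\,lwt + \beta_3\,\text{\_Irace\_2} + \beta_4\,\text{\_Irace\_3} + \beta_5\,ftv$$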

. xi: logit low age lwt i.race ftv, nolog
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)

Logistic regression                               Number of obs   =        189
                                                  LR chi2(5)      =      12.10
                                                  Prob > chi2     =     0.0335
Log likelihood = -111.28645                       Pseudo R2       =     0.0516

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   -.023823   .0337295    -0.71   0.480    -.0899317    .0422857
         lwt |  -.0142446   .0065407    -2.18   0.029    -.0270641   -.0014251
    _Irace_2 |   1.003898   .4978579     2.02   0.044     .0281143    1.979681
    _Irace_3 |   .4331084   .3622397     1.20   0.232    -.2768684    1.143085
         ftv |  -.0493083   .1672386    -0.29   0.768    -.3770899    .2784733
       _cons |   1.295366   1.071439     1.21   0.227    -.8046157    3.395347
------------------------------------------------------------------------------

. list id low age lwt _Irace_2 _Irace_3 ftv phat

     +--------------------------------------------------------------+
     |  id   low   age   lwt   _Irace_2   _Irace_3   ftv       phat |
     |--------------------------------------------------------------|
  1. |   4     1    28   120          0          1     0   .3434579 |
  2. |  10     1    29   130          0          0     2   .2065388 |
  3. |  11     1    34   187          1          0     0   .2360498 |
  4. |  13     1    25   105          0          1     0   .4102857 |
  5. |  15     1    25    85          0          1     0   .4805368 |
     |--------------------------------------------------------------|
...
186. | 223     0    35   170          0          0     1   .1182268 |
187. | 224     0    19   120          0          0     0   .2959572 |
188. | 225     0    24   116          0          0     1   .2732751 |
189. | 226     0    45   123          0          0     1   .1710699 |
     +--------------------------------------------------------------+

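$$\hat{p} = \frac{e^{\beta_0 + \beta_1 age + \beta_2 lwt + \beta_3\,\text{\_Irace\_2} + \beta_4\,\text{\_Irace\_3} + \beta_5 ftv}}{1 + e^{\beta_0 + \beta_1 age + \beta_2 lwt + \beta_3\,\text{\_Irace\_2} + \beta_4\,\text{\_Irace\_3} + \beta_5 ftv}}$$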
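The `phat` column in the listing can be reproduced from the fitted coefficients. The sketch below is written in Python rather than Stata and copies the coefficients from the output in this section; the function name `predicted_probability` is illustrative. Small differences in the last digits are expected because the displayed coefficients are rounded.

```python
import math

# Coefficients copied from the `xi: logit low age lwt i.race ftv` output
COEF = {
    "_cons": 1.295366,
    "age": -0.023823,
    "lwt": -0.0142446,
    "_Irace_2": 1.003898,
    "_Irace_3": 0.4331084,
    "ftv": -0.0493083,
}

def predicted_probability(age, lwt, irace_2, irace_3, ftv):
    """p-hat = e^y / (1 + e^y), where y is the fitted logit."""
    y = (COEF["_cons"] + COEF["age"] * age + COEF["lwt"] * lwt
         + COEF["_Irace_2"] * irace_2 + COEF["_Irace_3"] * irace_3
         + COEF["ftv"] * ftv)
    return math.exp(y) / (1.0 + math.exp(y))

# Observation 1 (id 4): age 28, lwt 120, race = other, ftv 0
print(round(predicted_probability(28, 120, 0, 1, 0), 4))
# Observation 187 (id 224): age 19, lwt 120, race = white, ftv 0
print(round(predicted_probability(19, 120, 0, 0, 0), 4))
```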

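Fitting the model in logistic regression

- Coefficients are estimated by maximum likelihood (iteratively reweighted least squares, IRLS).

For further study: the Generalized Linear Model

- Random component (family): binomial

- Link function: logit

- Systematic component: $x_1, x_2, \dots, x_p$

The linear model can therefore be written as

$$g(\mu) = \ln\frac{\mu}{1-\mu} = \ln\frac{p}{1-p}$$

$$\hat{y} = \ln\!\left(\frac{\hat{p}(x)}{1-\hat{p}(x)}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$$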

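Testing the significance of the model

- Use the likelihood ratio test (G) to compare the model containing only the constant with the fitted model:

$$G = -2\ln\left[\frac{\text{likelihood without the variable}}{\text{likelihood with the variable}}\right]$$

- Test the significance of each individual variable with the Wald test:

$$Z_j = \frac{\hat\beta_j}{se(\hat\beta_j)}$$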

Testing the significance of the model

- Use the likelihood ratio test (G) to compare the model containing only the constant with the fitted model, as follows.

. xi: logit low age lwt i.race ftv
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -111.41656
Iteration 2:   log likelihood = -111.28677
Iteration 3:   log likelihood = -111.28645

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(5)      =      12.10
                                                  Prob > chi2     =     0.0335
Log likelihood = -111.28645                       Pseudo R2       =     0.0516

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   -.023823   .0337295    -0.71   0.480    -.0899317    .0422857
         lwt |  -.0142446   .0065407    -2.18   0.029    -.0270641   -.0014251
    _Irace_2 |   1.003898   .4978579     2.02   0.044     .0281143    1.979681
    _Irace_3 |   .4331084   .3622397     1.20   0.232    -.2768684    1.143085
         ftv |  -.0493083   .1672386    -0.29   0.768    -.3770899    .2784733
       _cons |   1.295366   1.071439     1.21   0.227    -.8046157    3.395347
------------------------------------------------------------------------------

G = -2[(-117.336) - (-111.28645)] = 12.099

The iteration 0 log likelihood (-117.336) is that of the constant-only model, and the final log likelihood (-111.28645) is that of the fitted model; G is reported as LR chi2(5) = 12.10 with Prob > chi2 = 0.0335.

This indicates that at least one variable has a coefficient different from 0.
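The G statistic, its p-value, and the pseudo R2 in the header can be recomputed from the two log likelihoods. The following is a sketch in standard-library Python, assuming the closed-form chi-square tail for integer degrees of freedom; `chi2_sf` is a helper written for this illustration, not a Stata or library function.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{j<m} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df = 2m+1: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_{j<m} x^j / (1*3*...*(2j+1))
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total

ll_null = -117.336     # iteration 0: constant-only model
ll_full = -111.28645   # fitted model with 5 parameters besides the constant

G = -2 * (ll_null - ll_full)
p_value = chi2_sf(G, df=5)
pseudo_r2 = 1 - ll_full / ll_null

print(round(G, 2), round(p_value, 4), round(pseudo_r2, 4))
```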


Testing the significance of the model

- Use the likelihood ratio test (G) to compare two nested models, for example Model 1 and Model 2.

Model 1

$$\ln\frac{p}{1-p} = \beta_0 + \beta_1(age) + \beta_2(lwt) + \beta_3(race_{B}) + \beta_4(race_{others})$$

. use "H:\Hosmer_logistic\alr_data_Hosmer\logistic\lwt_2556.dta", clear

. xi: logit low age lwt i.race
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)

Iteration 0:   log likelihood =   -117.336
…
Iteration 3:   log likelihood = -111.33032

Logistic regression                               Number of obs   =        189
                                                  LR chi2(4)      =      12.01
                                                  Prob > chi2     =     0.0173
Log likelihood = -111.33032                       Pseudo R2       =     0.0512

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0255238    .033252    -0.77   0.443    -.0906966     .039649
         lwt |  -.0143532   .0065228    -2.20   0.028    -.0271377   -.0015688
    _Irace_2 |   1.003822   .4980135     2.02   0.044     .0277335     1.97991
    _Irace_3 |   .4434608   .3602569     1.23   0.218    -.2626298    1.149551
       _cons |   1.306741   1.069782     1.22   0.222    -.7899926    3.403475
------------------------------------------------------------------------------

. est store m1

Model 2

$$\ln\frac{p}{1-p} = \beta_0 + \beta_1(age) + \beta_2(lwt) + \beta_3(race_{B}) + \beta_4(race_{others}) + \beta_5(ftv)$$

. xi: logit low age lwt i.race ftv
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -111.41656
Iteration 2:   log likelihood = -111.28677
Iteration 3:   log likelihood = -111.28645

Logistic regression                               Number of obs   =        189
                                                  LR chi2(5)      =      12.10
                                                  Prob > chi2     =     0.0335
Log likelihood = -111.28645                       Pseudo R2       =     0.0516

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   -.023823   .0337295    -0.71   0.480    -.0899317    .0422857
         lwt |  -.0142446   .0065407    -2.18   0.029    -.0270641   -.0014251
    _Irace_2 |   1.003898   .4978579     2.02   0.044     .0281143    1.979681
    _Irace_3 |   .4331084   .3622397     1.20   0.232    -.2768684    1.143085
         ftv |  -.0493083   .1672386    -0.29   0.768    -.3770899    .2784733
       _cons |   1.295366   1.071439     1.21   0.227    -.8046157    3.395347
------------------------------------------------------------------------------

. est store m2

. lrtest m1 m2

Likelihood-ratio test                                  LR chi2(1)  =      0.09
(Assumption: m1 nested in m2)                          Prob > chi2 =    0.7671

. di -2*((-111.33032)-(-111.28645))
.08774

. di chiprob(1,.08774)
.76707018

$$G = -2\ln\left[\frac{\text{likelihood without the variable}}{\text{likelihood with the variable}}\right] = -2\left[\ln L_{\text{without}} - \ln L_{\text{with}}\right]$$
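The two `di` lines can be mirrored outside Stata. The following Python sketch assumes only that, for 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(G/2)); the log likelihoods are copied from the stored models m1 and m2.

```python
import math

ll_m1 = -111.33032  # Model 1: age, lwt, race
ll_m2 = -111.28645  # Model 2: Model 1 plus ftv

# G = -2 * (ln L_without - ln L_with)
G = -2 * (ll_m1 - ll_m2)

# Chi-square upper tail with 1 degree of freedom
p_value = math.erfc(math.sqrt(G / 2))

print(round(G, 5), round(p_value, 4))
```

The large p-value suggests that adding ftv does not improve the fit, which agrees with the `lrtest m1 m2` result.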

Significance of each individual variable by the Wald test

$$Z_j = \frac{\hat\beta_j}{se(\hat\beta_j)}$$

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   -.023823   .0337295    -0.71   0.480    -.0899317    .0422857
         lwt |  -.0142446   .0065407    -2.18   0.029    -.0270641   -.0014251
    _Irace_2 |   1.003898   .4978579     2.02   0.044     .0281143    1.979681
    _Irace_3 |   .4331084   .3622397     1.20   0.232    -.2768684    1.143085
         ftv |  -.0493083   .1672386    -0.29   0.768    -.3770899    .2784733
       _cons |   1.295366   1.071439     1.21   0.227    -.8046157    3.395347
------------------------------------------------------------------------------
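As a worked check of the Wald statistic, the z and P>|z| columns for lwt can be recomputed from the coefficient and its standard error. This is a small Python sketch using the two-sided normal tail; the variable names are illustrative.

```python
import math

coef_lwt = -0.0142446
se_lwt = 0.0065407

# Wald statistic Z = beta-hat / se(beta-hat)
z = coef_lwt / se_lwt

# Two-sided p-value from the standard normal distribution
p_value = math.erfc(abs(z) / math.sqrt(2))

print(round(z, 2), round(p_value, 3))
```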

Confidence Interval Estimation

- Confidence interval of a coefficient

$$100(1-\alpha)\%\ \text{CI of } \beta_i = \hat\beta_i \pm Z_{1-\alpha/2}\; se(\hat\beta_i)$$

xi: logit low lwt i.race
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)
…

Logistic regression                               Number of obs   =        189
                                                  LR chi2(3)      =      11.41
                                                  Prob > chi2     =     0.0097
Log likelihood = -111.62955                       Pseudo R2       =     0.0486

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lwt |  -.0152231   .0064393    -2.36   0.018    -.0278439   -.0026023
    _Irace_2 |   1.081066   .4880512     2.22   0.027     .1245034    2.037629
    _Irace_3 |   .4806033   .3566733     1.35   0.178    -.2184636     1.17967
       _cons |   .8057535   .8451625     0.95   0.340    -.8507345    2.462241
------------------------------------------------------------------------------
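The interval for lwt follows directly from the formula. The following is a minimal Python sketch, assuming the normal quantile from `statistics.NormalDist`; `coef_ci` is a name introduced here for illustration.

```python
from statistics import NormalDist

def coef_ci(coef: float, se: float, level: float = 0.95):
    """Wald confidence interval: coef +/- Z_(1-alpha/2) * se."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return coef - z * se, coef + z * se

lower, upper = coef_ci(-0.0152231, 0.0064393)
print(round(lower, 7), round(upper, 7))
```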

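Individual predicted probability and confidence interval estimation

- Estimate the variance of the logit, taking $x_0 = 1$ for the constant:

$$\widehat{Var}[\hat{y}(x)] = \sum_{i=0}^{p} x_i^2\,\widehat{Var}(\hat\beta_i) + \sum_{i=0}^{p}\sum_{j=i+1}^{p} 2\,x_i x_j\,\widehat{Cov}(\hat\beta_i,\hat\beta_j)$$

$$100(1-\alpha)\%\ \text{CI of } p(x_i) = \hat{p}(x_i) \pm Z_{1-\alpha/2}\; se[\hat{p}(x_i)]$$

Example: computing the variance of the logit when lwt = 150 and race = white

$$\begin{aligned}
\widehat{Var}[\hat{y}(lwt=150,\ race=white)] ={}& \widehat{Var}(\hat\beta_0) + (lwt)^2\,\widehat{Var}(\hat\beta_1) + (race_{black})^2\,\widehat{Var}(\hat\beta_2) + (race_{other})^2\,\widehat{Var}(\hat\beta_3) \\
&+ 2(lwt)\,\widehat{Cov}(\hat\beta_0,\hat\beta_1) + 2(race_{black})\,\widehat{Cov}(\hat\beta_0,\hat\beta_2) + 2(race_{other})\,\widehat{Cov}(\hat\beta_0,\hat\beta_3) \\
&+ 2(lwt)(race_{black})\,\widehat{Cov}(\hat\beta_1,\hat\beta_2) + 2(lwt)(race_{other})\,\widehat{Cov}(\hat\beta_1,\hat\beta_3) \\
&+ 2(race_{black})(race_{other})\,\widehat{Cov}(\hat\beta_2,\hat\beta_3)
\end{aligned}$$

. di .71429959 + (150^2)*(.00004146) + (0^2)*(.23819397) + (0^2)*(.12721584) + 2*150*(-.00521365) + 2*0*(.02260223) + 2*0*(-.1034968) + 2*150*0*(-.00064703) + 2*150*0*(.00035585) + 2*0*0*(.05320001)
.08305459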
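The same variance can be written as the quadratic form x'Vx, which avoids listing every term by hand. The sketch below is in Python and copies the covariance matrix from the `vce` output shown in this section; the function name `logit_variance` and the variable ordering are choices made for this illustration.

```python
# Covariance matrix of the coefficients, order: lwt, _Irace_2, _Irace_3, _cons
V = [
    [0.00004146, -0.00064703, 0.00035585, -0.00521365],
    [-0.00064703, 0.23819397, 0.05320001, 0.02260223],
    [0.00035585, 0.05320001, 0.12721584, -0.1034968],
    [-0.00521365, 0.02260223, -0.1034968, 0.71429959],
]

def logit_variance(x, cov):
    """Var[y-hat(x)] = sum_i sum_j x_i * x_j * Cov(b_i, b_j)."""
    n = len(x)
    return sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))

# lwt = 150, white (both race dummies 0), constant term = 1
x = [150, 0, 0, 1]
var_logit = logit_variance(x, V)
print(round(var_logit, 8))
```

The double sum counts each covariance twice, which matches the factor 2 in front of the covariance terms in the formula.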


xi: logit low lwt i.race, nolog
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)

Logistic regression                               Number of obs   =        189
                                                  LR chi2(3)      =      11.41
                                                  Prob > chi2     =     0.0097
Log likelihood = -111.62955                       Pseudo R2       =     0.0486

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lwt |  -.0152231   .0064393    -2.36   0.018    -.0278439   -.0026023
    _Irace_2 |   1.081066   .4880512     2.22   0.027     .1245034    2.037629
    _Irace_3 |   .4806033   .3566733     1.35   0.178    -.2184636     1.17967
       _cons |   .8057535   .8451625     0.95   0.340    -.8507345    2.462241
------------------------------------------------------------------------------

. vce

Covariance matrix of coefficients of logit model

        e(V) |        lwt    _Irace_2    _Irace_3       _cons
-------------+------------------------------------------------
         lwt |  .00004146
    _Irace_2 | -.00064703   .23819397
    _Irace_3 |  .00035585   .05320001   .12721584
       _cons | -.00521365   .02260223   -.1034968   .71429959

. di (-.0152231*150)+(1.081066*0)+(.4806033*0) + .8057535
-1.4777115

. di exp(-1.4777115)/(1+exp(-1.4777115))
.18577333

. prvalue, x(lwt=150 _Irace_2=0 _Irace_3=0)

logit: Predictions for low

Confidence intervals by delta method

                                95% Conf. Interval
  Pr(y=1|x):          0.1858   [ 0.1003,    0.2713]
  Pr(y=0|x):          0.8142   [ 0.7287,    0.8997]

          lwt  _Irace_2  _Irace_3
x=        150         0         0
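The point estimate reported by `prvalue` can be traced through the two `di` steps. A short Python sketch of the same arithmetic, with coefficients copied from the output above (variable names are illustrative):

```python
import math

# Coefficients from `xi: logit low lwt i.race`
b_cons, b_lwt, b_race2, b_race3 = 0.8057535, -0.0152231, 1.081066, 0.4806033

# Covariate pattern: lwt = 150, white (both race dummies equal 0)
y_hat = b_cons + b_lwt * 150 + b_race2 * 0 + b_race3 * 0

# Convert the logit to a probability
p_hat = math.exp(y_hat) / (1 + math.exp(y_hat))

print(round(y_hat, 7), round(p_hat, 4))
```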

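$$\hat{p} = \frac{e^{\alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p}}{1 + e^{\alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p}}$$

$$\hat{p} = \frac{e^{\alpha + \beta_1 age + \beta_2 lwt + \beta_3\,\text{\_Irace\_2} + \beta_4\,\text{\_Irace\_3} + \beta_5 ftv}}{1 + e^{\alpha + \beta_1 age + \beta_2 lwt + \beta_3\,\text{\_Irace\_2} + \beta_4\,\text{\_Irace\_3} + \beta_5 ftv}}$$

The value computed above is the probability of a low-birth-weight infant when lwt = 150 and race = white.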

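Confidence Interval Estimation

- Confidence interval of p

$$100(1-\alpha)\%\ \text{CI of the true logit} = \text{logit}(\hat{p}_i) \pm Z_{1-\alpha/2}\; se[\text{logit}(\hat{p}_i)]$$

$$100(1-\alpha)\%\ \text{CI of } p = \frac{e^{\,\text{logit}(\hat{p}) \pm Z_{1-\alpha/2}\, se[\text{logit}(\hat{p})]}}{1 + e^{\,\text{logit}(\hat{p}) \pm Z_{1-\alpha/2}\, se[\text{logit}(\hat{p})]}}$$

. do "I:\cat2011\95ci_p_logit.do"

. di (exp(-1.4777115-((abs(invnormal(0.025)))*sqrt(.08305459))))/(1+(exp(-1.4777115-((abs(invnormal(0.025)))*sqrt(.08305459)))))
.11480659

. di (exp(-1.4777115+((abs(invnormal(0.025)))*sqrt(.08305459))))/(1+(exp(-1.4777115+((abs(invnormal(0.025)))*sqrt(.08305459)))))
.28641379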
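The do-file computation builds the interval on the logit scale and then back-transforms each endpoint, which keeps the limits inside (0, 1). A Python sketch of the same steps, using the logit and variance values from this section (`expit` is a helper name chosen here):

```python
import math
from statistics import NormalDist

def expit(y: float) -> float:
    """Back-transform a logit to a probability."""
    return math.exp(y) / (1 + math.exp(y))

y_hat = -1.4777115       # fitted logit at lwt = 150, race = white
var_logit = 0.08305459   # estimated variance of that logit

z = NormalDist().inv_cdf(0.975)
half_width = z * math.sqrt(var_logit)

ci_lower = expit(y_hat - half_width)
ci_upper = expit(y_hat + half_width)
print(round(ci_lower, 4), round(ci_upper, 4))
```

This interval differs slightly from the delta-method interval printed by `prvalue`, because the latter is symmetric around the estimated probability.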

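Interpretation of the fitted model: odds ratio

- Dichotomous variable: 2 levels or 2 groups

Two independent variables: $x_1$ coded 0, 1, with $x_2$ held at a fixed value (adjusted for $x_2$). Here $\alpha$ denotes the intercept.

$$a = P[y=1 \mid x_1=1, x_2] = \frac{e^{\alpha+\beta_1(1)+\beta_2 x_2}}{1+e^{\alpha+\beta_1(1)+\beta_2 x_2}}$$

$$b = P[y=0 \mid x_1=1, x_2] = 1 - \Pr[y=1 \mid x_1=1, x_2] = \frac{1}{1+e^{\alpha+\beta_1(1)+\beta_2 x_2}}$$

$$c = P[y=1 \mid x_1=0, x_2] = \frac{e^{\alpha+\beta_1(0)+\beta_2 x_2}}{1+e^{\alpha+\beta_1(0)+\beta_2 x_2}} = \frac{e^{\alpha+\beta_2 x_2}}{1+e^{\alpha+\beta_2 x_2}}$$

$$d = P[y=0 \mid x_1=0, x_2] = 1 - \Pr[y=1 \mid x_1=0, x_2] = \frac{1}{1+e^{\alpha+\beta_2 x_2}}$$

           y = 1   y = 0
x1 = 1       a       b
x1 = 0       c       d

$$OR = \frac{ad}{bc} = \frac{e^{\alpha+\beta_1+\beta_2 x_2}}{e^{\alpha+\beta_2 x_2}} = e^{(\alpha+\beta_1+\beta_2 x_2)-(\alpha+\beta_2 x_2)} = e^{\beta_1}$$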

Example: in a multiple logistic regression on smoke and age, we want to interpret the odds ratio of smoke, adjusted for age.

$$a = P[low=1 \mid smoke=1, age] = \frac{e^{\alpha+\beta_1(1)+\beta_2 age}}{1+e^{\alpha+\beta_1(1)+\beta_2 age}}$$

$$b = P[low=0 \mid smoke=1, age] = 1 - \Pr[low=1 \mid smoke=1, age] = \frac{1}{1+e^{\alpha+\beta_1(1)+\beta_2 age}}$$

$$c = P[low=1 \mid smoke=0, age] = \frac{e^{\alpha+\beta_1(0)+\beta_2 age}}{1+e^{\alpha+\beta_1(0)+\beta_2 age}} = \frac{e^{\alpha+\beta_2 age}}{1+e^{\alpha+\beta_2 age}}$$

$$d = P[low=0 \mid smoke=0, age] = 1 - \Pr[low=1 \mid smoke=0, age] = \frac{1}{1+e^{\alpha+\beta_2 age}}$$

              low = 1   low = 0
smoke = 1        a         b
smoke = 0        c         d

$$OR = \frac{ad}{bc} = \frac{e^{\alpha+\beta_1+\beta_2 age}}{e^{\alpha+\beta_2 age}} = e^{(\alpha+\beta_1+\beta_2 age)-(\alpha+\beta_2 age)} = e^{\beta_1}$$

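The odds ratio computed from a logistic regression equation is therefore called the adjusted odds ratio.

Example: when smoke = 1 is the variable of interest

- age is the control variable
- age takes the same value in each group being compared

$$OR_{adjusted} = e^{\beta_i}$$

Computing the odds ratio from the logistic regression equation

- It measures the strength of association.

- The value obtained is controlled for the effects of all other variables in the model and is called the adjusted odds ratio.

[Diagram: Exposure (E), Control (C), Control (C), Control ... --> dependent variable D]

Meaning of the odds ratio from the logistic regression equation

- After controlling for the factors $C_i$, the odds of D among those exposed to E are OR times the odds among those not exposed to E.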

. logit low smoke age, or

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -113.66733
Iteration 2:   log likelihood = -113.63815
Iteration 3:   log likelihood = -113.63815

Logistic regression                               Number of obs   =        189
                                                  LR chi2(2)      =       7.40
                                                  Prob > chi2     =     0.0248
Log likelihood = -113.63815                       Pseudo R2       =     0.0315

------------------------------------------------------------------------------
         low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       smoke |   1.997405    .642777     2.15   0.032     1.063027    3.753081
         age |   .9514394   .0304194    -1.56   0.119     .8936481    1.012968
------------------------------------------------------------------------------

Meaning of the odds ratio from the logistic regression equation

- After controlling for age, the odds of a low-birth-weight infant among smokers are 1.997 times the odds among non-smokers.
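The derivation OR = ad/bc = e^{β1} can be checked numerically: the ratio is the same at every age. In the Python sketch below the coefficients are recovered as logs of the odds ratios in the table; the intercept `ALPHA` is a hypothetical value (the `or` output above does not show it), and it cancels out of the ratio.

```python
import math

B_SMOKE = math.log(1.997405)   # beta_1, from the smoke odds ratio
B_AGE = math.log(0.9514394)    # beta_2, from the age odds ratio
ALPHA = 0.06                   # hypothetical intercept; it cancels in ad/bc

def prob_low(smoke: int, age: float) -> float:
    """P[low = 1 | smoke, age] under the logistic model."""
    y = ALPHA + B_SMOKE * smoke + B_AGE * age
    return math.exp(y) / (1 + math.exp(y))

def adjusted_or(age: float) -> float:
    """OR = ad / bc with age held at the same value in both groups."""
    a = prob_low(1, age)
    b = 1 - a
    c = prob_low(0, age)
    d = 1 - c
    return (a * d) / (b * c)

for age in (15, 25, 40):
    print(age, round(adjusted_or(age), 6))
```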

Polychotomous variables

- An independent variable with more than 2 levels or groups

- Create k-1 dummy variables

Example: a variable with k levels yields k-1 dummy variables (k = number of levels or groups).

level/group code   D1   D2
code=1              0    0    (reference cell)
code=2              1    0
code=3              0    1

Comparisons: code=2 vs code=1, and code=3 vs code=1

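Three independent variables: $x_1$ coded 0, 1, 2, with $x_2$, $x_3$ held at fixed values (adjusted for $x_2$, $x_3$). Comparing $x_1 = 1$ with $x_1 = 0$, and writing $\alpha$ for the intercept:

$$a = P[y=1 \mid x_1=1, x_2, x_3] = \frac{e^{\alpha+\beta_1(1)+\beta_2 x_2+\beta_3 x_3}}{1+e^{\alpha+\beta_1(1)+\beta_2 x_2+\beta_3 x_3}}$$

$$b = P[y=0 \mid x_1=1, x_2, x_3] = 1 - P[y=1 \mid x_1=1, x_2, x_3] = \frac{1}{1+e^{\alpha+\beta_1(1)+\beta_2 x_2+\beta_3 x_3}}$$

$$c = P[y=1 \mid x_1=0, x_2, x_3] = \frac{e^{\alpha+\beta_1(0)+\beta_2 x_2+\beta_3 x_3}}{1+e^{\alpha+\beta_1(0)+\beta_2 x_2+\beta_3 x_3}} = \frac{e^{\alpha+\beta_2 x_2+\beta_3 x_3}}{1+e^{\alpha+\beta_2 x_2+\beta_3 x_3}}$$

$$d = P[y=0 \mid x_1=0, x_2, x_3] = 1 - P[y=1 \mid x_1=0, x_2, x_3] = \frac{1}{1+e^{\alpha+\beta_2 x_2+\beta_3 x_3}}$$

           y = 1   y = 0
x1 = 1       a       b
x1 = 0       c       d

$$OR = \frac{ad}{bc} = \frac{e^{\alpha+\beta_1+\beta_2 x_2+\beta_3 x_3}}{e^{\alpha+\beta_2 x_2+\beta_3 x_3}} = e^{(\alpha+\beta_1+\beta_2 x_2+\beta_3 x_3)-(\alpha+\beta_2 x_2+\beta_3 x_3)} = e^{\beta_1}$$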

Example: in a multiple logistic regression on age, lwt, i.race (_Irace_2, _Irace_3), and ftv, interpreting the odds ratio of _Irace_2 means that it is adjusted for age, lwt, _Irace_3, and ftv. Let $\eta = \beta_2 age + \beta_3 lwt + \beta_4\,\text{\_Irace\_3} + \beta_5 ftv$ collect the adjusted terms.

$$a = P[y=1 \mid \text{\_Irace\_2}=1, age, lwt, \text{\_Irace\_3}, ftv] = \frac{e^{\alpha+\beta_1(1)+\eta}}{1+e^{\alpha+\beta_1(1)+\eta}}$$

$$b = P[y=0 \mid \text{\_Irace\_2}=1, age, lwt, \text{\_Irace\_3}, ftv] = 1 - P[y=1 \mid \text{\_Irace\_2}=1, age, lwt, \text{\_Irace\_3}, ftv] = \frac{1}{1+e^{\alpha+\beta_1(1)+\eta}}$$

$$c = P[y=1 \mid \text{\_Irace\_2}=0, age, lwt, \text{\_Irace\_3}, ftv] = \frac{e^{\alpha+\beta_1(0)+\eta}}{1+e^{\alpha+\beta_1(0)+\eta}} = \frac{e^{\alpha+\eta}}{1+e^{\alpha+\eta}}$$

$$d = P[y=0 \mid \text{\_Irace\_2}=0, age, lwt, \text{\_Irace\_3}, ftv] = 1 - P[y=1 \mid \text{\_Irace\_2}=0, age, lwt, \text{\_Irace\_3}, ftv] = \frac{1}{1+e^{\alpha+\eta}}$$

                y = 1   y = 0
_Irace_2 = 1      a       b
_Irace_2 = 0      c       d

$$OR = \frac{ad}{bc} = \frac{e^{\alpha+\beta_1+\eta}}{e^{\alpha+\eta}} = e^{\beta_1}$$


The odds ratio computed from a logistic regression equation is therefore called the adjusted odds ratio.

Example: when _Irace_2 (black) is the variable of interest

- age, lwt, _Irace_3 (other), and ftv are the control variables

- age, lwt, _Irace_3 (other), and ftv take the same values in each group being compared

$$OR_{adjusted} = e^{\beta_i}$$

. xi: logit low age lwt i.race ftv,or
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -111.41656
Iteration 2:   log likelihood = -111.28677
Iteration 3:   log likelihood = -111.28645

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(5)      =      12.10
                                                  Prob > chi2     =     0.0335
Log likelihood = -111.28645                       Pseudo R2       =     0.0516

------------------------------------------------------------------------------
         low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .9764586   .0329355    -0.71   0.480     .9139936    1.043193
         lwt |   .9858564   .0064482    -2.18   0.029     .9732989    .9985759
    _Irace_2 |   2.728898   1.358603     2.02   0.044     1.028513    7.240436
    _Irace_3 |   1.542043   .5585894     1.20   0.232     .7581543     3.13643
         ftv |   .9518876   .1591923    -0.29   0.768     .6858544    1.321111
------------------------------------------------------------------------------
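Each odds ratio and its interval in this table is the exponential of the corresponding coefficient and interval limits from the coefficient table. A brief Python sketch for _Irace_2, using the values reported earlier (variable names are illustrative):

```python
import math

# _Irace_2 row from the coefficient table: coefficient and 95% CI limits
coef, ci_low, ci_high = 1.003898, 0.0281143, 1.979681

odds_ratio = math.exp(coef)
or_low = math.exp(ci_low)
or_high = math.exp(ci_high)

print(round(odds_ratio, 3), round(or_low, 3), round(or_high, 3))
```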

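Interpreting the odds ratio: continuous variables

- A change of 1 unit is often not clinically interesting, for example age increasing by 1 year or blood pressure increasing by 1 mmHg.

- A change of 5, 10, ... units is usually more meaningful.

- Conversely, if x ranges over 0 to 1, a change of 1 unit is too large; an increase of 0.01 may be more appropriate.

- The odds ratio for a change of c units in a continuous variable is computed as follows:

$$OR(c) = e^{c\hat\beta}$$

$$95\%\ \text{CI of } OR(c) = e^{\,c\hat\beta \pm Z_{1-\alpha/2}\; c\; se(\hat\beta)}$$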
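As an illustration of the c-unit formula, the sketch below computes the odds ratio for lwt, first for c = 1 (which should reproduce the odds ratio table) and then for c = 10. It is written in Python; `or_for_change` is a name introduced here and c is assumed to be positive.

```python
import math
from statistics import NormalDist

def or_for_change(beta: float, se: float, c: float, level: float = 0.95):
    """Odds ratio and CI for a change of c > 0 units: exp(c*beta +/- z*c*se)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    point = math.exp(c * beta)
    lower = math.exp(c * beta - z * c * se)
    upper = math.exp(c * beta + z * c * se)
    return point, lower, upper

beta_lwt, se_lwt = -0.0142446, 0.0065407

or1, lo1, hi1 = or_for_change(beta_lwt, se_lwt, c=1)
or10, lo10, hi10 = or_for_change(beta_lwt, se_lwt, c=10)
print(round(or1, 4), round(or10, 4))
```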

Interpreting the odds ratio: change in odds or percent change

. listcoef

logit (N=189): Factor Change in Odds

  Odds of: 1 vs 0

----------------------------------------------------------------------
         low |      b         z     P>|z|    e^b    e^bStdX      SDofX
-------------+--------------------------------------------------------
         age |  -0.02382   -0.706   0.480   0.9765   0.8814     5.2987
         lwt |  -0.01424   -2.178   0.029   0.9859   0.6469    30.5794
    _Irace_2 |   1.00390    2.016   0.044   2.7289   1.4144     0.3454
    _Irace_3 |   0.43311    1.196   0.232   1.5420   1.2309     0.4796
         ftv |  -0.04931   -0.295   0.768   0.9519   0.9491     1.0593
----------------------------------------------------------------------

. listcoef, percent

logit (N=189): Percentage Change in Odds

  Odds of: 1 vs 0

----------------------------------------------------------------------
         low |      b         z     P>|z|      %      %StdX      SDofX
-------------+--------------------------------------------------------
         age |  -0.02382   -0.706   0.480     -2.4    -11.9     5.2987
         lwt |  -0.01424   -2.178   0.029     -1.4    -35.3    30.5794
    _Irace_2 |   1.00390    2.016   0.044    172.9     41.4     0.3454
    _Irace_3 |   0.43311    1.196   0.232     54.2     23.1     0.4796
         ftv |  -0.04931   -0.295   0.768     -4.8     -5.1     1.0593
----------------------------------------------------------------------
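The derived columns appear to follow from b and the standard deviation of X: e^bStdX = exp(b × SD) and % = 100 × (e^b − 1). This reading is an assumption checked against the printed values, not taken from the `listcoef` documentation. A Python sketch using the full-precision coefficients from the logit output:

```python
import math

def factor_change_std(b: float, sd_x: float) -> float:
    """Factor change in odds for a one-standard-deviation increase in x."""
    return math.exp(b * sd_x)

def percent_change(b: float) -> float:
    """Percentage change in odds for a one-unit increase in x."""
    return 100 * (math.exp(b) - 1)

print(round(factor_change_std(-0.023823, 5.2987), 4))    # age
print(round(factor_change_std(-0.0142446, 30.5794), 4))  # lwt
print(round(percent_change(1.003898), 1))                # _Irace_2
```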

Confounding and Interaction

Thai terms for "interaction": อันตรกิริยา (Mathematics Glossary, Royal Institute, B.E. 2547) and การปฏิสัมพันธ์ (NSTDA).

- Confounding is assessed with the "delta-beta-hat-percent" ($\Delta\hat\beta\%$); a change of > 20% is suggested as the threshold.

$$\Delta\hat\beta\% = 100 \times \frac{\hat\beta_{reduce} - \hat\beta_{Full}}{\hat\beta_{Full}}$$

where $\hat\beta_{reduce}$ is the coefficient from the smaller model and $\hat\beta_{Full}$ is the coefficient from the larger model.

Example: assessing confounding and interaction in 3 situations

- No statistical adjustment or interaction

- Statistical adjustment but no statistical interaction

- Statistical adjustment & interaction

No statistical adjustment or interaction

. logit FRACTURE PRIORFRAC
…
------------------------------------------------------------------------------
    FRACTURE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   PRIORFRAC |    1.06383   .2230811     4.77   0.000     .6265986     1.50106
       _cons |  -1.416651   .1304641   -10.86   0.000    -1.672356   -1.160946
------------------------------------------------------------------------------

. mat b1=e(b)

. logit FRACTURE PRIORFRAC HEIGHT
...
------------------------------------------------------------------------------
    FRACTURE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   PRIORFRAC |   1.013009   .2253996     4.49   0.000      .571234    1.454784
      HEIGHT |  -.0453531   .0173787    -2.61   0.009    -.0794146   -.0112915
       _cons |   5.894665   2.795904     2.11   0.035     .4147926    11.37454
------------------------------------------------------------------------------

. mat b2=e(b)

. di ((b1[1,1]-b2[1,1])/b1[1,1])*100
4.7771187

. gen PH=PRIORFRAC*HEIGHT

. logit FRACTURE PRIORFRAC HEIGHT PH
...
------------------------------------------------------------------------------
    FRACTURE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   PRIORFRAC |  -3.055134   5.790416    -0.53   0.598    -14.40414    8.293873
      HEIGHT |  -.0544845   .0218529    -2.49   0.013    -.0973153   -.0116537
          PH |   .0253921   .0361138     0.70   0.482    -.0453896    .0961739
       _cons |   7.361277   3.510281     2.10   0.036     .4812532     14.2413
------------------------------------------------------------------------------

The PRIORFRAC coefficient changes by about 4.8% (< 20%) when HEIGHT is added, and the interaction term PH is not significant (p = 0.482).


Statistical adjustment but no statistical interaction

. logit MYOPIC GENDER
...
------------------------------------------------------------------------------
      MYOPIC |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      GENDER |   .3664706   .2403654     1.52   0.127     -.104637    .8375781
       _cons |  -2.083007   .1792488   -11.62   0.000    -2.434328   -1.731685
------------------------------------------------------------------------------

. mat b1=e(b)

. logit MYOPIC GENDER SPHEQ
…
------------------------------------------------------------------------------
      MYOPIC |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      GENDER |   .5580827   .2850533     1.96   0.050    -.0006116    1.116777
       SPHEQ |  -3.844609   .4171492    -9.22   0.000    -4.662206   -3.027011
       _cons |  -.2260938   .2527147    -0.89   0.371    -.7214055     .269218
------------------------------------------------------------------------------

. mat b3=e(b)

. di ((b1[1,1]-b3[1,1])/b1[1,1])*100
-52.285816

. gen gs=GENDER*SPHEQ

. logit MYOPIC GENDER SPHEQ gs
…
------------------------------------------------------------------------------
      MYOPIC |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      GENDER |   .4915759   .4156574     1.18   0.237    -.3230975    1.306249
       SPHEQ |  -3.948287   .6353135    -6.21   0.000    -5.193479   -2.703096
          gs |   .1850561   .8421906     0.22   0.826    -1.465607    1.835719
       _cons |  -.1910971   .2999269    -0.64   0.524     -.778943    .3967487
------------------------------------------------------------------------------

The GENDER coefficient changes by about 52% in absolute value (> 20%) when SPHEQ is added, but the interaction term gs is not significant (p = 0.826).

Statistical adjustment & interaction

. logit FRACTURE PRIORFRAC
...
------------------------------------------------------------------------------
    FRACTURE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   PRIORFRAC |    1.06383   .2230811     4.77   0.000     .6265986     1.50106
       _cons |  -1.416651   .1304641   -10.86   0.000    -1.672356   -1.160946
------------------------------------------------------------------------------

. mat b1=e(b)

. logit FRACTURE PRIORFRAC AGE
…
------------------------------------------------------------------------------
    FRACTURE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   PRIORFRAC |    .838835   .2341556     3.58   0.000     .3798985    1.297772
         AGE |   .0411928   .0121788     3.38   0.001     .0173228    .0650629
       _cons |  -4.214295   .8478396    -4.97   0.000     -5.87603    -2.55256
------------------------------------------------------------------------------

. mat b3=e(b)

. di ((b1[1,1]-b3[1,1])/b1[1,1])*100
21.149487

. gen PA=PRIORFRAC*AGE

. logit FRACTURE PRIORFRAC AGE PA
…
------------------------------------------------------------------------------
    FRACTURE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   PRIORFRAC |   4.961339    1.81022     2.74   0.006     1.413372    8.509305
         AGE |   .0625149   .0154607     4.04   0.000     .0322124    .0928173
          PA |   -.057382   .0250141    -2.29   0.022    -.1064087   -.0083553
       _cons |  -5.689421    1.08408    -5.25   0.000    -7.814179   -3.564662
------------------------------------------------------------------------------

The PRIORFRAC coefficient changes by about 21% (> 20%) when AGE is added, and the interaction term PA is significant (p = 0.022).
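The three percentage changes can be recomputed from the printed coefficients. Note that the `di` commands in the output divide by the reduced-model coefficient (b1), whereas the delta-beta-hat-percent formula on the slides divides by the full-model coefficient; the Python sketch below computes both so the two conventions can be compared. The function names are illustrative.

```python
def pct_change_as_in_output(b_reduced: float, b_full: float) -> float:
    """Mirrors the `di` lines: divides by the reduced-model coefficient."""
    return 100 * (b_reduced - b_full) / b_reduced

def delta_beta_hat_pct(b_reduced: float, b_full: float) -> float:
    """Slide formula: divides by the full-model coefficient."""
    return 100 * (b_reduced - b_full) / b_full

EXAMPLES = {
    "PRIORFRAC adjusted for HEIGHT": (1.06383, 1.013009),
    "GENDER adjusted for SPHEQ": (0.3664706, 0.5580827),
    "PRIORFRAC adjusted for AGE": (1.06383, 0.838835),
}

for label, (b_red, b_full) in EXAMPLES.items():
    print(label,
          round(pct_change_as_in_output(b_red, b_full), 2),
          round(delta_beta_hat_pct(b_red, b_full), 2))
```

Under either convention the first example stays below the 20% threshold and the other two exceed it in absolute value.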

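Computing the odds ratio when an interaction is present

The calculation proceeds as follows:

1) Build the logit model containing the interaction term, written separately for each level of the risk variable (two logits).

2) Compute the difference between the two logits.

3) Exponentiate the value obtained in step 2.

Let f be the risk variable and x the covariate. With two independent variables, the logit is written as

$$g(f, x) = \hat{y} = \ln\frac{p}{1-p} = \beta_0 + \beta_1 f + \beta_2 x + \beta_3 f x$$

where $F = f_1 = 1$ is the risk factor, $f_0 = 0$ is the reference, and x is the covariate. The odds ratio is

$$OR = \exp\left[\beta_1(f_1 - f_0) + \beta_3 x (f_1 - f_0)\right]$$

Confidence interval

$$\exp\left\{(\hat\beta_1 + \hat\beta_3 x) \pm Z_{1-\alpha/2}\;\widehat{SE}\left[\ln\widehat{OR}(F=1, F=0, X=x)\right]\right\}$$

$$\widehat{Var}\left[\ln\widehat{OR}(F=1, F=0, X=x)\right] = \widehat{Var}(\hat\beta_1) + x^2\,\widehat{Var}(\hat\beta_3) + 2x\,\widehat{Cov}(\hat\beta_1, \hat\beta_3)$$

Example with priorfrac as the risk variable and age as the covariate:

$$g(f_1, x) = g(priorfrac=1, age) = \hat\beta_0 + \hat\beta_1(1) + \hat\beta_2(age) + \hat\beta_3(1)(age)$$

$$g(f_0, x) = g(priorfrac=0, age) = \hat\beta_0 + \hat\beta_1(0) + \hat\beta_2(age) + \hat\beta_3(0)(age)$$

$$g(f_1, x) - g(f_0, x) = \left[\hat\beta_0 + \hat\beta_1 + \hat\beta_2(age) + \hat\beta_3(age)\right] - \left[\hat\beta_0 + \hat\beta_2(age)\right] = \hat\beta_1 + \hat\beta_3(age)$$

$$OR = \exp\left(\hat\beta_1 + \hat\beta_3(age)\right)$$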

logit fracture priorfrac age pa, noheader nolog
------------------------------------------------------------------------------
    fracture |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   priorfrac |   4.961339    1.81022     2.74   0.006     1.413372    8.509305
         age |   .0625149   .0154607     4.04   0.000     .0322124    .0928173
          pa |   -.057382   .0250141    -2.29   0.022    -.1064087   -.0083553
       _cons |  -5.689421    1.08408    -5.25   0.000    -7.814179   -3.564662
------------------------------------------------------------------------------

. vce

Covariance matrix of coefficients of logit model

             | fracture
        e(V) |  priorfrac         age          pa       _cons
-------------+------------------------------------------------
fracture     |
   priorfrac |  3.2768972
         age |  .01663278   .00023903
          pa | -.04491676  -.00023903    .0006257
       _cons | -1.1752302  -.01663278   .01663278   1.1752302

. di exp(4.961339 + (-.057382*(68.562)))
2.7929945

. lincom priorfrac + pa*(68.562) , or

 ( 1)  [fracture]priorfrac + 68.562*[fracture]pa = 0

------------------------------------------------------------------------------
    fracture | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   2.792988   .6784603     4.23   0.000     1.734998    4.496133
------------------------------------------------------------------------------

Ex. priorfrac (1 = yes, 0 = no); age = continuous (years)
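The `lincom` result can be approximated by hand from the coefficients and the `vce` entries. The sketch below is in Python and uses the age value 68.562 that appears in the `di` and `lincom` commands; because the printed covariance entries are rounded, the interval agrees with Stata only to about three significant digits.

```python
import math
from statistics import NormalDist

b_priorfrac, b_pa = 4.961339, -0.057382
var_priorfrac, var_pa = 3.2768972, 0.0006257
cov_priorfrac_pa = -0.04491676
age = 68.562  # value used in the di and lincom commands

# ln(OR) = beta_1 + beta_3 * age
log_or = b_priorfrac + b_pa * age

# Var[ln(OR)] = Var(b1) + age^2 * Var(b3) + 2 * age * Cov(b1, b3)
var_log_or = var_priorfrac + age ** 2 * var_pa + 2 * age * cov_priorfrac_pa
se_log_or = math.sqrt(var_log_or)

z = NormalDist().inv_cdf(0.975)
odds_ratio = math.exp(log_or)
ci_lower = math.exp(log_or - z * se_log_or)
ci_upper = math.exp(log_or + z * se_log_or)

print(round(odds_ratio, 3), round(ci_lower, 2), round(ci_upper, 2))
```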


Modeling Strategy

Two goals of mathematical modeling

(1) To obtain a valid estimate of the relationship between the explanatory variables and the response variable

(2) To obtain a good predictive model

Different strategies for different goals

- Prediction goal -> use computer algorithms: forward selection, backward elimination, stepwise, all possible

- Validity goal -> for etiologic research, standard computer algorithms are not appropriate because they do not account for the roles that variables play, such as confounders and effect modifiers (interaction)

Model-Building Strategies: Guidelines and Methods for Logistic Regression

Variable selection: the "most parsimonious model"

- Minimizes the number of variables in the model

- Is more likely to be numerically stable

- Is more easily generalized

Purposeful selection

Steps for analyzing a logistic regression model (Hosmer & Lemeshow, 2000; 2013; Bursac, 2008)

Step 1:-A univariable analysis of each independent variable.

-A careful univariable analysis of each variable

-Any variable whose univariable test has a p-value of

less than 0.25 should be included in the first

multivariable model.

Step 2: -Fit a multivariable model containing all covariates

identified for inclusion at step 1 and

-to assess the importance of each covariate using the

p-value of its Wald statistic.

-Variables that do not contribute at traditional levels of

significance should be eliminated & a new model fit.

-The newer, smaller model should be compared to the

old, larger model using the partial likelihood ratio test.

Step 3: Compare the values of the estimated coefficients in the
smaller model to their respective values from the larger
model.

-Any variable whose coefficient has changed markedly in magnitude

should be added back into the model as it is important in the sense of

providing a needed adjustment of the effect of the variables that remain

in the model.

-Cycle through steps 2 and 3 until it appears that all of the important

variables are included in the model and those excluded are clinically

and/or statistically unimportant.

- Hosmer et al. use the "delta-beta-hat-percent" as a measure of the
change in magnitude of the coefficients. They suggest a significant
change as > 20%.

$$\Delta\hat{\beta}\% = 100 \times \frac{\hat{\beta}_{\text{reduce}} - \hat{\beta}_{\text{full}}}{\hat{\beta}_{\text{full}}}$$

where $\hat{\beta}_{\text{reduce}}$ is the coefficient from the smaller model and
$\hat{\beta}_{\text{full}}$ is the coefficient from the larger model.

Step 4: Add each variable not selected in Step 1 to the model

obtained at the end of step 3, one at a time, and check its

significance either by the Wald statistic p-value or the

partial likelihood ratio test

if it is a categorical variable with more than 2

levels. This step is vital for identifying variables that, by

themselves, are not significantly related to the outcome but

make an important contribution in the presence of other

variables. Refer to the model at the end of Step 4 as the

preliminary main effects model.


Step 5: Once we have obtained a model that we feel contains the

essential variables, we examine more closely the variables

in the model. Refer to the model at the end of Step 5 as

the main effects model

Step 6: Once we have the main effects model, we check for

interactions among the variables in the model.

Only consider the statistical significance of interactions

and as such, they must contribute to the model at

traditional levels, such as 5% or even 1%.

Step 7: Before any model becomes the final model we must

assess its adequacy and check its fit.

[Flowchart: purposeful selection (Bursac et al., 2008)]

1. Fit a univariable model with each variable.
2. Create a data set with the covariates whose p is below the entry p-value.
3. Fit the multivariable model with these variables.
4. Identify the variable with the maximum p-value; if it exceeds the retention p-value, remove it.
5. Check the change in beta: if the change exceeds the threshold, keep the variable; if not, delete it. Evaluate the next variable.
6. Test the association of each variable originally not selected, adding it to the preliminary model.
7. Final main effects model.

The modeling strategy involves three stages:

(Kleinbaum & Klein, 2002)

(1) variable specification,

(2) interaction assessment, and

(3) confounding assessment followed by consideration

of precision.

Example: analysis of data from the University of Massachusetts AIDS
Research Unit (UMARU) Impact Study (UIS)

Variable   Description
id         ID number
age        Age at enrollment
beck       Beck Depression Score
ivhx       IV drug use history (1 = never, 2 = previous, 3 = recent)
ndrugtx    Number of prior drug treatments
race       Subject's race (0 = white, 1 = other)
treat      Treatment randomization (0 = short, 1 = long)
site       Treatment site (0 = A, 1 = B)
dfree      Returned to drug use (1 = remained drug free, 0 = otherwise)

Step 1: -A univariable analysis of each independent variable.
-Any variable whose univariable test has a p-value of
less than 0.25 should be included in the first
multivariable model.

- A careful univariable analysis of each variable
- Fit a univariable logistic regression (y = 0, 1) on every
independent variable.
- Nominal and ordinal variables: convert to dummy variables and
analyze with univariable logistic regression, examining the Wald test
and the likelihood ratio test, or analyze the contingency table with
the likelihood ratio chi-square or the Pearson chi-square.
- Continuous variables: analyze with univariable logistic regression,
examining the Wald test and the likelihood ratio test.
- Carry forward variables whose univariable (crude) analysis gives
p-value < .25 (Hosmer & Lemeshow 2000: p.118; p-value 0.15-0.20)
- Variables entered in the model should be important (clinically or
biologically meaningful) or otherwise justified.


. logit dfree

Iteration 0:   log likelihood = -326.86446

Logistic regression                               Number of obs   =        575
                                                  LR chi2(0)      =      -0.00
                                                  Prob > chi2     =          .
Log likelihood = -326.86446                       Pseudo R2       =    -0.0000
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -1.068691    .095599   -11.18   0.000    -1.256061     -.88132
------------------------------------------------------------------------------

. estimates store A

. logit dfree age

Iteration 0:   log likelihood = -326.86446
Iteration 1:   log likelihood = -326.16602
Iteration 2:   log likelihood = -326.16544

Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =       1.40
                                                  Prob > chi2     =     0.2371
Log likelihood = -326.16544                       Pseudo R2       =     0.0021
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181723    .015344     1.18   0.236    -.0119014     .048246
       _cons |  -1.660226   .5110844    -3.25   0.001    -2.661933   -.6585194
------------------------------------------------------------------------------

. logit dfree age, or
...
Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =       1.40
                                                  Prob > chi2     =     0.2371
Log likelihood = -326.16544                       Pseudo R2       =     0.0021
------------------------------------------------------------------------------
       dfree | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.018338   .0156254     1.18   0.236     .9881691    1.049429
------------------------------------------------------------------------------

. estimates store B

. lrtest A B

Likelihood-ratio test                                  LR chi2(1)  =      1.40
(Assumption: A nested in B)                            Prob > chi2 =    0.2371

. lincom 10*age, or

 ( 1)  10 age = 0
------------------------------------------------------------------------------
       dfree | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.199282    .184018     1.18   0.236      .887795    1.620055
------------------------------------------------------------------------------

*** odds ratio for a 10 year increase in AGE
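The `lincom 10*age, or` result can be checked by hand: the odds ratio for a c-unit increase is exp(c × coefficient), and the Wald interval is obtained by scaling the coefficient's interval before exponentiating. A small Python sketch under that reading is shown below; the helper name is hypothetical and the inputs are the age coefficient and standard error from the `logit dfree age` output.

```python
import math

def or_for_increase(beta, se, c, z=1.959964):
    """Odds ratio and Wald 95% CI for a c-unit increase in a continuous covariate."""
    estimate = math.exp(c * beta)
    lower = math.exp(c * (beta - z * se))
    upper = math.exp(c * (beta + z * se))
    return estimate, lower, upper

# Coefficient and standard error of age from the univariable model
or10, lower10, upper10 = or_for_increase(beta=0.0181723, se=0.015344, c=10)
print(round(or10, 3), round(lower10, 3), round(upper10, 3))
```

The same function with `c=5` and the beck coefficient gives the 5-point odds ratio computed later with `lincom 5*beck, or`.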

. logit dfree beck
…
Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =       0.64
                                                  Prob > chi2     =     0.4250
Log likelihood = -326.54621                       Pseudo R2       =     0.0010
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        beck |   -.008225   .0103428    -0.80   0.426    -.0284965    .0120464
       _cons |  -.9272829   .2003166    -4.63   0.000    -1.319896   -.5346696
------------------------------------------------------------------------------

. estimates store C

. lrtest A C

Likelihood-ratio test                                  LR chi2(1)  =      0.64
(Assumption: A nested in C)                            Prob > chi2 =    0.4250

. logit dfree beck, or
…
Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =       0.64
                                                  Prob > chi2     =     0.4250
Log likelihood = -326.54621                       Pseudo R2       =     0.0010
------------------------------------------------------------------------------
       dfree | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        beck |   .9918087    .010258    -0.80   0.426     .9719057    1.012119
------------------------------------------------------------------------------

. lincom 5*beck, or

 ( 1)  5 beck = 0
------------------------------------------------------------------------------
       dfree | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |    .959709   .0496302    -0.80   0.426     .8672027    1.062083
------------------------------------------------------------------------------

*** odds ratio for a 5 point increase in BECK

. logit dfree ndrugtx
…
Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =      11.84
                                                  Prob > chi2     =     0.0006
Log likelihood = -320.94485                       Pseudo R2       =     0.0181
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     ndrugtx |  -.0749582    .024681    -3.04   0.002     -.123332   -.0265844
       _cons |  -.7677805    .130326    -5.89   0.000    -1.023215   -.5123462
------------------------------------------------------------------------------

. logit dfree ndrugtx, or
...
Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =      11.84
                                                  Prob > chi2     =     0.0006
Log likelihood = -320.94485                       Pseudo R2       =     0.0181
------------------------------------------------------------------------------
       dfree | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     ndrugtx |   .9277822   .0228986    -3.04   0.002     .8839701    .9737658
------------------------------------------------------------------------------

. estimates store D

. lrtest A D

Likelihood-ratio test                                  LR chi2(1)  =     11.84
(Assumption: A nested in D)                            Prob > chi2 =    0.0006

. xi:logit dfree i.ivhx
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)
…
Logistic regression                               Number of obs   =        575
                                                  LR chi2(2)      =      13.35
                                                  Prob > chi2     =     0.0013
Log likelihood = -320.18821                       Pseudo R2       =     0.0204
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    _Iivhx_2 |  -.4810199   .2657063    -1.81   0.070    -1.001795    .0397548
    _Iivhx_3 |  -.7748382   .2165765    -3.58   0.000     -1.19932   -.3503561
       _cons |  -.6797242   .1417395    -4.80   0.000    -.9575285   -.4019198
------------------------------------------------------------------------------

. xi:logit dfree i.ivhx, or
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)
...
Logistic regression                               Number of obs   =        575
                                                  LR chi2(2)      =      13.35
                                                  Prob > chi2     =     0.0013
Log likelihood = -320.18821                       Pseudo R2       =     0.0204
------------------------------------------------------------------------------
       dfree | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    _Iivhx_2 |   .6181526    .164247    -1.81   0.070     .3672198    1.040556
    _Iivhx_3 |   .4607783   .0997937    -3.58   0.000      .301399    .7044372
------------------------------------------------------------------------------

. estimates store E

. lrtest A E

Likelihood-ratio test                                  LR chi2(2)  =     13.35
(Assumption: A nested in E)                            Prob > chi2 =    0.0013
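Each `lrtest A X` line compares the constant-only model A with a one-covariate model through G = -2 × (log likelihood of A − log likelihood of X). The sketch below recomputes two of these tests from the log likelihoods printed above. It is a plain-Python illustration, not Stata's implementation, and it only covers df = 1 and df = 2, where the chi-square upper tail has a simple closed form; the function name is hypothetical.

```python
import math

def lr_test(ll_reduced, ll_full, df):
    """Likelihood ratio statistic G and its chi-square p-value (df = 1 or 2 only)."""
    g = -2.0 * (ll_reduced - ll_full)
    if df == 1:
        p = math.erfc(math.sqrt(g / 2.0))   # upper tail of chi-square with 1 df
    elif df == 2:
        p = math.exp(-g / 2.0)              # upper tail of chi-square with 2 df
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return g, p

ll_null = -326.86446                                   # model A: constant only
g_age, p_age = lr_test(ll_null, -326.16544, df=1)      # model B: age
g_ivhx, p_ivhx = lr_test(ll_null, -320.18821, df=2)    # model E: i.ivhx (two dummies)
print(round(g_age, 2), round(p_age, 4))
print(round(g_ivhx, 2), round(p_ivhx, 4))
```

The ivhx test has 2 degrees of freedom because the three-level variable enters as two dummy variables.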


. logit dfree race
...
Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =       4.62
                                                  Prob > chi2     =     0.0315
Log likelihood = -324.55269                       Pseudo R2       =     0.0071
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |   .4591026   .2109763     2.18   0.030     .0455967    .8726085
       _cons |  -1.193922   .1141504   -10.46   0.000    -1.417653   -.9701919
------------------------------------------------------------------------------

. logit dfree race, or
...
Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =       4.62
                                                  Prob > chi2     =     0.0315
Log likelihood = -324.55269                       Pseudo R2       =     0.0071
------------------------------------------------------------------------------
       dfree | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |   1.582653   .3339022     2.18   0.030     1.046652    2.393145
------------------------------------------------------------------------------

. estimates store F

. lrtest A F

Likelihood-ratio test                                  LR chi2(1)  =      4.62
(Assumption: A nested in F)                            Prob > chi2 =    0.0315

. logit dfree treat
...
Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =       5.18
                                                  Prob > chi2     =     0.0229
Log likelihood = -324.27534                       Pseudo R2       =     0.0079
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       treat |    .437162   .1930633     2.26   0.024     .0587649    .8155591
       _cons |  -1.297816    .143296    -9.06   0.000    -1.578671   -1.016961
------------------------------------------------------------------------------

. logit dfree treat, or
...
Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =       5.18
                                                  Prob > chi2     =     0.0229
Log likelihood = -324.27534                       Pseudo R2       =     0.0079
------------------------------------------------------------------------------
       dfree | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       treat |   1.548307   .2989212     2.26   0.024     1.060526    2.260439
------------------------------------------------------------------------------

. estimates store G

. lrtest A G

Likelihood-ratio test                                  LR chi2(1)  =      5.18
(Assumption: A nested in G)                            Prob > chi2 =    0.0229

. logit dfree site
...
Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =       1.67
                                                  Prob > chi2     =     0.1968
Log likelihood = -326.0315                        Pseudo R2       =     0.0025
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        site |   .2642236   .2034167     1.30   0.194    -.1344658     .662913
       _cons |   -1.15268   .1170732    -9.85   0.000    -1.382139   -.9232202
------------------------------------------------------------------------------

. logit dfree site, or
...
Logistic regression                               Number of obs   =        575
                                                  LR chi2(1)      =       1.67
                                                  Prob > chi2     =     0.1968
Log likelihood = -326.0315                        Pseudo R2       =     0.0025
------------------------------------------------------------------------------
       dfree | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        site |   1.302419   .2649338     1.30   0.194     .8741828    1.940437
------------------------------------------------------------------------------

. estimates store H

. lrtest A H

Likelihood-ratio test                                  LR chi2(1)  =      1.67
(Assumption: A nested in H)                            Prob > chi2 =    0.1968

Table: simple logistic regression analysis

Variable   Coefficient   SE       OR      95% CI       Likelihood ratio   p value
age           0.018      0.0153   1.018   0.99, 1.05         1.40         0.2371
beck         -0.008      0.0103   0.992   0.97, 1.01         0.64         0.4250
ndrugtx      -0.075      0.0247   0.928   0.88, 0.97        11.84         0.0006
ivhx2        -0.481      0.2657   0.618   0.37, 1.04        13.35         0.0013
ivhx3        -0.775      0.2166   0.460   0.30, 0.70
race          0.459      0.2109   1.583   1.05, 2.39         4.62         0.0315
treat         0.437      0.1931   1.548   1.06, 2.26         5.18         0.0229
site          0.264      0.2034   1.302   0.87, 1.94         1.67         0.197

(The likelihood ratio and p value for ivhx are for the two dummy variables jointly.)

The variable beck has a p value of 0.426, so beck is dropped from
the analysis.

- Judge which variables are important from the Wald statistic.
- Any variable with p value > 0.25 is removed from the model.
- However, a variable with p value > 0.25 may still be kept in the model,
for example when it is known to be an important factor to control for,
or when there are other compelling reasons to retain it.
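The Step 1 screening rule can be written out as a short filter. The sketch below applies the p < 0.25 entry criterion to the likelihood ratio p-values from the simple logistic regression outputs above. The function name and the `force_keep` argument are hypothetical; `force_keep` stands in for the rule that a clinically important variable may be retained regardless of its p-value.

```python
# Likelihood ratio p-values from the univariable (crude) analyses
lr_p_values = {
    "age": 0.2371, "beck": 0.4250, "ndrugtx": 0.0006, "ivhx": 0.0013,
    "race": 0.0315, "treat": 0.0229, "site": 0.1968,
}
ENTRY_P = 0.25  # entry threshold used in Step 1

def screen(p_values, threshold=ENTRY_P, force_keep=()):
    """Return variables that pass the entry threshold or are kept on substantive grounds."""
    return [name for name, p in p_values.items()
            if p < threshold or name in force_keep]

candidates = screen(lr_p_values)
print(candidates)
```

Only beck fails the screen here, which matches the decision on the slide to drop it before fitting the first multivariable model.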

2. Fit a multivariable model containing all covariates

identified for inclusion at step 1 & to assess the importance

of each covariate using the p-value of its Wald statistic.

. use "K:\hosmer_data\logistic\uis.dta", clear

. xi:logit dfree age ndrugtx i.ivhx race treat site
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)

Iteration 0:   log likelihood = -326.86446
Iteration 1:   log likelihood = -310.17928
Iteration 2:   log likelihood = -309.62871
Iteration 3:   log likelihood = -309.62413
Iteration 4:   log likelihood = -309.62413

Logistic regression                               Number of obs   =        575
                                                  LR chi2(7)      =      34.48
                                                  Prob > chi2     =     0.0000
Log likelihood = -309.62413                       Pseudo R2       =     0.0527
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0503708   .0173224     2.91   0.004     .0164196     .084322
     ndrugtx |  -.0615121   .0256311    -2.40   0.016    -.1117481   -.0112761
    _Iivhx_2 |  -.6033296   .2872511    -2.10   0.036    -1.166331   -.0403278
    _Iivhx_3 |   -.732722    .252329    -2.90   0.004    -1.227278   -.2381662
        race |   .2261295   .2233399     1.01   0.311    -.2116087    .6638677
       treat |   .4425031   .1992909     2.22   0.026     .0519002    .8331061
        site |   .1485845   .2172121     0.68   0.494    -.2771434    .5743125
       _cons |  -2.405405   .5548058    -4.34   0.000    -3.492805   -1.318006
------------------------------------------------------------------------------


- Examine the p value of the Wald statistic for every variable.
- race has a p value of 0.311.
- site has a p value of 0.494.
- Previous studies identify race as an important factor to control for,
and site is the randomization variable for the study sites, so both
variables are kept in the model even though their p value > 0.25.

* The p-values here come from the Wald statistic. When the number of
observations in each outcome group is not adequate for the number of
variables in the model, the likelihood ratio statistic is recommended.

- Assess how strongly a confounder influences the other variables.
- Judge this from the change in the coefficients, called the
"delta-beta-hat-percent" or $\Delta\hat{\beta}\%$
(Hosmer, et al. 2003, 2013)
- a significant change as a delta-beta-hat-percent > 20%.

$$\Delta\hat{\beta}\% = 100 \times \frac{\hat{\beta}_{\text{reduce}} - \hat{\beta}_{\text{full}}}{\hat{\beta}_{\text{full}}}$$

Step 3: Compare the values of the estimated coefficients in the
smaller model to their respective values from the larger
model.

- Some studies recommend using the change in effect estimates, such as
the odds ratio, instead,
- because coefficients are on the log odds ratio scale, and it is not yet
clear what a given change in a coefficient means
(Kleinbaum & Klein, 2002, p 215).
- For the change in the odds ratio, a cutoff of 10% is recommended: a larger
change suggests that the variable influences the main effect variable
(Kleinbaum, Kupper, Morgenstern, 1982;
Greenland, 1989; Mickey & Greenland, 1989).

* Bechand & Hosmer (1999): a change in the coefficient does not always
indicate that the variable is a confounder.

- Hosmer et al. (2000, 2013) use

$$\Delta\hat{\beta}\% = 100 \times \frac{\hat{\beta}_{\text{reduce}} - \hat{\beta}_{\text{full}}}{\hat{\beta}_{\text{full}}}$$

. xi:logit dfree age ndrugtx i.ivhx race treat
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)
...
Logistic regression                               Number of obs   =        575
                                                  LR chi2(6)      =      34.02
                                                  Prob > chi2     =     0.0000
Log likelihood = -309.8567                        Pseudo R2       =     0.0520
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0509605    .017309     2.94   0.003     .0170354    .0848856
     ndrugtx |  -.0631998   .0256525    -2.46   0.014    -.1134778   -.0129219
    _Iivhx_2 |  -.5928725   .2864333    -2.07   0.038    -1.154272   -.0314735
    _Iivhx_3 |  -.7600441   .2489941    -3.05   0.002    -1.248064   -.2720245
        race |   .2081089    .221453     0.94   0.347    -.2259309    .6421488
       treat |    .438959   .1991429     2.20   0.028     .0486461     .829272
       _cons |  -2.355786   .5501049    -4.28   0.000    -3.433972     -1.2776
------------------------------------------------------------------------------

- Here site is removed (only to illustrate the calculation, since
site is an important variable).

Variables   Full model   Reduce model   Delta beta hat (%)
age           0.05037      0.05096          1.17072
ndrugtx      -0.06151     -0.06320          2.74369
_Iivhx_2     -0.60333     -0.59287         -1.73323
_Iivhx_3     -0.73272     -0.76004          3.72885
race          0.22613      0.20811         -7.96915
site          0.14859      -                -
treat         0.44250      0.43896         -0.80092

$$\text{Delta Beta hat}\,(\%) = \frac{\hat{\beta}_{\text{reduce model}} - \hat{\beta}_{\text{full model}}}{\hat{\beta}_{\text{full model}}} \times 100$$

- When the change is < 20%, the variable can be removed.
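The delta-beta-hat-percent column can be reproduced from the two Stata outputs above (the full model with site and the reduced model without it). The following Python sketch does that calculation; the dictionary and function names are hypothetical, while the coefficients are copied from the outputs and the 20% cutoff is the one quoted from Hosmer et al.

```python
# Coefficients from the full model (with site) and the reduced model (site removed)
full = {"age": 0.0503708, "ndrugtx": -0.0615121, "_Iivhx_2": -0.6033296,
        "_Iivhx_3": -0.732722, "race": 0.2261295, "treat": 0.4425031}
reduced = {"age": 0.0509605, "ndrugtx": -0.0631998, "_Iivhx_2": -0.5928725,
           "_Iivhx_3": -0.7600441, "race": 0.2081089, "treat": 0.438959}

def delta_beta_hat_pct(b_reduced, b_full):
    """Percent change in a coefficient when moving from the full to the reduced model."""
    return 100.0 * (b_reduced - b_full) / b_full

changes = {name: delta_beta_hat_pct(reduced[name], full[name]) for name in full}
for name, pct in changes.items():
    print(f"{name:10s} {pct:8.3f}")

largest_change = max(abs(pct) for pct in changes.values())
print("removal acceptable by the 20% rule:", largest_change < 20.0)
```

The largest change is for race (about -8%), still under the 20% cutoff, so on this criterion alone dropping site would not distort the remaining coefficients.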

Change in the coefficient / effect estimate (EE) of the main effect variable (treat)

Variable removed   Coefficient   Standard error   Odds ratio   95% CI odds ratio     % change coefficient   % change EE
Adjusted all       .4425031      .1992909         1.556599     1.053271, 2.300453
-Age               .4116853      .1974029         1.509359     1.025092, 2.222400       -6.964                -3.035
-ndrugtx           .4312865      .1981066         1.539236     1.043943, 2.269520       -2.535                -1.115
-race              .4553245      .1987689         1.576685     1.067954, 2.327755        2.897                 1.290
-site              .4389590      .1991429         1.551092     1.049849, 2.291650       -0.801                -0.354
-ivhx2             .4376198      .1969050         1.549016     1.053052, 2.278566       -1.1036               -0.487
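The "% change EE" column applies the same idea on the odds ratio scale, where the slides quote a 10% cutoff. A brief Python sketch follows, using the treat coefficients from the table above; the names are hypothetical and the 10% threshold is the one cited from Kleinbaum and colleagues.

```python
import math

b_treat_full = 0.4425031   # treat coefficient, adjusted for all covariates
# treat coefficient after removing one covariate at a time (from the table)
b_treat_without = {"age": 0.4116853, "ndrugtx": 0.4312865, "race": 0.4553245,
                   "site": 0.4389590, "ivhx": 0.4376198}

def pct_change_or(b_reduced, b_full):
    """Percent change in the odds ratio exp(b) between reduced and full models."""
    or_full = math.exp(b_full)
    return 100.0 * (math.exp(b_reduced) - or_full) / or_full

or_changes = {name: pct_change_or(b, b_treat_full) for name, b in b_treat_without.items()}
flagged = [name for name, pct in or_changes.items() if abs(pct) > 10.0]
for name, pct in or_changes.items():
    print(f"-{name:8s} {pct:7.3f}")
print("covariates changing the treat OR by more than 10%:", flagged)
```

No covariate moves the treat odds ratio by more than about 3%, so none is flagged as a confounder of the treatment effect by the 10% rule.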


. xi:logit dfree age ndrugtx i.ivhx race treat site beck
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)

Iteration 0:   log likelihood = -326.86446
Iteration 1:   log likelihood = -310.17972
Iteration 2:   log likelihood = -309.62533
Iteration 3:   log likelihood =  -309.6238
Iteration 4:   log likelihood =  -309.6238

Logistic regression                               Number of obs   =        575
                                                  LR chi2(8)      =      34.48
                                                  Prob > chi2     =     0.0000
Log likelihood = -309.6238                        Pseudo R2       =     0.0527
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0504143   .0174058     2.90   0.004     .0162995     .084529
     ndrugtx |  -.0615329   .0256457    -2.40   0.016    -.1117975   -.0112682
    _Iivhx_2 |  -.6036962   .2875987    -2.10   0.036    -1.167379   -.0400131
    _Iivhx_3 |  -.7336591   .2549904    -2.88   0.004    -1.233431   -.2338871
        race |   .2260262   .2233692     1.01   0.312    -.2117694    .6638218
       treat |   .4424802   .1992933     2.22   0.026     .0518725     .833088
        site |   .1489209   .2176073     0.68   0.494    -.2775816    .5754234
        beck |   .0002759   .0107983     0.03   0.980    -.0208883    .0214402
       _cons |  -2.411128   .5983465    -4.03   0.000    -3.583866   -1.238391
------------------------------------------------------------------------------

Step 4: Add each variable not selected in Step 1 to the model

Step 5: Examine more closely the variables in the model.

Refer to the model at the end of Step 5 as

the main effects model

Checking linearity of continuous variables

Methods
-Plot the smoothed logit against the continuous variable
-Plot the coefficients against the continuous variable, after dividing the
continuous variable into 4 groups by quartiles
-Fractional polynomial

. do "G:\hosmer_data\logistic\plot_smooth_logit_age.do"

. lowess dfree age, gen(var3) logit nodraw

. graph twoway line var3 age, sort xlabel(20(10)50 56)

-Plot the coefficients against the continuous variable, after dividing the continuous variable into 4 groups by quartiles

. xtile age1 = age, nq(4)

. tabstat age, statistics(median) by(age1) columns(variables)

Summary statistics: p50
  by categories of: age1 (4 quantiles of age)

    age1 |       age
---------+----------
       1 |        25
       2 |        30
       3 |        35
       4 |        40
---------+----------
   Total |        32
--------------------

. xi: logit dfree i.age1 ndrugtx i.ivhx race treat site
...
Logistic regression                               Number of obs   =        575
                                                  LR chi2(9)      =      34.69
                                                  Prob > chi2     =     0.0001
Log likelihood = -309.52103                       Pseudo R2       =     0.0531
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    _Iage1_2 |   -.165864   .2909137    -0.57   0.569    -.7360444    .4043163
    _Iage1_3 |   .4693399     .27066     1.73   0.083    -.0611439    .9998237
    _Iage1_4 |    .595771   .3124964     1.91   0.057    -.0167108    1.208253
     ndrugtx |  -.0587551   .0254688    -2.31   0.021     -.108673   -.0088371
    _Iivhx_2 |  -.5545193   .2853626    -1.94   0.052     -1.11382    .0047811
    _Iivhx_3 |  -.6725536   .2518601    -2.67   0.008     -1.16619   -.1789169
        race |   .2787172   .2238499     1.25   0.213    -.1600205    .7174549
       treat |   .4430577   .2000427     2.21   0.027     .0509812    .8351343
        site |   .1582001   .2188293     0.72   0.470    -.2706974    .5870976
       _cons |  -1.054837   .2705875    -3.90   0.000    -1.585179   -.5244956
------------------------------------------------------------------------------

. clear

. input age coef

           age       coef
  1. 25 0
  2. 30 -.165864
  3. 35 .4693399
  4. 40 .595771
  5. end

. graph twoway scatter coef age, connect(l) ylabel(-.25(.25).75) xlabel(20(10)50) yline(0)


Fractional polynomial analysis

-Fractional polynomial modeling builds a model relating the outcome
variable to an independent variable measured on a continuous or
ordinal scale. It was introduced by Royston & Altman (1994).

-When a variable is not linear, in other words when the model has a
non-linear relationship, transform the variable by raising it to
some power.

-The resulting models are named by order, for example first-order
fractional polynomial or FP1, etc.

Fractional polynomial analysis

-Transform any x into x^p, choosing the best-fitting power p from
-2, -1, -0.5, 0, 0.5, 1, 2, 3

-When p = 0, x^p is taken to be log x. This group therefore has
8 possible transformations.

-A second-order fractional polynomial, or FP2, transforms x using a
pair of powers (p1, p2) from the same set. This group has 36
possible pairs (72 powers in all); together with the 8 FP1 powers
this gives the 44 models that fracpoly reports fitting.

Powers of first- and second-order fractional polynomials

FP1        FP2             FP2             FP2
power p    powers p1 p2    powers p1 p2    powers p1 p2
 -2         -2    -2        -1     1         0     2
 -1         -2    -1        -1     2         0     3
 -0.5       -2    -0.5      -1     3         0.5   0.5
  0         -2     0        -0.5  -0.5       0.5   1
  0.5       -2     0.5      -0.5   0         0.5   2
  1         -2     1        -0.5   0.5       0.5   3
  2         -2     2        -0.5   1         1     1
  3         -2     3        -0.5   2         1     2
            -1    -1        -0.5   3         1     3
            -1    -0.5       0     0         2     2
            -1     0         0     0.5       2     3
            -1     0.5       0     1         3     3

First order (FP1): 8 powers
Second order (FP2): 36 power pairs (72 powers)
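The set of candidate models can be enumerated directly from the eight powers. The sketch below is a plain-Python illustration and not the fracpoly implementation: it lists the FP1 powers and the unordered FP2 pairs (repeats allowed), and includes a small helper, with a hypothetical name, for the convention that power 0 means log x.

```python
import math
from itertools import combinations_with_replacement

POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

fp1_models = [(p,) for p in POWERS]                          # one power each
fp2_models = list(combinations_with_replacement(POWERS, 2))  # unordered pairs, repeats allowed

def fp_term(x, p):
    """x raised to power p, with p = 0 read as log(x); x must be positive."""
    return math.log(x) if p == 0 else x ** p

print(len(fp1_models), len(fp2_models), len(fp1_models) + len(fp2_models))
print(fp2_models[:3])
```

The counts come out as 8 and 36, for 44 candidate models in total, which agrees with the "among 44 models fit" line in the degree(2) output.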

Transforming continuous variables for fractional polynomials:
scaling and/or centering

-When a continuous variable is not linear, it can be transformed for
fractional polynomial modeling in several ways, for example by
scaling and/or by centering.

Fractional polynomial with scaling

-Scaling can be done in several ways. Stata, for example, defines
the scale as follows:

lrange = log10[max(x) - min(x)]

scale = 10^(sign(lrange) × int(|lrange|))

x∗ = x/scale
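The scaling rule is easy to trace numerically. The Python sketch below implements the three formulas as written on the slide. The age range of 20 to 56 is an assumption read off the axis labels of the smoothed-logit plot command, not a value stated in the data description, and the function name is hypothetical.

```python
import math

def fp_scale(x_min, x_max):
    """Scale factor 10**(sign(lrange) * int(|lrange|)) with lrange = log10(max - min)."""
    lrange = math.log10(x_max - x_min)
    sign = (lrange > 0) - (lrange < 0)      # +1, 0 or -1
    return 10.0 ** (sign * int(abs(lrange)))

age_min, age_max = 20, 56                   # assumed range of age (from the plot's x-axis labels)
scale = fp_scale(age_min, age_max)
print(scale)                                # age would be analyzed as age / scale
```

With this range the scale is 10, consistent with the "where: X = age/10" line that fracpoly prints.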

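Fractional polynomial with centering

-For example, the transformed variable $x_1$ is written $x_1^{*}$.

-For an FP1 transformation, the model

$$\hat{y}_i = \beta_0 + \beta_1 x_1$$

becomes

$$\hat{y}_i = \beta_0^{*} + \beta_1^{*}\left(x_1^{*p} - \bar{x}_1^{*p}\right)$$

where

$$\bar{x}_1^{*} = \frac{1}{n}\sum_{i=1}^{n} x_{1i}^{*}$$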

Fractional polynomial analysis

-Models are chosen by the difference in deviance between the
models being compared, as follows:

$$G(1, p_1) = -2\{L(1) - L(p_1)\} \quad (df = 1)$$

$$G[p_1, (p_1, p_2)] = -2\{L(p_1) - L(p_1, p_2)\} \quad (df = 2)$$

-The deviance difference approximately follows a chi-square
distribution with df = df(model 2) - df(model 1)


-However, fractional polynomial analysis makes interpretation
difficult. One remedy is to group the continuous variable
appropriately, guided by theory and previous research. Cut points based
on the median or quartiles call for caution: categorizing a continuous
variable can lead to erroneous conclusions
(Heinzl H., 2000; Royston P., Altman D.G., Sauerbrei W., 2006)

Fractional polynomial

. use "H:\hosmer_data\logistic\uis.dta", clear

. xi:fracpoly logit dfree age ndrugtx i.ivhx race treat site, degree(2) compare
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)
-> gen double Indru__1 = ndrugtx-4.542608696 if e(sample)
........
-> gen double Iage__1 = X^-2-.0953622163 if e(sample)
-> gen double Iage__2 = X^3-33.95748331 if e(sample)
   (where: X = age/10)

Iteration 0:   log likelihood = -326.86446
…
Iteration 4:   log likelihood = -309.38436

Logistic regression                               Number of obs   =        575
                                                  LR chi2(8)      =      34.96
                                                  Prob > chi2     =     0.0000
Log likelihood = -309.38436                       Pseudo R2       =     0.0535
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     Iage__1 |  -1.538626   4.575934    -0.34   0.737    -10.50729     7.43004
     Iage__2 |   .0116581   .0080977     1.44   0.150    -.0042132    .0275293
    Indru__1 |  -.0620596   .0257223    -2.41   0.016    -.1124744   -.0116447
    _Iivhx_2 |  -.6057376   .2881578    -2.10   0.036    -1.170517   -.0409587
    _Iivhx_3 |  -.7263554   .2525832    -2.88   0.004    -1.221409   -.2313014
        race |   .2282107    .224089     1.02   0.308    -.2109957    .6674171
       treat |   .4392589   .1996983     2.20   0.028     .0478573    .8306604
        site |   .1459101    .217491     0.67   0.502    -.2803644    .5721846
       _cons |  -1.082342   .2416317    -4.48   0.000    -1.555931   -.6087524
------------------------------------------------------------------------------
Deviance: 618.77. Best powers of age among 44 models fit: -2 3.

1. With power = 1, that is, age entered as linear: comparing the model with
age against the model without age (p-value = .003; df = 1-0)

2. Comparing linear age with (age^-2 and age^3): not significant
(Dev. dif. = 619.248-618.769 = 0.480; p-value = 0.923; df = 4-1)

3. Comparing age^3 with (age^-2 and age^3): not significant
(Dev. dif. = 618.882-618.769 = 0.113; p-value = 0.945; df = 4-2)

First order (FP1) Second order (FP2)

Fractional polynomial model comparisons:
---------------------------------------------------------------
age              df       Deviance     Dev. dif.   P (*)   Powers
---------------------------------------------------------------
Not in model      0        627.801      9.032     0.060
Linear            1        619.248      0.480     0.923   1
m = 1             2        618.882      0.114     0.945   3
m = 2             4        618.769         --        --   -2 3
---------------------------------------------------------------
(*) P-value from deviance difference comparing reported model with m = 2 model

. di chiprob(4-1,619.248-618.769)

.9234802

. di chiprob(4-2,618.882-618.769)

.94506648
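The two `di chiprob(...)` results are upper-tail chi-square probabilities of the deviance differences. As a cross-check that does not rely on Stata, the sketch below computes the same tail probability for integer degrees of freedom using only the Python standard library; the function name is hypothetical and the deviances are the rounded values from the comparison table.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with integer df exceeds x), using closed forms for integer df."""
    half = x / 2.0
    if df % 2 == 0:
        series = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * series
    series = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                 for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series

dev_linear, dev_fp1, dev_fp2 = 619.248, 618.882, 618.769
p_linear_vs_fp2 = chi2_upper_tail(dev_linear - dev_fp2, 4 - 1)
p_fp1_vs_fp2 = chi2_upper_tail(dev_fp1 - dev_fp2, 4 - 2)
p_linear_vs_fp1 = chi2_upper_tail(dev_linear - dev_fp1, 2 - 1)
print(round(p_linear_vs_fp2, 4), round(p_fp1_vs_fp2, 4), round(p_linear_vs_fp1, 4))
```

All three p-values are far above 0.05, so neither the FP1 nor the FP2 transformation of age improves on the linear term.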

Deciding whether any fractional polynomial model is better than
the linear model

$$G[1, (p_1, p_2)] = -2\{L(1) - L(p_1, p_2)\} = 619.248 - 618.769 = 0.480; \quad p\text{-value} = 0.923$$

Choose the linear model


G(1, p1) = -2{L(1) - L(p1)} = 619.248 - 618.882 = 0.366; p-value = 0.545

STATA10

First order m=1 (FP1) Second order m=2 (FP2)

. use "I:\hosmer_data\logistic\uis.dta", clear

. xi:fracpoly logit dfree age ndrugtx i.ivhx race treat site, degree(1) compare
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)
-> gen double Indru__1 = ndrugtx-4.542608696 if e(sample)
-> gen double Iage__1 = X^3-33.95748331 if e(sample)
   (where: X = age/10)

Iteration 0:   log likelihood = -326.86446
...
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     Iage__1 |   .0138939   .0046486     2.99   0.003     .0047829     .023005
    Indru__1 |  -.0620649   .0257325    -2.41   0.016    -.1124997   -.0116301
    _Iivhx_2 |  -.5960999   .2868616    -2.08   0.038    -1.158338   -.0338615
    _Iivhx_3 |   -.714141   .2499592    -2.86   0.004    -1.204052     -.22423
        race |   .2355037   .2230028     1.06   0.291    -.2015736    .6725811
       treat |   .4348659   .1992503     2.18   0.029     .0443425    .8253893
        site |   .1436801   .2173756     0.66   0.509    -.2823683    .5697285
       _cons |  -1.113293   .2236989    -4.98   0.000    -1.551734   -.6748509
------------------------------------------------------------------------------
Deviance: 618.88. Best powers of age among 8 models fit: 3.

Fractional polynomial model comparisons:
---------------------------------------------------------------
age              df       Deviance     Dev. dif.   P (*)   Powers
---------------------------------------------------------------
Not in model      0        627.801      8.918     0.012
Linear            1        619.248      0.366     0.545   1
m = 1             2        618.882         --        --   3
---------------------------------------------------------------
(*) P-value from deviance difference comparing reported model with m = 1 model

. di chiprob(2-1,619.248-618.882)

.54519273
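The `chiprob` result above can be checked by hand. For one degree of freedom, the upper-tail chi-square probability has the closed form erfc(sqrt(G/2)). The following is a minimal Python sketch of that check; the function name `chi2_sf_df1` is hypothetical and is not a Stata or library function.

```python
import math

def chi2_sf_df1(g):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(g / 2.0))

# Deviance difference between the linear model and the FP1 (m = 1) model
g = 619.248 - 618.882
p_value = chi2_sf_df1(g)
print(f"G = {g:.3f}, p-value = {p_value:.4f}")
```

The p-value is well above 0.05, so the FP1 transformation of age does not improve on the linear term.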

. xi: fracpoly logit dfree age ndrugtx i.ivhx race treat site, degree(2) compare
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)
-> gen double Indru__1 = ndrugtx-4.542608696 if e(sample)
........
-> gen double Iage__1 = X^-2-.0953622163 if e(sample)
-> gen double Iage__2 = X^3-33.95748331 if e(sample)
   (where: X = age/10)

Iteration 0:   log likelihood = -326.86446
Iteration 1:   log likelihood = -309.95259
Iteration 2:   log likelihood = -309.38924
Iteration 3:   log likelihood = -309.38436
Iteration 4:   log likelihood = -309.38436

Logistic regression                               Number of obs   =        575
                                                  LR chi2(8)      =      34.96
                                                  Prob > chi2     =     0.0000
Log likelihood = -309.38436                       Pseudo R2       =     0.0535
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     Iage__1 |  -1.538626   4.575934    -0.34   0.737    -10.50729     7.43004
     Iage__2 |   .0116581   .0080977     1.44   0.150    -.0042132    .0275293
    Indru__1 |  -.0620596   .0257223    -2.41   0.016    -.1124744   -.0116447
    _Iivhx_2 |  -.6057376   .2881578    -2.10   0.036    -1.170517   -.0409587
    _Iivhx_3 |  -.7263554   .2525832    -2.88   0.004    -1.221409   -.2313014
        race |   .2282107    .224089     1.02   0.308    -.2109957    .6674171
       treat |   .4392589   .1996983     2.20   0.028     .0478573    .8306604
        site |   .1459101    .217491     0.67   0.502    -.2803644    .5721846
       _cons |  -1.082342   .2416317    -4.48   0.000    -1.555931   -.6087524
------------------------------------------------------------------------------
Deviance: 618.77. Best powers of age among 44 models fit: -2 3.

STATA10


Fractional polynomial model comparisons:
---------------------------------------------------------------
age             df     Deviance    Dev. dif.   P (*)   Powers
---------------------------------------------------------------
Not in model     0     627.801      9.032      0.060
Linear           1     619.248      0.480      0.923   1
m = 1            2     618.882      0.114      0.945   3
m = 2            4     618.769       --         --     -2 3
---------------------------------------------------------------
(*) P-value from deviance difference comparing reported model with m = 2 model
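The P (*) column in this comparison table can be reproduced from the "Dev. dif." column: each deviance difference is referred to a chi-square distribution whose degrees of freedom equal the df of the m = 2 model (4) minus the df of the model being compared. The sketch below is a stdlib-only Python illustration; `chi2_sf` is a hypothetical helper written for this example (closed-form series for integer df), not the routine Stata uses internally.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability for a positive integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # even df: exp(-h) * sum_{k=0}^{df/2 - 1} h^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= h / k
            total += term
        return math.exp(-h) * total
    # odd df: erfc(sqrt(h)) + exp(-h) * sum_{k=1}^{(df-1)/2} h^(k-1/2) / Gamma(k+1/2)
    total = math.erfc(math.sqrt(h))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * h ** (k - 0.5) / math.gamma(k + 0.5)
    return total

FP2_DF = 4
# model name -> (df of that model, deviance difference versus the m = 2 model)
comparisons = {"Not in model": (0, 9.032), "Linear": (1, 0.480), "m = 1": (2, 0.114)}

p_values = {}
for name, (df, dev_dif) in comparisons.items():
    p_values[name] = chi2_sf(dev_dif, FP2_DF - df)
    print(f"{name:<13} df diff = {FP2_DF - df}  Dev. dif. = {dev_dif:6.3f}  P = {p_values[name]:.3f}")
```

None of the comparisons against the m = 2 model is significant at 0.05 for linear or m = 1, which is why age is kept as a linear term.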

Variable: ndrugtx
. lowess dfree ndrugtx, gen(var2) logit nodraw
. graph twoway line var2 ndrugtx, sort xlabel(20(10)50 56)

. xi: fracpoly logit dfree ndrugtx age i.ivhx race treat site, degree(2) compare
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)
-> gen double Iage__1 = age-32.3826087 if e(sample)
........
-> gen double Indru__1 = X^-1-1.804204581 if e(sample)
-> gen double Indru__2 = X^-1*ln(X)+1.064696882 if e(sample)
   (where: X = (ndrugtx+1)/10)
…
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    Indru__1 |    .981453   .2888487     3.40   0.001     .4153199    1.547586
    Indru__2 |   .3611251   .1098594     3.29   0.001     .1458047    .5764455
     Iage__1 |   .0544455   .0174877     3.11   0.002     .0201702    .0887208
    _Iivhx_2 |  -.6088269   .2911069    -2.09   0.036    -1.179386   -.0382679
    _Iivhx_3 |  -.7238122   .2555649    -2.83   0.005     -1.22471   -.2229142
        race |   .2477026   .2242156     1.10   0.269    -.1917519    .6871571
       treat |   .4223666   .2003655     2.11   0.035     .0296574    .8150759
        site |   .1732142   .2209763     0.78   0.433    -.2598915    .6063198
       _cons |  -1.164471   .2454825    -4.74   0.000    -1.645608   -.6833343
------------------------------------------------------------------------------
Deviance: 613.45. Best powers of ndrugtx among 44 models fit: -1 -1.

Fractional polynomial model comparisons:
---------------------------------------------------------------
ndrugtx         df     Deviance    Dev. dif.   P (*)   Powers
---------------------------------------------------------------
Not in model     0     626.176     12.725      0.013
Linear           1     619.248      5.797      0.122   1
m = 1            2     618.818      5.367      0.068   .5
m = 2            4     613.451       --         --     -1 -1
---------------------------------------------------------------
(*) P-value from deviance difference comparing reported model with m = 2 model

Variable: ndrugtx
. xi: mfp logit dfree ndrugtx age i.ivhx race treat site
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)

Deviance for model with all terms untransformed = 619.248, 575 observations

Variable     Model (vs.)   Deviance  Dev diff.   P      Powers   (vs.)
----------------------------------------------------------------------
age          lin.   FP2     619.248     0.480  0.923    1        -2 3
             Final          619.248                     1

[_Iivhx_3 included with 1 df in model]

ndrugtx      lin.   FP2     619.248     5.797  0.122    1        -1 -1
             Final          619.248                     1

[treat included with 1 df in model]

[_Iivhx_2 included with 1 df in model]

[race included with 1 df in model]

[site included with 1 df in model]

Using the multivariable fractional polynomial command (mfp)

Fractional polynomial fitting algorithm converged after 1 cycle.

Transformations of covariates:

-> gen double Indru__1 = ndrugtx-4.542608696 if e(sample)
-> gen double Iage__1 = age-32.3826087 if e(sample)

Final multivariable fractional polynomial model for dfree
--------------------------------------------------------------------
    Variable |    -----Initial-----          -----Final-----
             |   df     Select   Alpha    Status    df    Powers
-------------+------------------------------------------------------
     ndrugtx |    4     1.0000   0.0500     in      1     1
         age |    4     1.0000   0.0500     in      1     1
    _Iivhx_2 |    1     1.0000   0.0500     in      1     1
    _Iivhx_3 |    1     1.0000   0.0500     in      1     1
        race |    1     1.0000   0.0500     in      1     1
       treat |    1     1.0000   0.0500     in      1     1
        site |    1     1.0000   0.0500     in      1     1
--------------------------------------------------------------------

Logistic regression                               Number of obs   =        575
                                                  LR chi2(7)      =      34.48
                                                  Prob > chi2     =     0.0000
Log likelihood = -309.62413                       Pseudo R2       =     0.0527
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    Indru__1 |  -.0615121   .0256311    -2.40   0.016    -.1117481   -.0112761
     Iage__1 |   .0503708   .0173224     2.91   0.004     .0164196     .084322
    _Iivhx_2 |  -.6033296   .2872511    -2.10   0.036    -1.166331   -.0403278
    _Iivhx_3 |   -.732722    .252329    -2.90   0.004    -1.227278   -.2381662
        race |   .2261295   .2233399     1.01   0.311    -.2116087    .6638677
       treat |   .4425031   .1992909     2.22   0.026     .0519002    .8331061
        site |   .1485845   .2172121     0.68   0.494    -.2771434    .5743125
       _cons |  -1.053693   .2264488    -4.65   0.000    -1.497524   -.6098613
------------------------------------------------------------------------------
Deviance: 619.248.


Step 6: Check for interactions among the variables in the

model. Only consider the statistical significance of

interactions and as such, they must contribute to

the model at traditional levels, such as 5% or even 1%.

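Creating interaction terms

- Start from the model containing the main effects (from Step 5).

- When a higher-order interaction term is added, all of its lower-order component terms must also be in the model. Such a model is called a "hierarchically well-formulated (HWF) model".

- For example, with a third-order term:

$$\operatorname{logit} P(X) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_2 + \beta_5 x_1 x_3 + \beta_6 x_2 x_3 + \beta_7 x_1 x_2 x_3$$

The following model is incorrect, because the lower-order terms x1*x2 and x1*x3 are missing:

$$\operatorname{logit} P(X) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_2 x_3 + \beta_5 x_1 x_2 x_3$$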
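The hierarchy rule can be checked mechanically: every non-empty proper subset of the variables in an interaction term must itself appear as a term. Below is a minimal Python sketch of such a check; the function name and the tuple-based representation of terms are assumptions made for this illustration.

```python
from itertools import combinations

def is_hierarchically_well_formulated(terms):
    """Return True if every lower-order component of each term is in the model."""
    present = {frozenset(term) for term in terms}
    for term in present:
        for size in range(1, len(term)):
            for subset in combinations(sorted(term), size):
                if frozenset(subset) not in present:
                    return False
    return True

# Model with all two-way terms below the three-way term
full_model = [("x1",), ("x2",), ("x3",),
              ("x1", "x2"), ("x1", "x3"), ("x2", "x3"),
              ("x1", "x2", "x3")]

# Model that includes x1*x2*x3 but omits x1*x2 and x1*x3
bad_model = [("x1",), ("x2",), ("x3",),
             ("x2", "x3"),
             ("x1", "x2", "x3")]

print(is_hierarchically_well_formulated(full_model))
print(is_hierarchically_well_formulated(bad_model))
```

The first model passes the check and the second fails, matching the two equations shown.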

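Interaction assessment

- Test using the Wald Z-test

- Test using the likelihood ratio test

$$\text{Wald test:}\quad Z = \frac{\hat{\beta}_i}{se(\hat{\beta}_i)}$$

$$\text{LR test:}\quad LR = (-2\ln L_{reduced}) - (-2\ln L_{full}) = -2\,(\ln L_{reduced} - \ln L_{full})$$

$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$$

$$H_0: \beta_3 = 0; \qquad H_1: \beta_3 \neq 0$$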

. gen tage = treat*age

. logit dfree treat age tage

Iteration 0:   log likelihood = -326.86446
Iteration 1:   log likelihood = -322.31165
Iteration 2:   log likelihood = -322.26464
Iteration 3:   log likelihood = -322.26464

Logistic regression                               Number of obs   =        575
                                                  LR chi2(3)      =       9.20
                                                  Prob > chi2     =     0.0268
Log likelihood = -322.26464                       Pseudo R2       =     0.0141
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       treat |  -1.123388   1.042136    -1.08   0.281    -3.165936    .9191606
         age |  -.0077915   .0238604    -0.33   0.744    -.0545571    .0389741
        tage |   .0480969   .0314183     1.53   0.126    -.0134819    .1096756
       _cons |  -1.043996   .7884888    -1.32   0.185    -2.589406    .5014138
------------------------------------------------------------------------------
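The Wald test for the interaction term `tage` in this output can be recomputed from its coefficient and standard error. The Python sketch below does so using only the standard library; `wald_test` is a hypothetical helper name, and the 95% interval assumes the usual normal quantile 1.959964.

```python
import math

def wald_test(coef, se):
    """Wald z statistic, two-sided normal p-value, and 95% confidence interval."""
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # 2 * (1 - Phi(|z|))
    half_width = 1.959964 * se
    return z, p, (coef - half_width, coef + half_width)

# Coefficient and standard error of tage (treat x age) from the output above
z, p, ci = wald_test(0.0480969, 0.0314183)
print(f"z = {z:.2f}, p = {p:.3f}, 95% CI = ({ci[0]:.7f}, {ci[1]:.7f})")
```

Because the interval contains 0 and the p-value exceeds 0.05, the treat x age interaction is not significant in this three-variable model.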

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3
. di "Log likelihood = " e(ll)
Log likelihood = -306.72558
. estimates store A

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 age_ndrugfp1
. estimates store B1
. di "Log likelihood = " e(ll)
Log likelihood = -302.87416
. lrtest A B1
Likelihood-ratio test                                  LR chi2(1)  =      7.70
(Assumption: A nested in B1)                           Prob > chi2 =    0.0055

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 age_ndrugfp2
. estimates store B2
. di "Log likelihood = " e(ll)
Log likelihood = -303.03684
. lrtest A B2
Likelihood-ratio test                                  LR chi2(1)  =      7.38
(Assumption: A nested in B2)                           Prob > chi2 =    0.0066

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 ageivhx2
. estimates store C
. di "Log likelihood = " e(ll)
Log likelihood = -306.36027
. lrtest A C
Likelihood-ratio test                                  LR chi2(1)  =      0.73
(Assumption: A nested in C)                            Prob > chi2 =    0.3927

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 ageivhx3
. estimates store D
. di "Log likelihood = " e(ll)
Log likelihood = -306.68672
. lrtest A D
Likelihood-ratio test                                  LR chi2(1)  =      0.08
(Assumption: A nested in D)                            Prob > chi2 =    0.7804

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 agerace
. estimates store E
. di "Log likelihood = " e(ll)
Log likelihood = -306.6269
. lrtest A E
Likelihood-ratio test                                  LR chi2(1)  =      0.20
(Assumption: A nested in E)                            Prob > chi2 =    0.6569

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 agetreat
. estimates store F
. di "Log likelihood = " e(ll)
Log likelihood = -305.34312
. lrtest A F
Likelihood-ratio test                                  LR chi2(1)  =      2.76
(Assumption: A nested in F)                            Prob > chi2 =    0.0964

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 agesite
. estimates store G
. di "Log likelihood = " e(ll)
Log likelihood = -305.92657
. lrtest A G
Likelihood-ratio test                                  LR chi2(1)  =      1.60
(Assumption: A nested in G)                            Prob > chi2 =    0.2062

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 ndrugfp1_ivhx2
. estimates store F11
. di "Log likelihood = " e(ll)
Log likelihood = -305.61857
. lrtest A F11
Likelihood-ratio test                                  LR chi2(1)  =      2.21
(Assumption: A nested in F11)                          Prob > chi2 =    0.1368

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 ndrugfp1_ivhx3
. estimates store F12
. di "Log likelihood = " e(ll)
Log likelihood = -306.66329
. lrtest A F12
Likelihood-ratio test                                  LR chi2(1)  =      0.12
(Assumption: A nested in F12)                          Prob > chi2 =    0.7241


. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 ndrugfp1_race
. estimates store F13
. di "Log likelihood = " e(ll)
Log likelihood = -306.32029
. lrtest A F13
Likelihood-ratio test                                  LR chi2(1)  =      0.81
(Assumption: A nested in F13)                          Prob > chi2 =    0.3679

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 ndrugfp1_treat
. estimates store F14
. di "Log likelihood = " e(ll)
Log likelihood = -305.28879
. lrtest A F14
Likelihood-ratio test                                  LR chi2(1)  =      2.87
(Assumption: A nested in F14)                          Prob > chi2 =    0.0900

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 ndrugfp2_ivhx2
. estimates store F21
. di "Log likelihood = " e(ll)
Log likelihood = -305.43893
. lrtest A F21
Likelihood-ratio test                                  LR chi2(1)  =      2.57
(Assumption: A nested in F21)                          Prob > chi2 =    0.1087

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 ndrugfp2_ivhx3
. estimates store F22
. di "Log likelihood = " e(ll)
Log likelihood = -306.71318
. lrtest A F22
Likelihood-ratio test                                  LR chi2(1)  =      0.02
(Assumption: A nested in F22)                          Prob > chi2 =    0.8749

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 ndrugfp2_race
. estimates store F23
. di "Log likelihood = " e(ll)
Log likelihood = -306.49124
. lrtest A F23
Likelihood-ratio test                                  LR chi2(1)  =      0.47
(Assumption: A nested in F23)                          Prob > chi2 =    0.4936

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 ndrugfp2_treat
. estimates store F24
. di "Log likelihood = " e(ll)
Log likelihood = -305.25896
. lrtest A F24
Likelihood-ratio test                                  LR chi2(1)  =      2.93
(Assumption: A nested in F24)                          Prob > chi2 =    0.0868

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 raceivhx2
. estimates store M
. di "Log likelihood = " e(ll)
Log likelihood = -306.27003
. lrtest A M
Likelihood-ratio test                                  LR chi2(1)  =      0.91
(Assumption: A nested in M)                            Prob > chi2 =    0.3398

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 raceivhx3
. estimates store N
. di "Log likelihood = " e(ll)
Log likelihood = -306.04202
. lrtest A N
Likelihood-ratio test                                  LR chi2(1)  =      1.37
(Assumption: A nested in N)                            Prob > chi2 =    0.2423

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 racetreat
. estimates store O
. di "Log likelihood = " e(ll)
Log likelihood = -306.25412
. lrtest A O
Likelihood-ratio test                                  LR chi2(1)  =      0.94
(Assumption: A nested in O)                            Prob > chi2 =    0.3315

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 racesite
. estimates store P
. di "Log likelihood = " e(ll)
Log likelihood = -302.45334
. lrtest A P
Likelihood-ratio test                                  LR chi2(1)  =      8.54
(Assumption: A nested in P)                            Prob > chi2 =    0.0035

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 treativhx2
. estimates store Q
. di "Log likelihood = " e(ll)
Log likelihood = -306.70711
. lrtest A Q
Likelihood-ratio test                                  LR chi2(1)  =      0.04
(Assumption: A nested in Q)                            Prob > chi2 =    0.8476

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 treativhx3
. estimates store R
. di "Log likelihood = " e(ll)
Log likelihood = -306.72555
. lrtest A R
Likelihood-ratio test                                  LR chi2(1)  =      0.00
(Assumption: A nested in R)                            Prob > chi2 =    0.9941

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 treatsite
. estimates store S
. di "Log likelihood = " e(ll)
Log likelihood = -306.70871
. lrtest A S
Likelihood-ratio test                                  LR chi2(1)  =      0.03
(Assumption: A nested in S)                            Prob > chi2 =    0.8543

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 siteivhx2
. estimates store T
. di "Log likelihood = " e(ll)
Log likelihood = -306.63454
. lrtest A T
Likelihood-ratio test                                  LR chi2(1)  =      0.18
(Assumption: A nested in T)                            Prob > chi2 =    0.6696

. qui logit dfree age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 siteivhx3
. estimates store U
. di "Log likelihood = " e(ll)
Log likelihood = -306.30032
. lrtest A U
Likelihood-ratio test                                  LR chi2(1)  =      0.85
(Assumption: A nested in U)                            Prob > chi2 =    0.3564

Interaction               Log likelihood       G      df    P value
-------------------------------------------------------------------
Main effects model          -306.72558
age x ndrugfp1              -302.87416       7.70      1    0.0055
age x ndrugfp2              -303.03684       7.38      1    0.0066
age x _Iivhx_2              -306.36027       0.73      1    0.3927
age x _Iivhx_3              -306.68672       0.08      1    0.7804
age x race                  -306.6269        0.20      1    0.6569
age x treat                 -305.34312       2.76      1    0.0964
age x site                  -305.92657       1.60      1    0.2062
ndrugfp1 x _Iivhx_2         -305.61857       2.21      1    0.1368


Interaction               Log likelihood       G      df    P value
-------------------------------------------------------------------
ndrugfp1 x _Iivhx_3         -306.66329       0.12      1    0.7241
ndrugfp1 x race             -306.32029       0.81      1    0.3679
ndrugfp1 x treat            -305.28879       2.87      1    0.0900
ndrugfp1 x site             -306.72423       0.002     1    0.9586
ndrugfp2 x _Iivhx_2         -305.43893       2.57      1    0.1087
ndrugfp2 x _Iivhx_3         -306.71318       0.02      1    0.8749
ndrugfp2 x race             -306.49124       0.47      1    0.4936
ndrugfp2 x treat            -305.25896       2.93      1    0.0868
ndrugfp2 x site             -306.72408       0.003     1    0.9563
race x _Iivhx_2             -306.27003       0.91      1    0.3398
race x _Iivhx_3             -306.04202       1.37      1    0.2423
race x treat                -306.25412       0.94      1    0.3315
race x site                 -302.45334       8.54      1    0.0035
treat x _Iivhx_2            -306.70711       0.04      1    0.8476
treat x _Iivhx_3            -306.72555       0.00005   1    0.9941
treat x site                -306.70871       0.03      1    0.8543
site x _Iivhx_2             -306.63454       0.18      1    0.6696
site x _Iivhx_3             -306.30032       0.85      1    0.3564
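Each G in the summary table is the likelihood ratio statistic G = -2(ln L_reduced - ln L_full), where the reduced model is the main-effects model stored as A (log likelihood -306.72558 in the `lrtest` output) and the full model adds one interaction term. The Python sketch below recomputes G and its 1-df p-value for a handful of the interactions and screens them at the 0.10 level; the helper name `lr_test` and the choice of rows are assumptions made for this illustration.

```python
import math

LL_MAIN = -306.72558   # log likelihood of stored model A (main effects only)

# Log likelihoods of selected models with one added interaction term
interaction_ll = {
    "age_ndrugfp1":   -302.87416,
    "age_ndrugfp2":   -303.03684,
    "agetreat":       -305.34312,
    "agesite":        -305.92657,
    "ndrugfp1_treat": -305.28879,
    "racesite":       -302.45334,
    "treatsite":      -306.70871,
}

def lr_test(ll_reduced, ll_full):
    """Likelihood ratio statistic and its p-value on 1 degree of freedom."""
    g = -2.0 * (ll_reduced - ll_full)
    return g, math.erfc(math.sqrt(g / 2.0))

results = {name: lr_test(LL_MAIN, ll) for name, ll in interaction_ll.items()}
candidates = [name for name, (g, p) in results.items() if p < 0.10]

for name, (g, p) in results.items():
    print(f"{name:<15} G = {g:5.2f}  p = {p:.4f}")
print("Candidates at 0.10:", candidates)
```

For these rows the screen returns the same five interactions that are carried into the next model.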

Interaction terms are considered for entry into the model using a p-value criterion at the 0.10 significance level. The terms selected are age*ndrugfp1, age*ndrugfp2, age*treat, ndrugfp1*treat, and race*site.

. xi: logit dfree age ndrugfp1 ndrugfp2 race treat site i.ivhx age_ndrugfp1 age_ndrugfp2 agetreat ndrugfp1_treat racesite
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)
…
Logistic regression                               Number of obs   =        575
                                                  LR chi2(13)     =      59.70
                                                  Prob > chi2     =     0.0000
Log likelihood = -297.01266                       Pseudo R2       =     0.0913
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1265633   .0749318     1.69   0.091    -.0203003    .2734269
    ndrugfp1 |   2.526072   1.596445     1.58   0.114    -.6029022    5.655047
    ndrugfp2 |    .744546   .6043186     1.23   0.218    -.4398966    1.928989
        race |   .6905325   .2667743     2.59   0.010     .1676645    1.213401
       treat |  -.6315667   1.236548    -0.51   0.610    -3.055156    1.792022
        site |   .4927464   .2565283     1.92   0.055    -.0100398    .9955326
    _Iivhx_2 |  -.6062768   .3000219    -2.02   0.043    -1.194309   -.0182447
    _Iivhx_3 |  -.6767542   .2629918    -2.57   0.010    -1.192209   -.1612997
age_ndrugfp1 |  -.0382959    .046105    -0.83   0.406      -.12866    .0520682
age_ndrugfp2 |  -.0088257   .0177002    -0.50   0.618    -.0435174     .025866
    agetreat |     .04181   .0338704     1.23   0.217    -.0245747    .1081947
ndrugfp1_t~t |   -.077965   .0731851    -1.07   0.287    -.2214052    .0654751
    racesite |   -1.34883   .5353968    -2.52   0.012    -2.398188   -.2994715
       _cons |  -7.458669    2.69444    -2.77   0.006    -12.73968   -2.177663
------------------------------------------------------------------------------

The term age x ndrugfp2 has a Wald statistic of -0.50 and the largest p-value (0.618). Remove it from the model and refit.

. xi: logit dfree age ndrugfp1 ndrugfp2 race treat site i.ivhx age_ndrugfp1 agetreat ndrugfp1_treat racesite
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)
…
Logistic regression                               Number of obs   =        575
                                                  LR chi2(12)     =      59.46
                                                  Prob > chi2     =     0.0000
Log likelihood = -297.1368                        Pseudo R2       =     0.0909
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0935724   .0349887     2.67   0.007     .0249957     .162149
    ndrugfp1 |    1.76005    .414738     4.24   0.000     .9471787    2.572922
    ndrugfp2 |   .4497105   .1176999     3.82   0.000     .2190229     .680398
        race |   .6782462   .2654307     2.56   0.011     .1580116    1.198481
       treat |  -.6081425   1.238307    -0.49   0.623     -3.03518    1.818895
        site |   .4852708   .2561725     1.89   0.058    -.0168181    .9873598
    _Iivhx_2 |  -.6123348   .2997954    -2.04   0.041    -1.199923   -.0247467
    _Iivhx_3 |  -.6811741   .2627758    -2.59   0.010    -1.196205    -.166143
age_ndrugfp1 |  -.0155282   .0061056    -2.54   0.011    -.0274949   -.0035615
    agetreat |   .0412879   .0339743     1.22   0.224    -.0253006    .1078764
ndrugfp1_t~t |  -.0783593   .0732363    -1.07   0.285    -.2218998    .0651812
    racesite |  -1.333799     .53443    -2.50   0.013    -2.381263   -.2863356
       _cons |  -6.319684   1.403203    -4.50   0.000    -9.069911   -3.569457
------------------------------------------------------------------------------

The term ndrugfp1 x treat has a Wald statistic of -1.07 and p-value = 0.285, the largest among the interaction terms in this model. Remove it from the model and refit.

. xi: logit dfree age ndrugfp1 ndrugfp2 race treat site i.ivhx age_ndrugfp1 agetreat racesite
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)
…
Logistic regression                               Number of obs   =        575
                                                  LR chi2(11)     =      58.31
                                                  Prob > chi2     =     0.0000
Log likelihood = -297.71139                       Pseudo R2       =     0.0892
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0889238   .0339956     2.62   0.009     .0222937    .1555539
    ndrugfp1 |   1.705601   .4106322     4.15   0.000     .9007769    2.510426
    ndrugfp2 |   .4440587   .1175928     3.78   0.000      .213581    .6745364
        race |   .6869266    .265402     2.59   0.010     .1667483    1.207105
       treat |  -1.252787   1.080874    -1.16   0.246    -3.371262    .8656875
        site |   .4903829   .2560081     1.92   0.055    -.0113838    .9921497
    _Iivhx_2 |  -.6299072   .2994363    -2.10   0.035    -1.216792   -.0430227
    _Iivhx_3 |   -.694879    .262544    -2.65   0.008    -1.209456   -.1803021
age_ndrugfp1 |  -.0155328   .0060924    -2.55   0.011    -.0274737   -.0035918
    agetreat |   .0515973   .0325362     1.59   0.113    -.0121726    .1153672
    racesite |  -1.401606   .5309161    -2.64   0.008    -2.442183   -.3610301
       _cons |  -5.976921   1.338859    -4.46   0.000    -8.601036   -3.352807
------------------------------------------------------------------------------

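The term age x treat has a Wald statistic of 1.59 and p-value = 0.113, the largest among the interaction terms in this model. Remove it from the model and refit.

** If a p-value criterion of 0.25 is used instead, this term is retained in the model.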

. xi: logit dfree age ndrugfp1 ndrugfp2 race treat site i.ivhx age_ndrugfp1 racesite
i.ivhx            _Iivhx_1-3          (naturally coded; _Iivhx_1 omitted)
…
Logistic regression                               Number of obs   =        575
                                                  LR chi2(10)     =      55.77
                                                  Prob > chi2     =     0.0000
Log likelihood = -298.98146                       Pseudo R2       =     0.0853
------------------------------------------------------------------------------
       dfree |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1166385   .0288749     4.04   0.000     .0600446    .1732323
    ndrugfp1 |   1.669035    .407152     4.10   0.000      .871032    2.467038
    ndrugfp2 |   .4336886   .1169052     3.71   0.000     .2045586    .6628185
        race |   .6841068   .2641355     2.59   0.010     .1664107    1.201803
       treat |   .4349255   .2037596     2.13   0.033      .035564     .834287
        site |    .516201   .2548881     2.03   0.043     .0166295    1.015773
    _Iivhx_2 |  -.6346307   .2987192    -2.12   0.034    -1.220109   -.0491518
    _Iivhx_3 |  -.7049475   .2615805    -2.69   0.007    -1.217636   -.1922591
age_ndrugfp1 |  -.0152697   .0060268    -2.53   0.011    -.0270819   -.0034575
    racesite |  -1.429457   .5297806    -2.70   0.007    -2.467808   -.3911062
       _cons |  -6.843864   1.219316    -5.61   0.000     -9.23368   -4.454048
------------------------------------------------------------------------------

- Every variable in the model now has p-value < 0.05. Take this model forward to assess the adequacy of its fit.
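The backward elimination of interaction terms followed above can be summarised as a simple rule: at each step, find the interaction term with the largest Wald p-value and drop it if that p-value exceeds the chosen criterion (0.05 here). The Python sketch below applies that rule to the interaction p-values read off the four fitted models; the function name `next_to_drop` is hypothetical, and the refitting itself is not reproduced, only the selection rule.

```python
def next_to_drop(p_values, alpha=0.05):
    """Return the term with the largest p-value if it exceeds alpha, else None."""
    if not p_values:
        return None
    worst = max(p_values, key=p_values.get)
    return worst if p_values[worst] > alpha else None

# Wald p-values of the interaction terms in each successive model
steps = [
    {"age_ndrugfp1": 0.406, "age_ndrugfp2": 0.618, "agetreat": 0.217,
     "ndrugfp1_treat": 0.287, "racesite": 0.012},
    {"age_ndrugfp1": 0.011, "agetreat": 0.224, "ndrugfp1_treat": 0.285,
     "racesite": 0.013},
    {"age_ndrugfp1": 0.011, "agetreat": 0.113, "racesite": 0.008},
    {"age_ndrugfp1": 0.011, "racesite": 0.007},
]

dropped = [next_to_drop(step) for step in steps]
print(dropped)
```

With `alpha=0.25` the third step would return `None`, which corresponds to the note that age x treat is retained under a 0.25 criterion.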


Step 7: Before any model becomes the final model we must

assess its adequacy and check its fit.

Computation and evaluation of overall measures of fit

- Pearson Chi-Square

- Hosmer-Lemeshow Test

- Classification Table

- Area Under the Receiver Operating Characteristic Curve

(ROC)

- Examination of other measures (R2)

Logistic Regression Diagnostics

Assessment of fit via External validation

Next---> for Detailed

Cautions when fitting a logistic regression

- Collinearity or multicollinearity: high correlation among the independent variables, which makes the coefficients unstable.

Remedies: ridge logistic regression, dropping variables, or constructing new variables.

- influential observation (outliers)

- Zero cell or Sparse data

- Problem of perfect or complete separation

- Overdispersion

Collinearity*

The independent variables are highly correlated with one another (r2 > 0.90; r > 0.95; Kleinbaum, Muller, Nizam, 1998, p. 241).

Adding or removing a variable changes the coefficients in magnitude and/or sign.

R2 is high, yet the statistical tests of the individual coefficients are not significant.

Standard errors are inflated, which lowers test statistics such as t and z and widens the confidence intervals of the coefficients.

*Thai term taken from the Royal Institute Dictionary of Mathematics, B.E. 2552 (2009).

. twocat 98 1 1 98

. logit y x1 x2 x3
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .2995896   1.429618     0.21   0.834    -2.502411     3.10159
          x2 |  -.0143819   1.429593    -0.01   0.992    -2.816334     2.78757
          x3 |   .3139715   .2886275     1.09   0.277    -.2517281    .8796711
       _cons |  -.3670144   .2425088    -1.51   0.130    -.8423228    .1082941
------------------------------------------------------------------------------

. logit y x1 x3
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .2854983   .2861025     1.00   0.318    -.2752523    .8462489
          x3 |   .3136786   .2871556     1.09   0.275     -.249136    .8764931
       _cons |  -.3670266   .2425058    -1.51   0.130    -.8423293    .1082761
------------------------------------------------------------------------------

. logit y x2 x3
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |    .279187   .2860666     0.98   0.329    -.2814933    .8398672
          x3 |   .3079032   .2871195     1.07   0.284    -.2548407    .8706471
       _cons |  -.3612278   .2408548    -1.50   0.134    -.8332944    .1108389
------------------------------------------------------------------------------

. corr x1 x2 x3
(obs=198)

             |       x1       x2       x3
-------------+---------------------------
          x1 |   1.0000
          x2 |   0.9798   1.0000
          x3 |   0.0000   0.0203   1.0000

. collin x1 x2 x3

Collinearity Diagnostics

                        SQRT                   R-
  Variable      VIF     VIF    Tolerance    Squared
----------------------------------------------------
        x1     25.25    5.03    0.0396      0.9604
        x2     25.26    5.03    0.0396      0.9604
        x3      1.01    1.01    0.9897      0.0103
----------------------------------------------------
  Mean VIF     17.17

                           Cond
        Eigenval          Index
---------------------------------
    1     3.0422          1.0000
    2     0.6908          2.0985
    3     0.2570          3.4408
    4     0.0100         17.4460
---------------------------------
 Condition Number        17.4460
 Eigenvalues & Cond Index computed from scaled raw sscp (w/ intercept)
 Det(correlation matrix)    0.0396
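The VIFs and the determinant reported by `collin` for x1, x2, x3 can be recovered from the `corr` output alone, because each VIF is the corresponding diagonal element of the inverse of the correlation matrix. The Python sketch below uses the closed-form inverse of a 3 x 3 correlation matrix; the function name `vifs_3x3` is an assumption for this example, and small differences from the Stata values come from the correlations being rounded to four decimals.

```python
def vifs_3x3(r12, r13, r23):
    """Determinant of a 3x3 correlation matrix and the VIF of each variable.

    VIF_j is the j-th diagonal element of the inverse correlation matrix,
    i.e. the (j, j) cofactor divided by the determinant.
    """
    det = 1.0 - r12**2 - r13**2 - r23**2 + 2.0 * r12 * r13 * r23
    vifs = [(1.0 - r23**2) / det,   # x1
            (1.0 - r13**2) / det,   # x2
            (1.0 - r12**2) / det]   # x3
    return det, vifs

# Correlations taken from the corr output above
det, vifs = vifs_3x3(r12=0.9798, r13=0.0000, r23=0.0203)
print(f"Det(correlation matrix) = {det:.4f}")
for name, v in zip(("x1", "x2", "x3"), vifs):
    print(f"{name}: VIF = {v:.2f}, tolerance = {1.0 / v:.4f}")
```

The single large correlation between x1 and x2 is enough to push both of their VIFs above 25, while x3 stays near 1.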

Checking for collinearity or multicollinearity

Pearson correlation (informal method)

- Examine the correlations among all variables using the Pearson correlation, and look for variables that are highly correlated with the others.

. corr age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 age_ndrugfp1 racesite
(obs=575)

             |      age ndrugfp1 ndrugfp2     race    treat     site _Iivhx_2 _Iivhx_3 age_nd~1 racesite
-------------+------------------------------------------------------------------------------------------
         age |   1.0000
    ndrugfp1 |  -0.1836   1.0000
    ndrugfp2 |   0.1601  -0.9916   1.0000
        race |   0.0139   0.0874  -0.0821   1.0000
       treat |  -0.0446   0.0251  -0.0204   0.0791   1.0000
        site |  -0.0287   0.1923  -0.1926  -0.0795  -0.0230   1.0000
    _Iivhx_2 |   0.1063  -0.0551   0.0567  -0.0152   0.0513   0.1623   1.0000
    _Iivhx_3 |   0.2674  -0.3045   0.2843  -0.1806  -0.0695  -0.2292  -0.4138   1.0000
age_ndrugfp1 |   0.0462   0.9546  -0.9475   0.1080   0.0108   0.1833  -0.0134  -0.2506   1.0000
    racesite |   0.0430   0.1831  -0.1834   0.4384   0.0522   0.3849  -0.0303  -0.1295   0.2055   1.0000


Variance Inflation Factors (VIF: formal method)

A VIF > 10 together with a mean VIF greater than 1 indicates a multicollinearity problem.

. collin age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 age_ndrugfp1 racesite

Collinearity Diagnostics

                          SQRT                   R-
    Variable      VIF     VIF    Tolerance    Squared
------------------------------------------------------
         age      2.64    1.63    0.3782      0.6218
    ndrugfp1    105.68   10.28    0.0095      0.9905
    ndrugfp2     63.77    7.99    0.0157      0.9843
        race      1.43    1.20    0.6969      0.3031
       treat      1.02    1.01    0.9831      0.0169
        site      1.41    1.19    0.7090      0.2910
    _Iivhx_2      1.39    1.18    0.7201      0.2799
    _Iivhx_3      1.65    1.28    0.6061      0.3939
age_ndrugfp1     27.55    5.25    0.0363      0.9637
    racesite      1.64    1.28    0.6109      0.3891
------------------------------------------------------
    Mean VIF     20.82

- Generalized variance inflation factor (GVIF)

How is the VIF calculated?

   r      r2      VIF
  .1     0.01     1.01
  .2     0.04     1.04
  .3     0.09     1.10
  .4     0.16     1.19
  .5     0.25     1.33
  .6     0.36     1.56
  .7     0.49     1.96
  .8     0.64     2.78
  .9     0.81     5.26
  .91    0.83     5.82
  .92    0.85     6.51
  .93    0.86     7.40
  .94    0.88     8.59
  .95    0.90    10.26
  .96    0.92    12.76
  .97    0.94    16.92
  .98    0.96    25.25
  .99    0.98    50.25
 1       1.00      .
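The VIF column of this table follows directly from VIF = 1 / (1 - r^2). A minimal Python sketch that regenerates a few rows (the function name `vif_from_r` is hypothetical):

```python
def vif_from_r(r):
    """Variance inflation factor implied by a correlation r between two predictors."""
    return 1.0 / (1.0 - r * r)

for r in (0.5, 0.9, 0.95, 0.98, 0.99):
    print(f"r = {r:.2f}  r2 = {r * r:.2f}  VIF = {vif_from_r(r):.2f}")
```

The VIF first exceeds 10 at about r = 0.95, which is why that correlation is singled out as a threshold; at r = 1 the denominator is zero and the VIF is undefined, shown as "." in the table.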

[Figure: relationship between VIF and the correlation coefficient, with r = .95 marked]

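Variance inflation factor method

- Measures how much the variance of an estimated coefficient is inflated compared with what it would be if the independent variables were not linearly related.

$$VIF_i = \frac{1}{1 - R_i^2} = (1 - R_i^2)^{-1}$$

$$\text{tolerance}_i = 1 - R_i^2$$

and

$$\overline{VIF} = \frac{\sum_{i=1}^{p-1} VIF_i}{p-1}$$

where p - 1 is the number of independent variables in the model.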

Indications of multicollinearity using variance inflation factors*

- VIF > 10 indicates multicollinearity.

- The mean VIF provides information about the severity of the multicollinearity.

- A mean VIF > 1 is indicative of serious multicollinearity problems.

*Neter, Wasserman, Kutner (1987; p. 392)

Marquardt (1970); Belsley, Kuh & Welsch (1980)

- tolerance < 0.20 or 0.10 and/or VIF > 5 or 10+ (O’Brien, 2007)
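These rules of thumb can be applied directly to the VIF column that `collin` reports for the interaction model. The Python sketch below computes the tolerance of each variable as 1/VIF, the mean VIF, and the list of variables with VIF > 10; the VIF values are copied from the `collin` output in this document, and the dictionary layout is an assumption for the example.

```python
# VIF values from the collin output for the model with interaction terms
vif = {
    "age": 2.64, "ndrugfp1": 105.68, "ndrugfp2": 63.77, "race": 1.43,
    "treat": 1.02, "site": 1.41, "_Iivhx_2": 1.39, "_Iivhx_3": 1.65,
    "age_ndrugfp1": 27.55, "racesite": 1.64,
}

tolerance = {name: 1.0 / value for name, value in vif.items()}   # 1 - R_i^2
mean_vif = sum(vif.values()) / len(vif)
flagged = [name for name, value in vif.items() if value > 10]

print(f"Mean VIF = {mean_vif:.2f}")
print("VIF > 10:", flagged)
for name in flagged:
    print(f"  {name}: tolerance = {tolerance[name]:.4f}")
```

The flagged variables are the two fractional polynomial terms for ndrugtx and the interaction built from one of them, which is expected because those terms are functions of the same underlying variable.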

Stata

collin [varlist…]
estat vif           variance inflation factors for the independent variables

Condition Index & Variance Decomposition Proportion

The condition index (CI) and the variance decomposition proportion (VDP) are computed from the eigenvalues of the correlation matrix of the independent variables. The condition index is calculated as

$$k = \frac{\text{Maximum eigenvalue}}{\text{Minimum eigenvalue}}; \qquad CI = \sqrt{k}$$

A condition index of 10-30 indicates collinearity.

A condition index > 30 indicates a collinearity problem.

A condition index > 100 indicates very severe collinearity.

(Belsley, 1991a)

If the condition index is between 10 and 30, there is moderate to strong multicollinearity, and if it exceeds 30 there is severe multicollinearity. (Gujarati, 2002)
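As a worked illustration, the condition indexes in the `collin x1 x2 x3` output shown earlier can be recomputed from its eigenvalue column: each index is the square root of the largest eigenvalue divided by that eigenvalue, and the largest index is the condition number. The Python sketch below does this and labels each index with the Gujarati (2002) cut-offs; the function names and the label used for values below 10 are assumptions made for this example.

```python
import math

# Eigenvalues from the collin x1 x2 x3 output (scaled raw sscp with intercept)
eigenvalues = [3.0422, 0.6908, 0.2570, 0.0100]

def condition_indices(eigs):
    """Square root of (largest eigenvalue / each eigenvalue)."""
    largest = max(eigs)
    return [math.sqrt(largest / e) for e in eigs]

def severity(ci):
    """Classify a condition index using the 10 and 30 cut-offs."""
    if ci > 30:
        return "severe"
    if ci >= 10:
        return "moderate to strong"
    return "below 10"   # label chosen for this sketch only

ci = condition_indices(eigenvalues)
for e, c in zip(eigenvalues, ci):
    print(f"eigenvalue = {e:.4f}  condition index = {c:8.4f}  {severity(c)}")
print(f"Condition number = {max(ci):.4f}")
```

The recomputed values differ from Stata's in the third decimal because the printed eigenvalues are rounded to four places.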

Condition Index & Variance Decomposition Proportion

The variance decomposition proportion (VDP) was proposed by Belsley et al. (1980) and Belsley (1991a).

Look for VDP values greater than 0.5.

The VDP is the proportion of the variance of each variable's coefficient that is associated with each principal component, that is, a decomposition of the coefficient variance for each dimension:

$$p_{jk} = \frac{V_{jk}^2 / \lambda_k}{VIF_j}$$

where V_jk is the element for variable j in the k-th eigenvector and λ_k is the k-th eigenvalue. (Fox, 1984)


. collin age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 age_ndrugfp1 racesite

Collinearity Diagnostics

                          SQRT                   R-
    Variable      VIF     VIF    Tolerance    Squared
------------------------------------------------------
         age      2.64    1.63    0.3782      0.6218
    ndrugfp1    105.68   10.28    0.0095      0.9905
    ndrugfp2     63.77    7.99    0.0157      0.9843
        race      1.43    1.20    0.6969      0.3031
       treat      1.02    1.01    0.9831      0.0169
        site      1.41    1.19    0.7090      0.2910
    _Iivhx_2      1.39    1.18    0.7201      0.2799
    _Iivhx_3      1.65    1.28    0.6061      0.3939
age_ndrugfp1     27.55    5.25    0.0363      0.9637
    racesite      1.64    1.28    0.6109      0.3891
------------------------------------------------------
    Mean VIF     20.82

                           Cond
        Eigenval          Index
---------------------------------
    1     5.9439          1.0000
    2     1.2749          2.1592
    3     1.0679          2.3592
    4     1.0129          2.4224
    5     0.7402          2.8338
    6     0.4588          3.5995
    7     0.3110          4.3716
    8     0.1469          6.3620
    9     0.0320         13.6289
   10     0.0094         25.1490
   11     0.0021         52.8408
---------------------------------
 Condition Number        52.8408
 Eigenvalues & Cond Index computed from scaled raw sscp (w/ intercept)
 Det(correlation matrix)    0.0002

- Check the condition index and the variance decomposition proportions.

- Look for a CI greater than 30 together with VDP values greater than or equal to .5.

. coldiag2 age ndrugfp1 ndrugfp2 race treat site _Iivhx_2 _Iivhx_3 age_ndrugfp1, force w(5)

Condition number using scaled variables = 52.18

Condition Indexes and Variance-Decomposition Proportions

     condition
         index  _cons    age  ndr~1  nd~p2   race  treat   site  _Ii~2  _Ii~3  age~1
  1       1.00   0.00   0.00   0.00   0.00   0.01   0.01   0.01   0.00   0.00   0.00
  2       2.22   0.00   0.00   0.00   0.00   0.00   0.02   0.01   0.00   0.12   0.00
  3       2.38   0.00   0.00   0.00   0.00   0.00   0.01   0.05   0.37   0.03   0.00
  4       2.70   0.00   0.00   0.00   0.00   0.64   0.01   0.14   0.00   0.03   0.00
  5       3.23   0.00   0.00   0.00   0.00   0.20   0.05   0.69   0.14   0.00   0.00
  6       3.56   0.00   0.00   0.00   0.00   0.03   0.80   0.03   0.12   0.06   0.00
  7       6.12   0.02   0.02   0.00   0.00   0.12   0.08   0.06   0.32   0.70   0.00
  8      13.34   0.07   0.06   0.01   0.02   0.00   0.02   0.00   0.03   0.02   0.21
  9      24.83   0.10   0.43   0.03   0.35   0.00   0.00   0.00   0.00   0.03   0.30
 10      52.18   0.81   0.49   0.96   0.62   0.00   0.00   0.00   0.00   0.01   0.49

. prnt_cx, force w(5)

Condition Indexes and Variance-Decomposition Proportions

     condition
         index  _cons    age  ndr~1  nd~p2   race  treat   site  _Ii~2  _Ii~3  age~1
  1       1.00      .      .      .      .      .      .      .      .      .      .
  2       2.22      .      .      .      .      .      .      .      .      .      .
  3       2.38      .      .      .      .      .      .      .   0.37      .      .
  4       2.70      .      .      .      .   0.64      .      .      .      .      .
  5       3.23      .      .      .      .      .      .   0.69      .      .      .
  6       3.56      .      .      .      .      .   0.80      .      .      .      .
  7       6.12      .      .      .      .      .      .      .   0.32   0.70      .
  8      13.34      .      .      .      .      .      .      .      .      .      .
  9      24.83      .   0.43      .   0.35      .      .      .      .      .   0.30
 10      52.18   0.81   0.49   0.96   0.62      .      .      .      .      .   0.49

Variance-Decomposition Proportions less than .3 have been printed as "."
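The screening rule (condition index greater than 30 together with variance-decomposition proportions of at least .5) can be applied to the table above programmatically. The Python sketch below uses the last three rows of the `coldiag2` output; the data layout and the function name `flag_collinear` are assumptions made for this illustration.

```python
columns = ["_cons", "age", "ndrugfp1", "ndrugfp2", "race", "treat",
           "site", "_Iivhx_2", "_Iivhx_3", "age_ndrugfp1"]

# (condition index, variance-decomposition proportions) for dimensions 8 to 10
rows = [
    (13.34, [0.07, 0.06, 0.01, 0.02, 0.00, 0.02, 0.00, 0.03, 0.02, 0.21]),
    (24.83, [0.10, 0.43, 0.03, 0.35, 0.00, 0.00, 0.00, 0.00, 0.03, 0.30]),
    (52.18, [0.81, 0.49, 0.96, 0.62, 0.00, 0.00, 0.00, 0.00, 0.01, 0.49]),
]

def flag_collinear(rows, names, ci_cut=30.0, vdp_cut=0.5):
    """For each dimension with CI > ci_cut, list the variables with VDP >= vdp_cut."""
    flagged = {}
    for cond_index, vdps in rows:
        if cond_index > ci_cut:
            flagged[cond_index] = [n for n, v in zip(names, vdps) if v >= vdp_cut]
    return flagged

flagged = flag_collinear(rows, columns)
print(flagged)
```

Only the last dimension exceeds 30, and it loads mainly on the constant and the two fractional polynomial terms of ndrugtx.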

Zero cell or Sparse data

- Exact logistic Regression

- Firth logistic Regression

Problem of perfect or complete separation

- Firth logistic Regression

Binomial Overdispersion

- Scale SEs by Chi2 dispersion.

- Scale iteratively; Williams’ procedure.

- Robust variance estimators.

- Bootstrap or jackknife SE.

- Generalized binomial.

- Parameterize as a rate-count response model.

- Parameterize as a panel model.

- Nested logistic regression.