Categorical Data Analysis Week 2

Upload: jordan-weaver

Post on 25-Dec-2015


Page 1: Categorical Data Analysis Week 2. Binary Response Models binary and binomial responses  binary: y assumes values of 0 or 1  binomial: y is number of

Categorical Data Analysis

Week 2

Binary Response Models

binary and binomial responses
  binary: y assumes values of 0 or 1
  binomial: y is the number of “successes” in n “trials”

distributions
  Bernoulli: \( \Pr(y \mid p) = p^{y}(1-p)^{1-y} \)
  Binomial: \( \Pr(y \mid n, p) = \binom{n}{y} p^{y}(1-p)^{n-y} \)

Transformational Approach

linear probability model
  use grouped data (events/trials): \( p_i = y_i / n_i \)
  “identity” link: \( I(p_i) = p_i = \eta_i \)
  linear predictor: \( \eta_i = \mathbf{x}_i'\boldsymbol{\beta} \)
  problem: predictions can fall outside [0,1]

The Logit Model

logit transformation:

\[ \operatorname{logit}(p_i) = \log\frac{p_i}{1-p_i} = \eta_i = \mathbf{x}_i'\boldsymbol{\beta} \]

inverse logit:

\[ p_i = \Lambda(\eta_i) = \frac{\exp(\eta_i)}{1+\exp(\eta_i)} \]

ensures that p is in [0,1] for all values of x and β.
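The logit and inverse-logit pair can be sketched in a few lines. The slides use Stata; Python is used here only as a calculator, and the function names `logit` and `inv_logit` are illustrative choices, not anything defined in the slides.

```python
import math

def logit(p):
    """Log odds of a probability p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse logit: maps any real linear predictor into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Round trip over a few values of the linear predictor.
for eta in (-6.0, -1.0, 0.0, 2.5, 6.0):
    p = inv_logit(eta)
    print(f"eta={eta:5.1f}  p={p:.4f}  logit(p)={logit(p):.4f}")
```

However large or small the linear predictor gets, the fitted probability stays inside the unit interval, which is the property the linear probability model lacks.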


The Logit Model

odds and odds ratios are the key to understanding and interpreting this model

the log odds transformation is a “stretching” transformation to map probabilities to the real line

Odds and Probabilities

[Figure: probability (0.0 to 1.0) plotted against odds (0 to 30)]

Probabilities and Log Odds

[Figure: probability (0.0 to 1.0) plotted against log(odds) (-6 to 6)]

The Logit Transformation

properties of logit

[Figure: p (0.0 to 1.0) plotted against logit (-6 to 6); a region of the curve is annotated “linear”]

Odds, Odds Ratios, and Relative Risk

odds of “success” is the ratio:

\[ \omega = \frac{p}{1-p} \]

consider two groups with success probabilities \( p_1 \) and \( p_2 \)

odds ratio (OR) is a measure of the odds of success in group 1 relative to group 2:

\[ \theta = \frac{\omega_1}{\omega_2} = \frac{p_1/(1-p_1)}{p_2/(1-p_2)} \]

Odds Ratio

2 X 2 table:

              Y
              0     1
    X   0    50    15
        1    15    20

OR is the cross-product ratio (compare x = 1 group to x = 0 group):

\[ \hat{\theta} = \frac{50 \times 20}{15 \times 15} = 4.44 \]

odds of y = 1 are about 4.4 times as high when x = 1 as when x = 0

Odds Ratio

equivalent interpretation:

\[ \frac{1}{\hat{\theta}} = \frac{15 \times 15}{50 \times 20} = 0.225 \]

odds of y = 1 when x = 0 are 0.225 times the odds when x = 1

equivalently, the odds of y = 1 are lower by a fraction 1 - 0.225 = 0.775 when x = 0 than when x = 1

that is, the odds of y = 1 are 77.5% lower when x = 0 than when x = 1
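The cross-product ratio and its reciprocal can be checked by hand. This sketch uses the 2 X 2 counts from the slide; the variable names are illustrative only.

```python
# 2 x 2 table from the slide: outer key is x, inner key is y.
table = {0: {0: 50, 1: 15},
         1: {0: 15, 1: 20}}

odds_x0 = table[0][1] / table[0][0]   # odds of y = 1 when x = 0
odds_x1 = table[1][1] / table[1][0]   # odds of y = 1 when x = 1

or_1_vs_0 = odds_x1 / odds_x0         # same as (50 * 20) / (15 * 15)
or_0_vs_1 = 1.0 / or_1_vs_0           # reverse the comparison
pct_lower = (1.0 - or_0_vs_1) * 100.0 # percent lower odds when x = 0

print(round(or_1_vs_0, 2), round(or_0_vs_1, 3), round(pct_lower, 1))
```

Swapping the reference group inverts the odds ratio, so both statements describe the same association.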

Log Odds Ratios

Consider the model:

\[ \operatorname{logit}(p_i) = \beta_0 + \beta D_i \]

D is a dummy variable coded 1 if group 1 and 0 otherwise.

group 1: \( \operatorname{logit}(p_i) = \beta_0 + \beta \)
group 2: \( \operatorname{logit}(p_i) = \beta_0 \)

LOR: \( \beta \)
OR: \( \exp(\beta) \)

Relative Risk

similar to OR, but works with rates:

\[ r = \frac{D}{R} = \frac{\#\text{Events}}{\text{Exposure}} \]

relative risk or rate ratio (RR) is the rate in group 1 relative to group 2:

\[ \mathrm{RR} = \frac{r_1}{r_2} \]

\( \mathrm{OR} \to \mathrm{RR} \) as \( p \to 0 \).


Tutorial: odds and odds ratios

consider the following data

Tutorial: odds and odds ratios

read table:

    clear
    input educ psex f
    0 0 873
    0 1 1190
    1 0 533
    1 1 1208
    end
    label define edlev 0 "HS or less" 1 "Col or more"
    label val educ edlev
    label var educ education

Tutorial: odds and odds ratios

compute odds:

verify by hand

    tabodds psex educ [fw=f]

            educ |   cases   controls       odds    [95% Conf. Interval]
       HS or l~s |    1190        873    1.36312     1.24911    1.48753
       Col or ~e |    1208        533    2.26642     2.04681    2.50959

    Test of homogeneity (equal odds): chi2(1) = 55.48
                                      Pr>chi2 = 0.0000
    Score test for trend of odds:     chi2(1) = 55.48
                                      Pr>chi2 = 0.0000

Tutorial: odds and odds ratios

compute odds ratios:

verify by hand

    tabodds psex educ [fw=f], or

            educ | Odds Ratio     chi2    P>chi2    [95% Conf. Interval]
       HS or l~s |   1.000000        .         .           .          .
       Col or ~e |   1.662674    55.48    0.0000    1.452370   1.903429

    Test of homogeneity (equal odds): chi2(1) = 55.48
                                      Pr>chi2 = 0.0000
    Score test for trend of odds:     chi2(1) = 55.48
                                      Pr>chi2 = 0.0000
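One way to do the “verify by hand” step is to recompute the odds and the odds ratio directly from the four frequencies in the `input` block. The slides work in Stata; this Python sketch is only a calculator, and its variable names are illustrative.

```python
# Frequencies from the tutorial data: (educ, psex) -> f
freq = {(0, 0): 873, (0, 1): 1190,
        (1, 0): 533, (1, 1): 1208}

# Odds of psex = 1 within each education level (cases / controls).
odds_hs = freq[(0, 1)] / freq[(0, 0)]    # HS or less
odds_col = freq[(1, 1)] / freq[(1, 0)]   # Col or more

# Odds ratio: college relative to high school or less.
odds_ratio = odds_col / odds_hs

print(f"odds HS  = {odds_hs:.5f}")
print(f"odds Col = {odds_col:.5f}")
print(f"OR       = {odds_ratio:.6f}")
```

The three numbers should agree with the `tabodds` output to the precision Stata prints.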

Tutorial: odds and odds ratios

stat facts:

variances of functions are used in statistical significance tests and in forming confidence intervals

basic rule for variances of linear transformations: if g(x) = a + bx is a linear function of x, then

\[ \operatorname{var}[a + bx] = b^{2}\operatorname{var}(x) \]

this is a trivial case of the delta method applied to a single variable

the delta method for the variance of a nonlinear function g(x) of a single variable is

\[ \operatorname{var}[g(x)] \approx \left(\frac{\partial g(x)}{\partial x}\right)^{2}\operatorname{var}(x) \]

Tutorial: odds and odds ratios

stat facts:

variances of odds and odds ratios

we can use the delta method to find the variance in the odds and the odds ratios

from the asymptotic (large sample theory) perspective it is best to work with log odds and log odds ratios

the log odds ratio converges to normality at a faster rate than the odds ratio, so statistical tests may be more appropriate on log odds ratios (nonlinear functions of p)

\[ \operatorname{var}(\log\hat{\omega}) = \left[\frac{1}{\hat{p}(1-\hat{p})}\right]^{2}\operatorname{var}(\hat{p}) \]

Tutorial: odds and odds ratios

stat facts:

the log odds ratio is the difference in the log odds for two groups

groups are independent

variance of a difference is the sum of the variances

\[ \operatorname{var}(\log\hat{\theta}) = \operatorname{var}(\log\hat{\omega}_1) + \operatorname{var}(\log\hat{\omega}_2) \]
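These stat facts can be applied to the tutorial data as a rough check. The sketch assumes the usual binomial variance var(p̂) = p̂(1 - p̂)/n, under which the delta-method variance of a log odds reduces to 1/y + 1/(n - y), that is, the sum of the reciprocal cell counts. It also assumes a normal-approximation 95% interval with z = 1.959964. The variable names are illustrative.

```python
import math

# Cell counts from the tutorial data: cases and controls in each group.
cases_hs, controls_hs = 1190, 873
cases_col, controls_col = 1208, 533

# var(log odds) for each group, then summed because groups are independent.
var_log_odds_hs = 1.0 / cases_hs + 1.0 / controls_hs
var_log_odds_col = 1.0 / cases_col + 1.0 / controls_col
se_log_or = math.sqrt(var_log_odds_hs + var_log_odds_col)

log_or = math.log((cases_col / controls_col) / (cases_hs / controls_hs))

# Interval is built on the log scale, then exponentiated.
z = 1.959964
ci_low = math.exp(log_or - z * se_log_or)
ci_high = math.exp(log_or + z * se_log_or)

# Delta method once more: se(OR) is approximately OR * se(log OR).
se_or = math.exp(log_or) * se_log_or

print(f"se(log OR) = {se_log_or:.6f}")
print(f"se(OR)     = {se_or:.7f}")
print(f"95% CI     = [{ci_low:.6f}, {ci_high:.6f}]")
```

The standard error and interval should line up with the `glm` and `logit` output for `educ` reported later in the tutorial; the `tabodds` interval is computed differently and is slightly wider.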

Tutorial: odds and odds ratios

data structures: grouped or individual level

note:
  use frequency weights to handle grouped data
  or we could “expand” this data by the frequency weights, resulting in individual-level data
  model results from either data structure are the same

expand the data and verify the following results:

    expand f

Tutorial: odds and odds ratios

statistical modeling

logit model (glm):

    glm psex educ [fw=f], f(b) eform

logit model (logit):

    logit psex educ [fw=f], or

Tutorial: odds and odds ratios

statistical modeling (#1)

logit model (glm):

    Generalized linear models            No. of obs      =       3804
    Optimization     : ML                Residual df     =       3802
                                         Scale parameter =          1
    Deviance         = 4955.871349       (1/df) Deviance =   1.303491
    Pearson          = 3804              (1/df) Pearson  =   1.000526

    Variance function: V(u) = u*(1-u)    [Bernoulli]
    Link function    : g(u) = ln(u/(1-u))  [Logit]

                                         AIC             =   1.303857
    Log likelihood   = -2477.935675      BIC             =  -26387.09

                 |                 OIM
            psex | Odds Ratio   Std. Err.      z    P>|z|    [95% Conf. Interval]
            educ |   1.662674   .1138634    7.42    0.000    1.453834   1.901512

Tutorial: odds and odds ratios

statistical modeling (#2)

some ideas from alternative normalizations

    gen cons = 1
    glm psex cons educ [fw=f], nocons f(b) eform

what parameters will this model produce?
what is the interpretation of the “constant”?

Tutorial: odds and odds ratios

statistical modeling (#2)

    Generalized linear models            No. of obs      =       3804
    Optimization     : ML                Residual df     =       3802
                                         Scale parameter =          1
    Deviance         = 4955.871349       (1/df) Deviance =   1.303491
    Pearson          = 3804              (1/df) Pearson  =   1.000526

    Variance function: V(u) = u*(1-u)    [Bernoulli]
    Link function    : g(u) = ln(u/(1-u))  [Logit]

                                         AIC             =   1.303857
    Log likelihood   = -2477.935675      BIC             =  -26387.09

                 |                 OIM
            psex | Odds Ratio   Std. Err.      z    P>|z|    [95% Conf. Interval]
            cons |   1.363116   .0607438    6.95    0.000    1.249111   1.487525
            educ |   1.662674   .1138634    7.42    0.000    1.453834   1.901512

Tutorial: odds and odds ratios

statistical modeling (#3)

    gen lowed = educ == 0
    gen hied = educ == 1
    glm psex lowed hied [fw=f], nocons f(b) eform

what parameters does this model produce?
how do you interpret them?

Tutorial: odds and odds ratios

statistical modeling (#3)

    Generalized linear models            No. of obs      =       3804
    Optimization     : ML                Residual df     =       3802
                                         Scale parameter =          1
    Deviance         = 4955.871349       (1/df) Deviance =   1.303491
    Pearson          = 3804              (1/df) Pearson  =   1.000526

    Variance function: V(u) = u*(1-u)    [Bernoulli]
    Link function    : g(u) = ln(u/(1-u))  [Logit]

                                         AIC             =   1.303857
    Log likelihood   = -2477.935675      BIC             =  -26387.09

                 |                 OIM
            psex | Odds Ratio   Std. Err.      z    P>|z|    [95% Conf. Interval]
           lowed |   1.363116   .0607438    6.95    0.000    1.249111   1.487525
            hied |   2.266417   .1178534   15.73    0.000    2.046809   2.509586

are these odds ratios?

Tutorial: prediction

fitted probabilities (after most recent model)

    predict p, mu
    tab educ [fw=f], sum(p) nostandard nofreq

                | Summary of predicted mean psex
      education |        Mean        Obs.
      HS or les |   .57682985        2063
      Col or mo |   .69385409        1741
          Total |   .63038905        3804
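The fitted probabilities follow from the group odds through p = ω / (1 + ω). A minimal sketch that recomputes them from the tutorial frequencies (names are illustrative):

```python
# (successes, total) for each education level, from the tutorial data.
groups = {"HS or less": (1190, 1190 + 873),
          "Col or more": (1208, 1208 + 533)}

fitted = {}
for label, (y, n) in groups.items():
    odds = y / (n - y)                 # group-specific odds
    fitted[label] = odds / (1.0 + odds)  # convert odds back to a probability

# Overall mean of the fitted probabilities, weighted by group size.
total_n = sum(n for _, n in groups.values())
overall = sum(fitted[label] * n for label, (_, n) in groups.items()) / total_n

for label, p in fitted.items():
    print(f"{label:12s} {p:.8f}")
print(f"{'Total':12s} {overall:.8f}")
```

Because the model has one parameter per group, the fitted probabilities equal the observed proportions.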

Probit Model

inverse probit is the CDF for a standard normal variable:

\[ p = \Phi(\eta) = \int_{-\infty}^{\eta} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}u^{2}}\, du \]

link function:

\[ \operatorname{probit}(p_i) = \Phi^{-1}(p_i) = \eta_i \]

Probit Transformation

[Figure: p (0.0 to 1.0) plotted against probit (-3 to 3)]

Interpretation

probit coefficients
  interpreted on the scale of a standard normal variable (no log odds-ratio interpretation)
  “scaled” versions of logit coefficients:

\[ \beta_{\text{probit}} \approx \frac{\sqrt{3}}{\pi}\,\beta_{\text{logit}} \]

probit models
  more common in certain disciplines (economics)
  analogy with linear regression (normal latent variable)
  more easily extended to multivariate distributions

Example: Grouped Data

Swedish mortality data revisited

logit model

                 |                 OIM
               y |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
              A2 |   .1147916     .21511     0.53    0.594   -.3068163   .5363995
              A3 |  -.8384579   .2006439    -4.18    0.000   -1.231713   -.445203
              P2 |   .5271214    .120775     4.36    0.000    .2904068    .763836
           _cons |  -4.017514   .1922715   -20.90    0.000   -4.394359  -3.640669

probit model

                 |                 OIM
               y |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
              A2 |   .0497241    .087904     0.57    0.572   -.1225646   .2220128
              A3 |  -.3247921   .0807731    -4.02    0.000   -.4831045  -.1664797
              P2 |   .2098432   .0472825     4.44    0.000    .1171712   .3025151
           _cons |  -2.101865   .0778879   -26.99    0.000   -2.254522  -1.949207

Swedish Historical Mortality Data

predictions

            Logit                      Probit
                  P                          P
      A        1       2         A        1       2
      1     19.0    10.0         1     19.1     9.9
      2     61.0    32.0         2     61.9    31.6
      3    143.0    60.0         3    141.1    61.4
      sum  325                   sum  325.1

Programming

Stata: generalized linear model (glm)

    glm y A2 A3 P2, family(b n) link(probit)
    glm y A2 A3 P2, family(b n) link(logit)

idea of glm is to make model linear in the link
  old days: Iteratively Reweighted Least Squares
  now: Fisher scoring, Newton-Raphson
  both approaches yield MLEs

Generalized Linear Models

applies to a broad class of models

iterative fitting (repeated updating) except for linear model

update parameters, weights W, and predicted values m:

\[ \boldsymbol{\beta}^{(t+1)} = \boldsymbol{\beta}^{(t)} + \left(\mathbf{X}'\mathbf{W}^{(t)}\mathbf{X}\right)^{-1}\mathbf{X}'\left(\mathbf{y} - \mathbf{m}^{(t)}\right) \]

models differ in terms of W and m and assumptions about the distribution of y

common distributions for y include: normal, binomial, and Poisson

common links include: identity, logit, probit, and log
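The update rule can be sketched for the two-parameter logit model from the education tutorial. This is an illustration under stated assumptions, not Stata's internal algorithm: the data are treated as two binomial groups, m holds the predicted counts n·p, W is diagonal with entries n·p·(1 - p), and the 2 x 2 system is solved in closed form. The function name `fit_logit` is hypothetical.

```python
import math

# Grouped tutorial data, one row per education level.
X = [[1.0, 0.0],       # HS or less: intercept only
     [1.0, 1.0]]       # Col or more: intercept + educ
y = [1190.0, 1208.0]   # number with psex = 1
n = [2063.0, 1741.0]   # group sizes

def fit_logit(X, y, n, max_iter=25, tol=1e-10):
    """Newton-Raphson (equal to Fisher scoring for the logit link)."""
    beta = [0.0, 0.0]
    for _ in range(max_iter):
        eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
        p = [1.0 / (1.0 + math.exp(-e)) for e in eta]
        m = [ni * pi for ni, pi in zip(n, p)]               # predicted counts
        w = [ni * pi * (1.0 - pi) for ni, pi in zip(n, p)]  # diagonal of W
        # Score vector X'(y - m).
        score = [sum(row[j] * (yi - mi) for row, yi, mi in zip(X, y, m))
                 for j in range(2)]
        # Information matrix X'WX.
        info = [[sum(wi * row[j] * row[k] for row, wi in zip(X, w))
                 for k in range(2)] for j in range(2)]
        det = info[0][0] * info[1][1] - info[0][1] * info[1][0]
        # step = (X'WX)^{-1} X'(y - m), using the explicit 2 x 2 inverse.
        step = [(info[1][1] * score[0] - info[0][1] * score[1]) / det,
                (-info[1][0] * score[0] + info[0][0] * score[1]) / det]
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

beta = fit_logit(X, y, n)
baseline_odds, odds_ratio = math.exp(beta[0]), math.exp(beta[1])
print(f"exp(b0) = {baseline_odds:.6f}, exp(b1) = {odds_ratio:.6f}")
```

Exponentiating the converged coefficients should give the baseline odds and the odds ratio shown in the tutorial output for `cons` and `educ`.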

Latent Variable Approach

example: insect mortality

suppose a researcher exposes insects to dosage levels (u) of an insecticide and observes whether the “subject” lives or dies at that dosage.

the response is expected to depend on the insect’s tolerance (c) to that dosage level.

the insect dies if u > c and survives if u < c

\[ \Pr(y_i = 1) = \Pr(u_i > c_i) \]

tolerance is not observed (survival is observed)

Latent Variables

u and c are continuous latent variables

examples:
  women’s employment: u is the market wage and c is the reservation wage
  migration: u is the benefit of moving and c is the cost of moving

observed outcome y = 1 or y = 0 reveals the individual’s preference, which is assumed to maximize a rational individual’s utility function.

Latent Variables

Assume linear utility and criterion functions:

\[ u = \mathbf{x}'\boldsymbol{\beta}_u + \varepsilon_u \]

\[ c = \mathbf{x}'\boldsymbol{\beta}_c + \varepsilon_c \]

\[ \Pr(y = 1) = \Pr(u > c) = \Pr\left[\varepsilon_c - \varepsilon_u < \mathbf{x}'(\boldsymbol{\beta}_u - \boldsymbol{\beta}_c)\right] \]

over-parameterization = identification problem
  we can identify differences in components but not the separate components

Latent Variables

constraints:

\[ \boldsymbol{\beta} = \boldsymbol{\beta}_u - \boldsymbol{\beta}_c, \qquad \varepsilon = \varepsilon_c - \varepsilon_u \]

Then:

\[ \Pr(y = 1) = \Pr(\varepsilon < \mathbf{x}'\boldsymbol{\beta}) = F(\mathbf{x}'\boldsymbol{\beta}) \]

where F(.) is the CDF of ε

Latent Variables and Standardization

Need to standardize the mean and variance of ε
  binary dependent variables lack inherent scales
  magnitude of β is only in reference to the mean and variance of ε, which are unknown
  redefine ε to a common standard:

\[ \varepsilon^{*} = \frac{\varepsilon - a}{b} \]

where a and b are two chosen constants.

Standardization for Logit and Probit Models

standardization implies

\[ \Pr(y = 1) = F^{*}\!\left(\frac{\mathbf{x}'\boldsymbol{\beta} - a}{b}\right) \]

F*() is the cdf of ε*

location a and scale b need to be fixed

fixing a and b and taking \( F^{*}(\cdot) = \Phi(\cdot) \) gives the probit model

Standardization for Logit and Probit Models

distribution of ε is standardized
  standard normal: probit
  standard logistic: logit

both distributions have a mean of 0; variances differ:
  probit: \( \sigma^{2}_{\varepsilon^{*}} = 1 \)
  logit: \( \sigma^{2}_{\varepsilon^{*}} = \pi^{2}/3 \)

Extending the Latent Variable Approach

observed y is a dichotomous (binary) 0/1 variable

continuous latent variable: linear predictor + residual

\[ y_i^{*} = \mathbf{x}_i'\boldsymbol{\beta} + \varepsilon_i \]

observed outcome

\[ y_i = \begin{cases} 1 & \text{if } y_i^{*} > 0 \\ 0 & \text{otherwise} \end{cases} \]

Notation

conditional means of latent variables obtained from index function:

\[ E(y_i^{*} \mid \mathbf{x}_i) = \mathbf{x}_i'\boldsymbol{\beta} \]

obtain probabilities from inverse link functions

logit model: \( \Pr(y_i = 1 \mid \mathbf{x}_i) = \Lambda(\mathbf{x}_i'\boldsymbol{\beta}) \)

probit model: \( \Pr(y_i = 1 \mid \mathbf{x}_i) = \Phi(\mathbf{x}_i'\boldsymbol{\beta}) \)
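The two inverse links can be compared numerically. This sketch assumes Λ is the standard logistic CDF and Φ is the standard normal CDF (written with `math.erf`). It plugs in the `_cons` estimates from the Swedish mortality logit and probit fits, which correspond to the reference cell where A2 = A3 = P2 = 0. The function names are illustrative.

```python
import math

def logistic_cdf(eta):
    """Inverse logit link, Lambda(eta)."""
    return 1.0 / (1.0 + math.exp(-eta))

def normal_cdf(eta):
    """Inverse probit link, Phi(eta), via the error function."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

# Intercepts from the two Swedish mortality fits (reference cell).
p_logit = logistic_cdf(-4.017514)
p_probit = normal_cdf(-2.101865)

print(f"logit:  Pr(y=1) = {p_logit:.4f}")
print(f"probit: Pr(y=1) = {p_probit:.4f}")
```

The coefficients sit on different scales, yet the implied probabilities for the same cell are nearly identical, which is why the predicted counts from the two models are so close.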

ML

likelihood function

\[ L = \prod_{i} F(\mathbf{x}_i'\boldsymbol{\beta})^{y_i}\left[1 - F(\mathbf{x}_i'\boldsymbol{\beta})\right]^{n_i - y_i} \]

where \( n_i = 1 \) if data are binary

log-likelihood function

\[ \log L = \sum_{i} \left\{ y_i \log F(\mathbf{x}_i'\boldsymbol{\beta}) + (n_i - y_i)\log\left[1 - F(\mathbf{x}_i'\boldsymbol{\beta})\right] \right\} \]
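As a check on the formula, the log-likelihood can be evaluated for the education tutorial at the fitted probabilities. The sketch treats the data as two groups with (y_i, n_i) = (1190, 2063) and (1208, 1741), and omits the binomial coefficient exactly as the formula above does. The helper name `log_lik` is illustrative.

```python
import math

# (y_i, n_i) for HS or less and Col or more.
groups = [(1190, 2063), (1208, 1741)]

def log_lik(probs, groups):
    """Sum of y*log(F) + (n - y)*log(1 - F) over groups."""
    return sum(y * math.log(p) + (n - y) * math.log(1.0 - p)
               for (y, n), p in zip(groups, probs))

# Current model: one fitted probability per education level.
p_current = [y / n for y, n in groups]
ll_current = log_lik(p_current, groups)

# Null model: a single common probability for everyone.
p_null = sum(y for y, _ in groups) / sum(n for _, n in groups)
ll_null = log_lik([p_null, p_null], groups)

print(f"log L (current) = {ll_current:.4f}")
print(f"log L (null)    = {ll_null:.4f}")
```

The current-model value should reproduce the log likelihood printed in the `glm` output, and the null-model value is necessarily lower.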

Assessing Models

definitions:
  L null model (intercept only): \( L_0 \)
  L saturated model (a parameter for each cell): \( L_f \)
  L current model: \( L_c \)

grouped data (events/trials)
  deviance (likelihood ratio statistic):

\[ G^{2} = -2\log\frac{L_c}{L_f} = -2\left(\log L_c - \log L_f\right) \]

Deviance

grouped data:
  if cell sizes are reasonably large, deviance is distributed as chi-square

individual-level data: \( L_f = 1 \) and \( \log L_f = 0 \)

\[ G^{2} = -2\log L_c \]

  deviance is not a “fit” statistic

Deviance

deviance is like a residual sum of squares
  larger values indicate poorer models
  larger models have smaller deviance

assume that Model 1 is a constrained version of Model 2.
  deviance for the more constrained model (Model 1): \( G^{2}_{1} \)
  deviance for the less constrained model (Model 2): \( G^{2}_{2} \)

Difference in Deviance

evaluate competing “nested” models using a likelihood ratio statistic

\[ \Delta G^{2} = G^{2}_{1} - G^{2}_{2} \sim \chi^{2}_{df_2 - df_1} \]

model chi-square is a special case

\[ \text{Model } \chi^{2} = G^{2}_{0} - G^{2}_{c} = -2\log L_0 - (-2\log L_c) \]

SAS, Stata, R, etc. report different statistics
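A small sketch of the difference-in-deviance calculation, using log-likelihoods that appear in the high school graduation example of these slides (Model 2 and Model 3, plus the intercept-only and full-model values from the fit statistics). The p-value helper covers only the 1-df case, where the chi-square tail probability equals erfc(sqrt(x/2)). The function names are illustrative.

```python
import math

def lr_statistic(loglik_restricted, loglik_full):
    """Difference in deviance between two nested models."""
    return -2.0 * (loglik_restricted - loglik_full)

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

# Model 3 vs. Model 2, using the rounded log-likelihoods on the slide.
lr_23 = lr_statistic(-1240.70, -1038.39)
p_23 = chi2_sf_1df(lr_23)

# Model chi-square: current model against the intercept-only model.
model_chi2 = lr_statistic(-1441.781, -1038.377)

print(f"LR (m2 vs m3) = {lr_23:.2f}, p = {p_23:.3g}")
print(f"model chi2    = {model_chi2:.3f}")
```

Because the inputs are rounded, the results can differ from Stata's `lrtest` and `fitstat` output in the second decimal place.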

Other Fit Statistics

BIC & AIC (useful for non-nested models)

basic idea of IC: penalize log L for the number of parameters (AIC/BIC) and/or the size of the sample (BIC)

\[ \mathrm{IC} = -2\log L + 2(s)(df_m) \]

  AIC: s = 1
  BIC: s = ½ log n (n is the sample size)
  \( df_m \) is the number of model parameters
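The IC formula can be tried on the full high school graduation model, whose log-likelihood (-1038.377) appears in the fit statistics later in these slides. Two inputs are assumptions inferred from that output rather than stated directly: 12 parameters (11 predictors plus the constant) and n = 3305 (residual df 3293 plus 12). The function name is illustrative.

```python
import math

def information_criterion(loglik, df_m, s):
    """IC = -2 log L + 2 * s * df_m."""
    return -2.0 * loglik + 2.0 * s * df_m

loglik = -1038.377
df_m = 12      # assumed: 11 predictors + constant
n_obs = 3305   # assumed: residual df 3293 + 12 parameters

aic = information_criterion(loglik, df_m, s=1.0)
bic = information_criterion(loglik, df_m, s=0.5 * math.log(n_obs))

print(f"AIC = {aic:.3f}")
print(f"BIC = {bic:.3f}")
```

With s = ½ log n the penalty becomes log(n) per parameter, so BIC penalizes extra parameters more heavily than AIC whenever n exceeds about 7.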

Hypothesis Tests/Inference

single parameter: \( H_0: \beta = 0 \)
  MLE are asymptotically normal
  Z-test

multi-parameter: \( H_0: \beta_1 = \beta_2 = 0 \)
  likelihood ratio tests (after fitting)
  Wald tests (test constraints from current model)

Hypothesis Tests/Inference

Wald test (tests a vector of restrictions)

a set of r parameters are all equal to 0:

\[ H_0: \boldsymbol{\beta}_r = \mathbf{0} \]

a set of r parameters are linearly restricted:

\[ H_0: \mathbf{R}\boldsymbol{\beta}_r = \mathbf{q} \]

where R is the restriction matrix, \( \boldsymbol{\beta}_r \) is the parameter subset, and q is the constraint vector

Interpreting Parameters

odds ratios: consider the model where x is a continuous predictor and d is a dummy variable

\[ y_i^{*} = \beta_0 + \beta_1 x_i + \beta_2 d_i + \varepsilon_i \]

suppose that d denotes sex and x denotes income and the problem concerns voting, where y* is the propensity to vote

results: \( \operatorname{logit}(p_i) = -1.92 + 0.012x_i + 0.67d_i \)

Interpreting Parameters

for d (dummy variable coded 1 for female) the odds ratio is straightforward

\[ \frac{p_f/(1-p_f)}{p_m/(1-p_m)} = \exp(\hat{\beta}_2) = \exp(0.67) = 1.95 \]

holding income constant, women’s odds of voting are nearly twice those of men

Interpreting Parameters

for x (continuous variable for income in thousands of dollars) the odds ratio is a multiplicative effect

suppose we increase income by 1 unit ($1,000):

\[ \frac{\exp[\hat{\beta}_1(x+1)]}{\exp(\hat{\beta}_1 x)} = \exp[\hat{\beta}_1(1)] = 1.01 \]

suppose we increase income by c units (c × $1,000):

\[ \frac{\exp[\hat{\beta}_1(x+c)]}{\exp(\hat{\beta}_1 x)} = \exp[\hat{\beta}_1(c)] \]

Interpreting Parameters

if income is increased by $10,000, this increases the odds of voting by about 13%

\[ \left(e^{10 \times 0.012} - 1\right) \times 100\% = 12.75\% \]

a note on percent change in odds:

if estimate of β > 0 then percent increase in odds for a unit change in x is

\[ \left(e^{\hat{\beta}} - 1\right) \times 100\% \]

if estimate of β < 0 then percent decrease in odds for a unit change in x is

\[ \left(1 - e^{\hat{\beta}}\right) \times 100\% \]
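The odds-ratio arithmetic for the voting example can be reproduced directly from the reported coefficients (0.012 for income in thousands, 0.67 for the female dummy). The helper below returns a signed percent change, so a negative coefficient yields a negative number whose magnitude is the percent decrease. Its name is illustrative.

```python
import math

b_income = 0.012   # per $1,000 of income
b_female = 0.67    # female dummy

def pct_change_in_odds(beta, units=1.0):
    """Signed percent change in the odds for a change of `units` in x."""
    return (math.exp(beta * units) - 1.0) * 100.0

or_female = math.exp(b_female)              # women vs. men
or_income_1k = math.exp(b_income)           # +$1,000
pct_income_10k = pct_change_in_odds(b_income, units=10)  # +$10,000

print(f"OR female      = {or_female:.2f}")
print(f"OR +$1,000     = {or_income_1k:.3f}")
print(f"% odds +$10,000 = {pct_income_10k:.2f}")
```

Effects on the odds compound multiplicatively, so a 10-unit change corresponds to exp(10β) rather than 10·exp(β).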

Marginal Effects

marginal effect: effect of change in x on change in probability

\[ \frac{\partial \Pr(y_i = 1 \mid \mathbf{x}_i)}{\partial x_{ik}} = \frac{\partial F(\mathbf{x}_i'\boldsymbol{\beta})}{\partial x_{ik}} = f(\mathbf{x}_i'\boldsymbol{\beta})\,\beta_k \]

where f(·) is the pdf and F(·) is the cdf

often we evaluate f(.) at the mean of x.


Marginal Effect for a Change in a Continuous Variable

Marginal Effect of a Change in a Dummy Variable

if x is a continuous variable and z is a dummy variable

\[ \Pr(y_i = 1) = F(\beta_0 + \beta_1 x_i + \beta_2 z_i) \]

marginal effect of change in z from 0 to 1 is the difference

\[ F(\beta_0 + \beta_1 x_i + \beta_2) - F(\beta_0 + \beta_1 x_i) \]
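Both kinds of marginal effect can be sketched for a logit model, where F is the logistic CDF and f(η) = F(η)[1 - F(η)]. The coefficients are the voting estimates reported earlier in these slides; the income value of 50 (thousand dollars) is a hypothetical evaluation point, and the function names are illustrative.

```python
import math

# Voting example: logit(p) = -1.92 + 0.012*x + 0.67*d
b0, b1, b2 = -1.92, 0.012, 0.67

def F(eta):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-eta))

def f(eta):
    """Logistic pdf, F(eta) * (1 - F(eta))."""
    p = F(eta)
    return p * (1.0 - p)

def marginal_effect_x(x, d):
    """Derivative of Pr(y=1) with respect to the continuous variable x."""
    return f(b0 + b1 * x + b2 * d) * b1

def discrete_change_d(x):
    """Change in Pr(y=1) when the dummy moves from 0 to 1 at fixed x."""
    return F(b0 + b1 * x + b2) - F(b0 + b1 * x)

x0 = 50.0  # hypothetical income, in thousands of dollars
print(f"dPr/dx at x={x0:.0f}, d=1: {marginal_effect_x(x0, 1):.5f}")
print(f"Pr(d=1) - Pr(d=0) at x={x0:.0f}: {discrete_change_d(x0):.4f}")
```

Unlike the odds ratio, both quantities depend on where they are evaluated, which is why a reference point such as the mean of x has to be chosen.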


Example logit models for high school graduation

odds ratios (constant is baseline odds)

LR Test

Model 3 vs. 2

\[ \chi^{2}_{(1)} = -2(\log L_2 - \log L_3) = -2\left(-1240.70 - (-1038.39)\right) = -2(1038.39 - 1240.70) = 404.64 \]

the rounded log-likelihoods shown give 404.62; Stata’s lrtest reports 404.64 from the unrounded values

Wald Test

Test equality of parental education effects

\[ H_0: \beta_{mhs} = \beta_{fhs} \]

\[ H_0: \beta_{mcol} = \beta_{fcol} \]

    logit hsg blk hsp female nonint inc nsibs mhs mcol fhs fcol wtest
    test mhs=fhs
    test mcol=fcol

    . test mhs=fhs

     ( 1)  mhs - fhs = 0

               chi2(  1) =    0.01
             Prob > chi2 =    0.9177

    . test mcol=fcol

     ( 1)  mcol - fcol = 0

               chi2(  1) =    1.18
             Prob > chi2 =    0.2770

cannot reject \( H_0 \) of equal parental education effects on HS graduation

Basic Estimation Commands (Stata)

estimation commands and model tests

    * model 0 - null model
    qui logit hsg
    est store m0
    * model 1 - race, sex, family structure
    qui logit hsg blk hsp female nonint
    est store m1
    * model 1a - race X family structure interactions
    qui xi: logit hsg blk hsp female nonint i.nonint*i.blk i.nonint*i.hsp
    est store m1a
    lrtest m1 m1a
    * model 2 - SES
    qui xi: logit hsg blk hsp female nonint inc nsibs mhs mcol fhs fcol
    est store m2
    * model 3 - Indiv
    qui xi: logit hsg blk hsp female nonint inc nsibs mhs mcol fhs fcol wtest
    est store m3
    lrtest m2 m3

Fit Statistics etc.

    * some 'hand' calculations with saved results
    scalar ll = e(ll)
    scalar npar = e(df_m)+1
    scalar nobs = e(N)
    scalar AIC = -2*ll + 2*npar
    scalar BIC = -2*ll + log(nobs)*npar
    scalar list AIC
    scalar list BIC

    * or use automated fitstat routine
    fitstat

    * output as a table
    estout1 m0 m1 m2 m3 using modF07, replace star stfmt(%9.2f %9.0f %9.0f) ///
        stats(ll N df_m) eform

Analysis of Deviance

    . lrtest m0 m1

    Likelihood-ratio test                      LR chi2(4)  =    118.45
    (Assumption: m0 nested in m1)              Prob > chi2 =    0.0000

    . lrtest m1 m2

    Likelihood-ratio test                      LR chi2(6)  =    283.71
    (Assumption: m1 nested in m2)              Prob > chi2 =    0.0000

    . lrtest m2 m3

    Likelihood-ratio test                      LR chi2(1)  =    404.64
    (Assumption: m2 nested in m3)              Prob > chi2 =    0.0000

BIC and AIC (using fitstat)

    Measures of Fit for logit of hsg

    Log-Lik Intercept Only:    -1441.781    Log-Lik Full Model:           -1038.377
    D(3293):                    2076.754    LR(11):                         806.807
                                            Prob > LR:                        0.000
    McFadden's R2:                 0.280    McFadden's Adj R2:                0.271
    ML (Cox-Snell) R2:             0.217    Cragg-Uhler(Nagelkerke) R2:       0.372
    McKelvey & Zavoina's R2:       0.473    Efron's R2:                       0.252
    Variance of y*:                6.240    Variance of error:                3.290
    Count R2:                      0.857    Adj Count R2:                     0.096
    AIC:                           0.636    AIC*n:                         2100.754
    BIC:                      -24607.056    BIC':                          -717.672
    BIC used by Stata:          2173.993    AIC used by Stata:             2100.754

Marginal Effects

[Figure: “Marginal Effect of Test Score on High School Graduation, Income Quartile 1”. Pr(y=1) (0 to 1) plotted against Test Score (-4 to 4), with separate lines for white/intact, white/nonintact, black/intact, and black/nonintact]

Marginal Effects

[Figure: “Marginal Effect of Test Score on High School Graduation, Income Quartile 4”. Pr(y=1) (0 to 1) plotted against Test Score (-4 to 4), with separate lines for white/intact, white/nonintact, black/intact, and black/nonintact]

Generate Income Quartiles

    qui sum adjinc, det
    * quartiles for income distribution
    gen incQ1 = adjinc < r(p25)
    gen incQ2 = adjinc >= r(p25) & adjinc < r(p50)
    gen incQ3 = adjinc >= r(p50) & adjinc < r(p75)
    gen incQ4 = adjinc >= r(p75)
    gen incQ = 1 if incQ1==1
    replace incQ = 2 if incQ2==1
    replace incQ = 3 if incQ3==1
    replace incQ = 4 if incQ4==1
    tab incQ

Fit Model for Each Quartile

calculate predictions

    * look at marginal effects of test score on graduation by selected groups
    * (1) model (income quartiles)
    local i = 1
    while `i' < 5 {
    logit hsg blk female mhs nonint nsibs urban so wtest if incQ ==`i'
    margeff

    cap drop wm*
    cap drop bm*
    prgen wtest, x(blk=0 female=0 mhs=1 nonint=0) gen(wmi) from(-3) to(3)
    prgen wtest, x(blk=0 female=0 mhs=1 nonint=1) gen(wmn) from(-3) to(3)
    label var wmip1 "white/intact"
    label var wmnp1 "white/nonintact"
    prgen wtest, x(blk=1 female=0 mhs=1 nonint=0) gen(bmi) from(-3) to(3)
    prgen wtest, x(blk=1 female=0 mhs=1 nonint=1) gen(bmn) from(-3) to(3)
    label var bmip1 "black/intact"
    label var bmnp1 "black/nonintact"

Graph

    set scheme s2mono
    twoway (line wmip1 wmix, sort xtitle("Test Score") ytitle("Pr(y=1)")) ///
        (line wmnp1 wmix, sort) (line bmip1 wmix, sort) (line bmnp1 wmix, sort), ///
        subtitle("Marginal Effect of Test Score on High School Graduation" ///
        "Income Quartile `i'" ) saving(wtgrph`i', replace)
    graph export wtgrph`i'.eps, as(eps) replace
    local i = `i' + 1
    }

Fitted Probabilities

    logit hsg blk female mhs nonint inc nsibs urban so wtest
    prtab nonint blk female

    logit: Predicted probabilities of positive outcome for hsg

              |          female and blk
              |   --- 0 ---         --- 1 ---
       nonint |       0        1        0        1
            0 |  0.9111   0.9740   0.9258   0.9786
            1 |  0.8329   0.9480   0.8585   0.9569

Fitted Probabilities

predicted values

evaluate fitted probabilities at the sample mean values of x (or other fixed quantities)

\[ \hat{p}(\bar{\mathbf{x}}) = \frac{\exp(\bar{\mathbf{x}}'\hat{\boldsymbol{\beta}})}{1 + \exp(\bar{\mathbf{x}}'\hat{\boldsymbol{\beta}})} \]

averaging fitted probabilities over subgroup-specific models will produce marginal probabilities

\[ \bar{y}_j = \frac{1}{n_j}\sum_{i=1}^{n_j} \hat{p}_{ij}(\mathbf{x}_{ij}'\hat{\boldsymbol{\beta}}_j) \]

Observed & Fitted Probabilities

                          sex: male          sex: female
                             race                race
    family type         white    black      white    black
    intact
      observed           0.90     0.86       0.91     0.89
      fitted             0.91     0.97       0.93     0.98
      n                   776      224        749      234
    nonintact
      observed           0.71     0.74       0.81     0.82
      fitted             0.83     0.95       0.86     0.96
      n                   220      207        196      231
    Total                 996      431        945      465


Alternative Probability Model
complementary log-log (cloglog or CLL)

standard extreme-value distribution for u:

$$f(u) = \exp(u)\exp[-\exp(u)]$$

$$F(u) = 1 - \exp[-\exp(u)]$$

cloglog model:

$$\Pr(y_i = 1) = 1 - \exp[-\exp(\mathbf{x}_i'\boldsymbol\beta)]$$

cloglog link function:

$$\log\{-\log[1 - \Pr(y_i = 1)]\} = \mathbf{x}_i'\boldsymbol\beta$$
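As a minimal sketch of the link and its inverse (the function names below are illustrative, not part of the slides), the following Python compares the cloglog and logit response functions at a few values of the linear predictor. It is meant only to show the asymmetry of the cloglog curve and its closeness to the logit when probabilities are small.

```python
import math

def cloglog(p):
    """Complementary log-log link: log(-log(1 - p)), for 0 < p < 1."""
    return math.log(-math.log(1.0 - p))

def inv_cloglog(eta):
    """Inverse link: Pr(y = 1) = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))

def inv_logit(eta):
    """Inverse logit, shown for comparison."""
    return 1.0 / (1.0 + math.exp(-eta))

# Compare the two response functions over a range of linear predictors.
for eta in (-4.0, -2.0, 0.0, 1.0):
    print(eta, round(inv_cloglog(eta), 4), round(inv_logit(eta), 4))
```

At a linear predictor of 0 the cloglog probability is 1 - exp(-1), which is above 0.5, whereas the logit gives exactly 0.5.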


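Extreme-Value Distribution
properties

mean of u (the negative of Euler's constant):

$$E(u) = \Gamma'(1) = -0.5772$$

variance of u:

$$\operatorname{var}(u) = \frac{\pi^2}{6}$$

difference in two independent extreme value variables yields a logistic variable:

$$u_1 - u_2 \sim \text{logistic}\left(0, \frac{\pi^2}{3}\right)$$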


CLL Transformation

[Figure: the CLL transformation; axes are CLL (-6 to 2) and p (0.0 to 1.0)]


CLL Model

no “practical” differences from logit and probit models
often suited for survival data and other applications
interpretation of coefficients:
  exp(β) is a relative risk or hazard ratio, not an OR
glm: binomial distribution for y with a cloglog link
cloglog: use the cloglog command directly


CLL and Logit Model Compared

          logit       cloglog
blk       3.658***    1.987***
female    1.218       1.128*
mhs       1.438**     1.161*
nonint    0.487***    0.710***
inc       1.635**     1.236**
nsibs     0.938**     0.965**
urban     0.887       0.942
so        1.269       1.115
wtest     5.151***    2.171***
_cons     6.851***    1.891***
log L     -838.92     -833.96
N         2837        2837
df        9           9


Cloglog and Logit Model Compared

logit

d     Odds Ratio   OIM Std. Err.   z       P>|z|   [95% Conf. Interval]
A2    1.12164      .2412759        0.53    0.594   .7357857   1.709839
A3    .4323768     .0867538        -4.18   0.000   .2917924   .6406942
P2    1.694049     .2045987        4.36    0.000   1.336971   2.146494

cloglog

d     exp(b)       OIM Std. Err.   z       P>|z|   [95% Conf. Interval]
A2    1.119414     .2380893        0.53    0.596   .7378156   1.698375
A3    .4350801     .0864137        -4.19   0.000   .2947864   .642142
P2    1.684947     .2016957        4.36    0.000   1.332581   2.130487

more agreement when modeling rare events


Extensions: Multilevel Data

what is multilevel data?
  individuals are “nested” in a larger context: children in families, kids in schools, etc.

[Figure: individuals grouped into context 1, context 2, and context 3]


Multilevel Data
i.i.d. assumptions?
  the outcomes for units in a given context could be associated
  standard model would treat all outcomes (regardless of context) as independent
  multilevel methods account for the within-cluster dependence
a general problem with binomial responses
  we assume that trials are independent
  this might not be realistic
  non-independence will inflate the variance (overdispersion)


Multilevel Data
example (in book):
  40 universities as units of analysis
  for each university we observe the number of graduates (n) and the number receiving post-doctoral fellowships (y)
  we could compute proportions (MLEs)
  some proportions would be “better” estimates as they would have higher precision or lower variance
  example: the data y1/n1 = 2/5 and y2/n2 = 20/50 give identical estimates of p but variances of 0.048 and 0.0048 respectively
  the 2nd estimate is more precise than the 1st
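The two variances in this example follow from the binomial variance of a proportion, p(1 - p)/n. A short Python check, with an illustrative function name, reproduces them:

```python
def proportion_variance(y, n):
    """Estimated variance of the MLE y/n for a binomial proportion."""
    p_hat = y / n
    return p_hat * (1 - p_hat) / n

v1 = proportion_variance(2, 5)    # y1/n1 = 2/5
v2 = proportion_variance(20, 50)  # y2/n2 = 20/50
print(round(v1, 4), round(v2, 4))
```

Both estimates equal 0.4, but the larger sample cuts the variance by a factor of ten.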


Multilevel Data
multilevel models allow for improved predictions of individual probabilities
  the MLE is unaltered if it is precise
  the MLE is moved toward the average if it is imprecise (shrinkage)
  the multilevel estimate of p would be a weighted average of the MLE and the average over all MLEs (the weight w is based on the variance of each MLE and the variance over all the MLEs):

$$\tilde{p}_i = w_i \hat{p}_i + (1 - w_i)\bar{p}$$

we are generally less interested in the p’s and more interested in the model parameters and variance components


Shrinkage Estimation
primitive approach
  assume we have a set of estimates (MLEs) $\hat{p}_i$
  our best estimate of the variance of each MLE is

$$\operatorname{var}(\hat{p}_i) = \frac{\hat{p}_i(1 - \hat{p}_i)}{n_i}$$

  this is the within variance (no pooling)
  if this is large, then the MLE is a poor estimate
  a better estimate might be the average of the MLEs in this case (pooling the estimates)
  we can average the MLEs and estimate the between variance as

$$\operatorname{var}(\bar{p}) = \frac{1}{N - 1} \sum_i (\hat{p}_i - \bar{p})^2$$


Shrinkage Estimation
primitive approach
  we can then estimate a weight wi

$$w_i = \frac{\operatorname{var}(\bar{p})}{\operatorname{var}(\bar{p}) + \operatorname{var}(\hat{p}_i)} = \frac{\text{between-group variance}}{\text{total variance}}$$

  a revised estimate of pi would take account of the precision to form a precision-weighted average

$$\tilde{p}_i = w_i \hat{p}_i + (1 - w_i)\bar{p}$$

  precision is a function of ni
  more weight is given to more precise MLEs
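The primitive shrinkage calculation can be sketched in a few lines of Python. The counts below are hypothetical (they are not the 40-university data from the book), and the function name is illustrative. The sketch follows the steps on these slides: within variance of each MLE, between variance of the MLEs, weight, and precision-weighted average.

```python
def shrink(counts):
    """counts: list of (y, n) pairs, one per group.
    Returns the MLEs, their average, the weights, and the shrunken estimates."""
    mle = [y / n for y, n in counts]
    n_groups = len(mle)
    p_bar = sum(mle) / n_groups
    # Within-group (sampling) variance of each MLE: p(1 - p) / n.
    within = [p * (1 - p) / n for p, (_, n) in zip(mle, counts)]
    # Between-group variance of the MLEs around their average.
    between = sum((p - p_bar) ** 2 for p in mle) / (n_groups - 1)
    # Weight = between variance / (between + within) = between / total.
    weights = [between / (between + v) for v in within]
    shrunken = [w * p + (1 - w) * p_bar for w, p in zip(weights, mle)]
    return mle, p_bar, weights, shrunken

# Hypothetical (y, n) counts for five groups.
counts = [(2, 5), (20, 50), (1, 10), (30, 40), (12, 30)]
mle, p_bar, weights, shrunken = shrink(counts)
for (y, n), p, w, s in zip(counts, mle, weights, shrunken):
    print(f"y/n = {y}/{n}  MLE = {p:.3f}  w = {w:.3f}  shrunken = {s:.3f}")
```

The first two groups have the same MLE, but the group with n = 50 receives the larger weight and is therefore pulled less toward the overall average.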


Shrinkage: a primitive approach

[Figure: observed and shrunken probabilities (0.2 to 0.8) by university (0 to 40); series: Observed, Shrunken]


Shrinkage

[Figure: observed and EB probabilities (0.2 to 0.8) by university (0 to 40); series: Observed, EB Estimate]

results from full Bayesian (multilevel) analysis


Extension: Multilevel Models
assumptions
  within-context and between-context variation in outcomes
  individuals within the same context share the same “random error” specific to that context
models are hierarchical
  individuals (level-1)
  contexts (level-2)


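Multilevel Models: Background
linear mixed model for continuous y (multilevel, random coefficients, etc.)

level-1 model and level-2 sub-models (hierarchical)

$$y_{ij} = \beta_{0i} + \beta_{1i} z_{ij} + \varepsilon_{ij}$$

$$\beta_{0i} = \gamma_{00} + \gamma_{01} x_i + u_{0i}$$

$$\beta_{1i} = \gamma_{10} + \gamma_{11} x_i + u_{1i}$$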


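Multilevel Models: Background
linear mixed model assumptions

level-1 and level-2 residuals

$$\varepsilon_{ij} \sim \text{Normal}(0, \sigma^2_\varepsilon)$$

$$\mathbf{u} = \begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix} \sim \text{MVN}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \boldsymbol\Sigma_u \right), \quad \text{where } \boldsymbol\Sigma_u = \begin{pmatrix} \sigma^2_0 & \sigma_{01} \\ \sigma_{01} & \sigma^2_1 \end{pmatrix}$$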


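Multilevel Models: Background
composite form

$$y_{ij} = \gamma_{00} + \gamma_{01} x_i + \gamma_{10} z_{ij} + \gamma_{11} x_i z_{ij} + u_{0i} + u_{1i} z_{ij} + \varepsilon_{ij}$$

fixed effects: $\gamma_{00} + \gamma_{01} x_i + \gamma_{10} z_{ij}$
cross-level interaction: $\gamma_{11} x_i z_{ij}$
random effects (level-2): $u_{0i} + u_{1i} z_{ij}$
composite residual: $u_{0i} + u_{1i} z_{ij} + \varepsilon_{ij}$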


Multilevel Models: Background
variance components

within group: $\operatorname{var}(\varepsilon_{ij})$

between group: $\operatorname{var}(u_{0i} + u_{1i} z_{ij})$

total: $\operatorname{var}(u_{0i} + u_{1i} z_{ij} + \varepsilon_{ij})$


Multilevel Models: Background
general form (linear mixed model)

$$y_{ij} = \mathbf{x}_{ij}'\boldsymbol\beta + \mathbf{z}_{ij}'\mathbf{u}_i + \varepsilon_{ij}$$

x: variables associated with fixed coefficients
z: variables associated with random coefficients


Multilevel Models: Logit Models
binomial model (random effect)

$$\text{logit}(p_{ij}) = \mathbf{x}_{ij}'\boldsymbol\beta + u_i$$

assumptions

$$u_i \sim \text{Normal}(0, \sigma^2_u)$$

  u increases or decreases the expected response for individual j in context i independently of x
  all individuals in context i share the same value of u
  also called a random intercept model:

$$\beta_{0i} = \beta_0 + u_i$$


Multilevel Models
a hierarchical model:

$$\text{logit}(p_{ij}) = \beta_{0i} + \beta_1 z_{ij}$$

and

$$\beta_{0i} = \gamma_{00} + \gamma_{01} x_i + u_i$$

  z is a level-1 variable; x is a level-2 variable
  random intercept varies among level-2 units
  note: level-1 residual variance is fixed (why?)


Multilevel Models
a general expression

$$\text{logit}(p_{ij}) = \mathbf{x}_{ij}'\boldsymbol\beta + \mathbf{z}_{ij}'\mathbf{u}_i$$

  x are variables associated with “fixed” coefficients
  z are variables associated with “random” coefficients
  u is a multivariate normal vector of level-2 residuals
  mean of u is 0; covariance of u is $\boldsymbol\Sigma_u$


Multilevel Models
random effects vs. random coefficients
  random effects: u
  random coefficients: β + u
variance components
  interested in level-2 variation in u
prediction
  E(y) is not equal to E(y|u)
  model-based predictions need to consider random effects

$$E(y_{ij} \mid u_i, \mathbf{x}_{ij}) = \Lambda(\mathbf{x}_{ij}'\boldsymbol\beta + u_i)$$

where Λ denotes the inverse logit (logistic cdf).


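Multilevel Models: Generalized Linear Mixed Models (GLMM)

Conditional Expectation

$$E(y_{ij} \mid u_i, \mathbf{x}_{ij}) = \Lambda(\mathbf{x}_{ij}'\boldsymbol\beta + u_i)$$

Marginal Expectation

$$E(y_{ij} \mid \mathbf{x}_{ij}) = E[E(y_{ij} \mid u_i, \mathbf{x}_{ij})] = \int_u \Lambda(\mathbf{x}_{ij}'\boldsymbol\beta + u)\, g(u)\, du$$

where Λ denotes the inverse logit and g(u) is the density of the random effect.

requires numerical integration or simulation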
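To illustrate the difference between the conditional and marginal expectations, the sketch below approximates the integral with a simple trapezoid rule, assuming a normal random intercept. The intercept (-2.917541) and random-effect standard deviation (1.752456) are taken from the null xtmelogit output reported later in these slides; the function names, integration width, and step count are illustrative choices, and this is not how Stata performs the integration (it uses quadrature).

```python
import math

def inv_logit(eta):
    """Inverse logit (logistic cdf)."""
    return 1.0 / (1.0 + math.exp(-eta))

def normal_pdf(u, sigma):
    """Density of a Normal(0, sigma^2) random effect."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def marginal_probability(eta, sigma_u, width=8.0, steps=4000):
    """Trapezoid-rule approximation of the integral of
    inv_logit(eta + u) * g(u) over u in [-width*sigma_u, +width*sigma_u]."""
    lo = -width * sigma_u
    h = 2.0 * width * sigma_u / steps
    total = 0.0
    for k in range(steps + 1):
        u = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * inv_logit(eta + u) * normal_pdf(u, sigma_u)
    return total * h

eta = -2.917541      # fixed intercept from the null random-intercept model
sigma_u = 1.752456   # sd(_cons) from the same output

conditional = inv_logit(eta)                    # E(y | u = 0)
marginal = marginal_probability(eta, sigma_u)   # E(y), averaging over u
print(round(conditional, 4), round(marginal, 4))
```

Because the conditional probability at u = 0 is well below 0.5, averaging over the random effect pulls the marginal probability toward 0.5, so the two quantities differ.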


Data Structure
multilevel data structure
  requires a “context” id to identify individuals belonging to the same context
  NLSY sibling data contains a “family id” (constructed by researcher)
  data are unbalanced (we do not require clusters to be the same size)
  small clusters will contribute less information to the estimation of variance components than larger clusters
  it is OK to have clusters of size 1 (i.e., an individual is a context unto themselves)
  clusters of size 1 contribute to the estimation of fixed effects but not to the estimation of variance components


Example: clustered data
siblings nested in families
  y is 1st premarital birth for NLSY women
  select sib-ships of size > 2
  null model (random intercept):

xtlogit fpmbir, i(famid)

or

xtmelogit fpmbir || famid:


Example: clustered data

random intercept: xtlogit

Log likelihood = -228.59345                  Prob > chi2 = .

fpmbir      Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
_cons       -2.888895   .3318566    -8.71   0.000   -3.539322   -2.238468
/lnsig2u    1.083066    .3992351                    .30058      1.865553
sigma_u     1.71864     .3430707                    1.162171    2.541556
rho         .4730808    .0995195                    .2910546    .662556

Likelihood-ratio test of rho=0: chibar2(01) = 20.58   Prob >= chibar2 = 0.000
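The rho reported by xtlogit is the intraclass correlation on the latent (logit) scale, where the level-1 variance is fixed at π²/3. As a quick check in Python (variable names are illustrative), the reported sigma_u reproduces both rho and /lnsig2u:

```python
import math

sigma_u = 1.71864          # sigma_u from the xtlogit output
lnsig2u = 1.083066         # /lnsig2u from the same output

var_u = sigma_u ** 2                       # level-2 variance
rho = var_u / (var_u + math.pi ** 2 / 3)   # share of latent variance between families
print(round(var_u, 4), round(rho, 4))
```

Roughly 47% of the variance in the latent propensity lies between families in the null model.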


Example: clustered data

random intercept: xtmelogit

Integration points = 7                       Wald chi2(0) = .
Log likelihood = -228.51781                  Prob > chi2  = .

fpmbir      Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
_cons       -2.917541   .3479598    -8.38   0.000   -3.59953    -2.235552

Random-effects Parameters   Estimate    Std. Err.   [95% Conf. Interval]
famid: Identity
  sd(_cons)                 1.752456    .3601534    1.171423    2.621685

LR test vs. logistic regression: chibar2(01) = 20.73   Prob>=chibar2 = 0.0000


Variance Component
add predictors (mostly level-2)

Integration points = 7                       Wald chi2(6) = 22.48
Log likelihood = -215.39646                  Prob > chi2  = 0.0010

fpmbir      Odds Ratio  Std. Err.   z       P>|z|   [95% Conf. Interval]
nonint      3.356608    1.435222    2.83    0.005   1.451921    7.759938
nsibs       1.112501    .1032876    1.15    0.251   .9274119    1.33453
medu        .8050785    .060073     -2.91   0.004   .6955425    .9318647
inc         .8848917    .2858459    -0.38   0.705   .4698153    1.666683
consprot    1.614657    .6110603    1.27    0.206   .7690355    3.390111
weekly      .885648     .296273     -0.36   0.717   .4597391    1.706125

Random-effects Parameters   Estimate    Std. Err.   [95% Conf. Interval]
famid: Identity
  sd(_cons)                 1.451511    .3515003    .9030084    2.333182


Variance Component
conditional variance in u is 2.107
proportionate reduction in error (PRE)

$$\text{PRE} = \frac{\sigma^2_{u_r} - \sigma^2_{u_c}}{\sigma^2_{u_r}} = \frac{3.062 - 2.107}{3.062} = 0.312$$

a 31% reduction in level-2 variance when level-2 predictors are accounted for
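The PRE arithmetic can be reproduced directly from the variances quoted on this slide. The function name below is illustrative; the last line squares the sd(_cons) reported for the model with predictors to recover the conditional variance.

```python
def proportionate_reduction(var_null, var_conditional):
    """Share of the null-model level-2 variance explained by added predictors."""
    return (var_null - var_conditional) / var_null

pre = proportionate_reduction(3.062, 2.107)   # variances as given on the slide
conditional_var = 1.451511 ** 2               # sd(_cons) from the model with predictors
print(round(pre, 3), round(conditional_var, 3))
```

Note that squaring the null-model sd(_cons) of 1.752456 gives about 3.071 rather than 3.062, so the slide's null variance may reflect rounding of the standard deviation; the PRE is about 0.31 either way.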


Random Effects
we can examine the distribution of random effects

[Figure: density (0 to 3) of the random effects for famid: _cons, plotted over -1 to 3]


Random Effects
we can examine the distribution of random effects

. sum u, detail

random effects for famid: _cons

       Percentiles    Smallest
 1%    -.7111417      -.9210778
 5%    -.5100672      -.9210778
10%    -.388522       -.8339383      Obs            653
25%    -.2422871      -.8339383      Sum of Wgt.    653

50%    -.1484184                     Mean           .1132598
                      Largest        Std. Dev.      .7006756
75%    -.0689377      2.431446
90%    1.337971       2.431446       Variance       .4909462
95%    1.523062       2.583755       Skewness       1.688026
99%    2.405483       2.583755       Kurtosis       4.818971


Random Effects Distribution
90th percentile: u90 = 1.338
10th percentile: u10 = -0.388

the odds for a family at the 90th percentile are

exp(1.338 - (-0.388)) = exp(1.726) ≈ 5.62

times higher than for a family at the 10th percentile

even if families are compositionally identical on covariates, we can assess the hypothetical differential in risks


Growth Curve Models
growth models
  individuals are level-2 units
  repeated measures over time on individuals (level-1)
models imply that logits vary across individuals
  intercept (conditional average logit) varies
  slope (conditional average effect of time) varies
  change is usually assumed to be linear
use GLMM
  complications due to dimensionality
  intercept and slope may co-vary (necessitating a more complex model) and more


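Growth Curve Models
multilevel logit model for change over time

$$\text{logit}(p_{ij}) = \beta_{0i} + \beta_{1i} T_{ij}$$

  T is time (strictly increasing)
fixed and random coefficients (with covariates)

$$\text{logit}(p_{ij}) = \beta_{0i} + \beta_{1i} T_{ij}$$

$$\beta_{0i} = \gamma_{00} + \gamma_{01} X_i + u_{0i}$$

$$\beta_{1i} = \gamma_{10} + \gamma_{11} X_i + u_{1i}$$

  assume that u0 and u1 are bivariate normal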


Multilevel Logit Models for Change
Example: Log odds of employment of black men in the U.S. 1982-1988 (NLSY) (consider 5 years in this period)
  time is coded 0, 1, 3, 4, 6
  dependent variable is: not-working, not-in-school
  unconditional growth (no covariates except T)
  conditional growth (add covariates)
  note: cross-level interactions implied by composite model

$$\text{logit}(p_{ij}) = \gamma_{00} + \gamma_{01} X_i + \gamma_{10} T_{ij} + \gamma_{11} X_i T_{ij} + u_{0i} + u_{1i} T_{ij}$$


Fitting Multilevel Model for Change
programming

Stata (unconditional growth)

xtmelogit y year || id: year, var cov(un)

Stata (conditional growth)

xtmelogit y year south unem unemyr inc hs || id: year, var cov(un)


Fitting Multilevel Model for Change

Mixed-effects logistic regression            Number of obs      = 3430
Group variable: id                           Number of groups   = 686

                                             Obs per group: min = 5
                                                            avg = 5.0
                                                            max = 5

Integration points = 7                       Wald chi2(1)       = 24.94
Log likelihood = -1916.0409                  Prob > chi2        = 0.0000

y        Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
year     -.1467877   .0293921    -4.99   0.000   -.2043952   -.0891801
_cons    -.8742502   .0972809    -8.99   0.000   -1.064917   -.6835831

Random-effects Parameters   Estimate    Std. Err.   [95% Conf. Interval]
id: Unstructured
  var(year)                 .0552714    .0241599    .0234654    .1301886
  var(_cons)                1.796561    .4330881    1.120075    2.881622
  cov(year,_cons)           -.0517392   .0789636    -.206505    .1030266

LR test vs. logistic regression: chi2(3) = 250.61   Prob > chi2 = 0.0000


Fitting Multilevel Logit Model for Change

Mixed-effects logistic regression            Number of obs      = 3430
Group variable: id                           Number of groups   = 686

                                             Obs per group: min = 5
                                                            avg = 5.0
                                                            max = 5

Integration points = 7                       Wald chi2(6)       = 123.80
Log likelihood = -1868.0104                  Prob > chi2        = 0.0000

y        Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
year     -.0921512   .0281795    -3.27   0.001   -.1473819   -.0369205
south    -.6523682   .1283314    -5.08   0.000   -.9038931   -.4008434
unem     1.014915    .2408795    4.21    0.000   .5428002    1.48703
unemyr   -.1120936   .0641975    -1.75   0.081   -.2379184   .0137313
inc      -.5732738   .1872211    -3.06   0.002   -.9402205   -.2063271
hs       -.785545    .1242026    -6.32   0.000   -1.028978   -.5421124
_cons    -.0612559   .1285939    -0.48   0.634   -.3132954   .1907836

Random-effects Parameters   Estimate    Std. Err.   [95% Conf. Interval]
id: Unstructured
  var(year)                 .0433477    .0219905    .016038     .1171612
  var(_cons)                1.304833    .3648705    .7542816    2.257233
  cov(year,_cons)           -.0622441   .0708861    -.2011783   .07669

LR test vs. logistic regression: chi2(3) = 140.20   Prob > chi2 = 0.0000


Logits: Observed, Conditional, and Marginal

the log odds of idleness decreases with time and shows variation in level and change


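Composite Residuals in a Growth Model
composite residual

$$r_{ij} = u_{0i} + u_{1i} T_{ij} + \varepsilon_{ij}$$

composite residual variance

$$\operatorname{var}(r_{ij}) = \sigma^2_0 + T_j^2 \sigma^2_1 + 2 T_j \sigma_{01} + \frac{\pi^2}{3}$$

covariance of composite residual

$$\operatorname{cov}(r_{ij}, r_{ij'}) = \sigma^2_0 + T_j T_{j'} \sigma^2_1 + (T_j + T_{j'}) \sigma_{01}$$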
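As a rough illustration of how the composite residual variance changes over time, the sketch below plugs the variance components from the unconditional growth output (var(_cons), var(year), cov(year,_cons)) into the formulas above, with the level-1 variance fixed at π²/3 on the logit scale. The function names are illustrative.

```python
import math

LOGISTIC_VAR = math.pi ** 2 / 3  # fixed level-1 variance on the logit scale

def composite_variance(t, var0, var1, cov01):
    """var(r_ij) at time t: intercept variance, slope variance, their covariance, level-1 variance."""
    return var0 + t ** 2 * var1 + 2 * t * cov01 + LOGISTIC_VAR

def composite_covariance(t, t_prime, var0, var1, cov01):
    """cov(r_ij, r_ij') between two occasions t and t_prime for the same individual."""
    return var0 + t * t_prime * var1 + (t + t_prime) * cov01

# Estimates from the unconditional growth model output.
var0, var1, cov01 = 1.796561, 0.0552714, -0.0517392

for t in (0, 1, 3, 4, 6):  # time codes used in the example
    print(t, round(composite_variance(t, var0, var1, cov01), 3))
```

The variance is not constant across occasions: it depends on time through the slope variance and the intercept-slope covariance.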


Model
covariance term is not significantly different from 0 (in either model)
  results in simplified interpretation
  easier estimation via variance components (default option)
significant variation in slopes and initial levels
other results:
  log odds of idleness decrease over time (negative slope)
  other covariates except county unemployment have significant effects on the odds of idleness
  the main effects are interpreted as effects on initial logits (at time 1, i.e., t = 0, the 1982 baseline)
  interaction of time and unemployment rate captures the effect of county unemployment rate in 1982 on the change in log odds of idleness
  the positive effect implies that higher county unemployment tends to dampen change in odds


IRT Models
IRT models
  Item Response Theory
  models account for an individual-level random effect on a set of items (i.e., ability)
  items are assumed to tap a single latent construct (aptitude on a specific subject)
item difficulty
  test items are assumed to be ordered on a difficulty scale, from easier to harder
  expected patterns emerge whereby if a more difficult item is answered correctly the easier items are likely to have been answered correctly


IRT Models
IRT models
  1-parameter logistic (Rasch) model

$$\text{logit}(p_{ij}) = \theta_i - b_j$$

  pij: individual i’s probability of a correct response on the jth item
  θi: individual i’s ability
  bj: item j’s difficulty
properties
  an individual’s ability parameter is invariant with respect to the item
  the difficulty parameter is invariant with respect to individual’s ability
  higher ability or lower item difficulty lead to a higher probability of a correct response
  both ability and difficulty are measured on the same scale
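A small Python sketch of the Rasch response probability may help fix ideas. The abilities and difficulties below are hypothetical values chosen for illustration, and the function name is not from the slides.

```python
import math

def rasch_probability(theta, b):
    """Pr(correct) under the 1PL model: inverse logit of (ability - difficulty)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

difficulties = [-1.0, 0.0, 1.0]  # hypothetical items, easiest to hardest
for theta in (-2.0, 0.0, 2.0):   # hypothetical abilities
    probs = [round(rasch_probability(theta, b), 3) for b in difficulties]
    print(theta, probs)
```

When ability equals difficulty the probability of a correct response is 0.5, and for any fixed ability the probability falls as item difficulty rises.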


ICC

item characteristic curve (item response curve)
  depicts the probability of a correct response as a function of an examinee’s ability or trait level
  curves are shifted rightward with increasing item difficulty
  assume that item 3 is more difficult than item 2 and item 2 is more difficult than item 1
  probability of a correct response decreases as the threshold θ = bj is crossed, reflecting increasing item difficulty


IRT Models: ICC (3 Items)

[Figure: item characteristic curves for 3 items, with item difficulties bj marked]

slopes of item characteristic curves are equal when ability = item difficulty


Estimation as GLMM
specification:

$$\text{logit}(p_{ij}) = \beta_j + u_i = \mathbf{x}_{ij}'\boldsymbol\beta + u_i$$

  set up a person-item data structure
  define x as a set of dummy variables
  change signs on β to reflect “difficulty”
  fit model without intercept to estimate all item difficulties
  normalization is common:

$$\sum_{j=1}^{J} \beta_j = 0 \quad \text{and} \quad \sigma^2_u = 1.0$$


PL1 Estimation
Stata (data set up)

clear
set memory 128m
infile junk y1-y5 f using LSAT.dat
drop if junk==11 | junk==13
expand f
drop f junk
gen cons = 1
collapse (sum) wt2=cons, by(y1-y5)
gen id = _n
sort id
reshape long y, i(id) j(item)


PL1 Estimation
Stata (model set up)

gen i1 = 0
gen i2 = 0
gen i3 = 0
gen i4 = 0
gen i5 = 0
replace i1 = 1 if item == 1
replace i2 = 1 if item == 2
replace i3 = 1 if item == 3
replace i4 = 1 if item == 4
replace i5 = 1 if item == 5
* 1PL
* constrain sd=1
cons 1 [id1]_cons = 1
gllamm y i1-i5, i(id) weight(wt) nocons family(binom) cons(1) link(logit) adapt


PL1 Estimation
Stata (output)

number of level 1 units = 5000
number of level 2 units = 1000

Condition Number = 1.8420141

gllamm model with constraints:
 ( 1) [id1]_cons = 1

log likelihood = -2473.054321704064

      Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
i1    2.871972    .1287498    22.31   0.000   2.619627    3.124317
i2    1.063026    .0821146    12.95   0.000   .902084     1.223967
i3    .2576052    .0765907    3.36    0.001   .1074903    .4077202
i4    1.388057    .086496     16.05   0.000   1.218528    1.557586
i5    2.218779    .104828     21.17   0.000   2.01332     2.424238

Variances and covariances of random effects
------------------------------------------------------------------------------
***level 2 (id)
    var(1): 1 (0)
------------------------------------------------------------------------------


PL1 Estimation
Stata (parameter normalization)

* normalized solution
*[1 -- standard 1PL]
*[2 -- coefs sum to 0] [var = 1]
mata
bALL = st_matrix("e(b)")
b = -bALL[1,1..5]
mb = mean(b')
bs = b:-mb
("MML Estimates", "IRT parameters", "B-A Normalization")
(-b', b', bs')
end


PL1 Estimation

Stata (normalized solution)

param   MML Estimates   IRT      Normalized
1       2.87            -2.87    -1.31
2       1.06            -1.06    0.50
3       0.26            -0.26    1.30
4       1.39            -1.39    0.17
5       2.22            -2.22    -0.66
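The normalized column can be reproduced outside Stata. The Python sketch below mirrors the Mata steps (flip the signs of the item coefficients, then subtract their mean) using the i1-i5 coefficients from the gllamm output; variable names are illustrative.

```python
# Item coefficients i1-i5 from the 1PL gllamm output (MML estimates).
coefs = [2.871972, 1.063026, 0.2576052, 1.388057, 2.218779]

# IRT difficulty is the negative of the item coefficient.
difficulty = [-c for c in coefs]

# Normalize so that the difficulties sum to zero.
mean_difficulty = sum(difficulty) / len(difficulty)
normalized = [b - mean_difficulty for b in difficulty]

print([round(b, 2) for b in difficulty])
print([round(b, 2) for b in normalized])
```

Item 3 has the largest normalized difficulty and item 1 the smallest, matching the table.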


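IRT: Extensions
2-parameter logistic (2PL) model

$$\text{logit}(p_{ij}) = a_j(\theta_i - b_j) = \beta_j + \lambda_j u_i = \mathbf{x}_{ij}'\boldsymbol\beta + \mathbf{x}_{ij}'\boldsymbol\lambda\, u_i$$

$$b_j = -\frac{\beta_j}{\lambda_j}$$

$\lambda_j$ is a factor loading on the random effect

$a_j = \lambda_j$: item discrimination parameters

$$\sum_j b_j = 0 \quad \text{and} \quad \prod_j a_j = 1 \quad \text{(normalization)}$$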


IRT: Extensions
2-parameter logistic (2PL) model
item discrimination parameters
  reveal differences in item’s utility to distinguish different ability levels among examinees
  high values denote items that are more useful in terms of separating examinees into different ability levels
  low values denote items that are less useful in distinguishing examinees in terms of ability
  ICCs corresponding to this model can intersect as they differ in location and slope
  steeper slope of the ICC is associated with a better discriminating item


IRT: Extensions 2-parameter logistic (2PL) model


IRT: Extensions 2-parameter logistic (2PL) model

Stata (estimation)

eq id: i1 i2 i3 i4 i5
cons 1 [id1_1]i1 = 1
gllamm y i1-i5, i(id) weight(wt) nocons family(binom) link(logit) frload(1) eqs(id) cons(1) adapt
matrix list e(b)
* normalized solutions
* 1 standard 2PL
mata
bALL = st_matrix("e(b)")
b = bALL[1,1..5]
c = bALL[1,6..10]
a = -b:/c
("MML Estimates-Dif", "IRT Parameters")
(b', a')
("MML Discrimination Parameters")
(c')
end


IRT: Extensions 2-parameter logistic (2PL) model

Stata (estimation)

* Bock and Aitkin Normalization (p. 164 corrected)
mata
bALL = st_matrix("e(b)")
b = -bALL[1,1..5]
c = bALL[1,6..10]
lc = ln(c)
mb = mean(b')
mc = mean(lc')
bs = b:-mb
cs = exp(lc:-mc)
("B-A Normalization DIFFICULTY", "B-A Normalization DISCRIMINATION")
(bs', cs')
end


IRT: 2PL (1)

log likelihood = -2466.653343760672

      Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
i1    2.773234    .205743     13.48   0.000   2.369985    3.176483
i2    .9901996    .0900182    11.00   0.000   .8137672    1.166632
i3    .24915      .0762746    3.27    0.001   .0996546    .3986454
i4    1.284755    .0990363    12.97   0.000   1.090647    1.478862
i5    2.053265    .1353574    15.17   0.000   1.78797     2.318561

Variances and covariances of random effects
------------------------------------------------------------------------------
***level 2 (id)
    var(1): 1 (0)

    loadings for random effect 1
    i1: .82565942 (.25811315)
    i2: .72273928 (.18667773)
    i3: .890914 (.2328178)
    i4: .68836241 (.18513868)
    i5: .65684452 (.20990788)


IRT: 2PL (2)
Bock-Aitkin Normalization

B-A Normalization

item     Item Difficulty Parameter   Discrimination Parameter
1        -1.30                       1.10
2        0.48                        0.96
3        1.22                        1.18
4        0.19                        0.92
5        -0.58                       0.87
check    0                           1

item 3 has highest difficulty and greatest discrimination


1PL and 2PL


1PL and 2PL


Binary Response Models for Event Occurrence
discrete-time event-history models
purpose:
  model the probability of an event occurring at some point in time
  Pr(event at t | event has not yet occurred by t)
life table
  events & trials
  observe the number of events occurring to those who remain at risk as time passes
  takes account of the changing composition of the sample as time passes


Life Table


Life Table
observe
  Rj: number at risk in time interval j (R0 = n), where the number at risk in interval j is adjusted over time
  Dj: events in time interval j (D0 = 0)
  Wj: removed from risk (censored) in time interval j (W0 = 0) (removed from risk due to other unrelated causes)

$$R_j = R_{j-1} - D_{j-1} - W_{j-1}$$


Life Table
other key quantities

discrete-time hazard (event probability in interval j)

$$\hat{p}_j = \frac{D_j}{R_j}$$

surviving fraction (survivor function in interval j)

$$\hat{S}_j = \prod_{k=1}^{j} (1 - \hat{p}_k)$$
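The life-table quantities can be computed with a short loop. The sketch below uses hypothetical counts (100 individuals followed over three intervals) and an illustrative function name; it applies the risk-set recursion with D0 = W0 = 0, then the hazard and survivor formulas.

```python
def life_table(n, events, censored):
    """events[j-1], censored[j-1]: counts D_j and W_j for intervals j = 1..J.
    Returns lists of R_j (at risk), p_j (hazard), and S_j (surviving fraction)."""
    at_risk, hazard, survival = [], [], []
    risk = n           # R_1 = R_0 - D_0 - W_0 = n, since D_0 = W_0 = 0
    surviving = 1.0
    for d, w in zip(events, censored):
        p = d / risk               # discrete-time hazard D_j / R_j
        surviving *= (1.0 - p)     # running product of (1 - p_k)
        at_risk.append(risk)
        hazard.append(p)
        survival.append(surviving)
        risk = risk - d - w        # next risk set: R_{j+1} = R_j - D_j - W_j
    return at_risk, hazard, survival

# Hypothetical data: events and censored cases in three intervals.
at_risk, hazard, survival = life_table(100, events=[10, 18, 12], censored=[0, 5, 4])
for j, (r, p, s) in enumerate(zip(at_risk, hazard, survival), start=1):
    print(j, r, round(p, 4), round(s, 4))
```

Censored cases leave the risk set without counting as events, which is how the table accounts for the changing composition of the sample.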


Discrete-Time Hazard Models

statistical concepts

discrete random variable $T_i$ (individual i's event or censoring time)

pdf of T (probability that individual i experiences event in period j):

$f(t_{ij}) = \Pr(T_i = j)$

cdf of T (probability that individual i experiences event in period j or earlier):

$F(t_{ij}) = \Pr(T_i \le j) = \sum_{k=1}^{j} f(t_{ik})$

survivor function (probability that individual i survives past period j):

$S(t_{ij}) = \Pr(T_i > j) = 1 - F(t_{ij})$


Discrete-Time Hazard Models

statistical concepts

discrete hazard: the conditional probability of event occurrence in interval j for individual i, given that an event has not already occurred to that individual before interval j

$p_{ij} = \Pr(T_i = j \mid T_i \ge j)$
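Since the hazard is the event probability among those who have not yet had the event, the pdf, cdf and survivor function for one individual can all be rebuilt from a sequence of hazards. The sketch below uses the identity f(t_j) = p_j S(t_{j-1}) with S(t_0) = 1, which follows from the definition of the hazard; the hazard values are hypothetical.

```python
def distribution_from_hazard(hazards):
    """Return (f, F, S) lists for periods 1..J given discrete hazards p_1..p_J."""
    f, F, S = [], [], []
    surv_prev = 1.0   # S(t_0) = 1: everyone is at risk before period 1
    cum = 0.0
    for p in hazards:
        f_j = p * surv_prev      # event in this period, having survived so far
        cum += f_j               # cdf accumulates the pdf
        surv_prev *= 1.0 - p     # survive this period as well
        f.append(f_j)
        F.append(cum)
        S.append(surv_prev)
    return f, F, S


# Hypothetical hazards for three periods
f, F, S = distribution_from_hazard([0.1, 0.2, 0.5])
print(f, F, S)
```

In this sketch S equals 1 - F in every period, and dividing f in period j by S in period j-1 returns the hazard that went in.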


Discrete-Time Hazard Models

equivalent expression using binary data

binary data: $d_{ij} = 1$ if individual i experiences an event in interval j, 0 otherwise

use the sequence of binary values at each interval to form a history of the process for individual i up to the time the event occurs

discrete hazard:

$p_{ij} = \Pr(d_{ij} = 1 \mid d_{i,j-1} = 0, d_{i,j-2} = 0, \ldots, d_{i1} = 0)$


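Discrete-Time Hazard Models

modeling (logit link):

$p_{ij} = \dfrac{\exp(\alpha_j + x_{ij}'\beta)}{1 + \exp(\alpha_j + x_{ij}'\beta)}$

modeling (complementary log-log link):

$p_{ij} = 1 - \exp[-\exp(\alpha_j + x_{ij}'\beta)]$

non-proportional effects:

$\operatorname{logit}(p_{ij}) = \alpha_j + x_{ij}'\beta_j$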
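The two link functions can be compared with a short sketch. It assumes the usual notation of a period-specific intercept (called alpha_j here) plus a covariate effect (xb, standing for the linear combination of covariates and coefficients); those names and all numeric values are illustrative assumptions.

```python
import math


def hazard_logit(alpha_j, xb):
    """Inverse logit link: hazard from the linear predictor alpha_j + xb."""
    eta = alpha_j + xb
    return 1.0 / (1.0 + math.exp(-eta))


def hazard_cloglog(alpha_j, xb):
    """Inverse complementary log-log link: 1 - exp(-exp(alpha_j + xb))."""
    return 1.0 - math.exp(-math.exp(alpha_j + xb))


# Hypothetical baseline (alpha_j = -2) and covariate effect (beta * x = 0.5)
p0 = hazard_logit(-2.0, 0.0)
p1 = hazard_logit(-2.0, 0.5)
odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))     # equals exp(0.5) under the logit link

q0 = hazard_cloglog(-2.0, 0.0)
q1 = hazard_cloglog(-2.0, 0.5)
log_surv_ratio = math.log(1 - q1) / math.log(1 - q0)   # equals exp(0.5) under cloglog

print(odds_ratio, log_surv_ratio, math.exp(0.5))
```

Under the logit link the covariate multiplies the odds of the event by exp(beta) in every period; under the complementary log-log link it multiplies log(1 - p) by the same factor. Letting beta vary with j, as in the non-proportional model, removes that constancy.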


Data Structure person-level data person-period form


Data Structure binary sequences
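A minimal sketch of the conversion from person-level data to person-period form, assuming each person-level record holds an identifier, the last observed period, and an event indicator (1 = event, 0 = censored). The key names `id`, `time` and `event` are hypothetical.

```python
def to_person_period(person_rows):
    """Expand one record per person into one record per person per period at risk."""
    out = []
    for person in person_rows:
        for period in range(1, person["time"] + 1):
            # d is 1 only in the final period of a person who experienced the event
            is_event = person["event"] == 1 and period == person["time"]
            out.append({"id": person["id"], "period": period, "d": int(is_event)})
    return out


# Hypothetical data: person 1 has the event in period 3, person 2 is censored in period 2
people = [
    {"id": 1, "time": 3, "event": 1},
    {"id": 2, "time": 2, "event": 0},
]
records = to_person_period(people)
for rec in records:
    print(rec)
```

Person 1 contributes the binary sequence 0, 0, 1 and person 2 contributes 0, 0, so a censored case is a sequence of zeros with no terminating 1.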


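Estimation

contributions to likelihood:

$L_i = \begin{cases} \Pr(T_i = j) = f(t_{ij}) & \text{if } d_{ij} = 1, \\ \Pr(T_i > j) = S(t_{ij}) & \text{if } d_{ij} = 0. \end{cases}$

contribution to log L for individual with event in period j:

$\log L_i = d_{ij} \log p_{ij} + \sum_{k=1}^{j-1} \log(1 - p_{ik})$

contribution to log L for individual censored in period j:

$\log L_i = \sum_{k=1}^{j} \log(1 - p_{ik})$

combine (j is the last period observed for individual i):

$\log L = \sum_{i=1}^{n} \sum_{k=1}^{j} \left[ d_{ik} \log p_{ik} + (1 - d_{ik}) \log(1 - p_{ik}) \right]$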
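The combined log-likelihood has the same form as the log-likelihood of independent binary responses over person-period records, which is why an ordinary binary logit routine can be used for estimation. The sketch below checks numerically that the event and censoring contributions agree with the binary form; the hazards are hypothetical and the function names are illustrative.

```python
import math


def loglik_individual(hazards, event):
    """Contribution of one individual observed through the last period in `hazards`.

    event=True: event in the last period; event=False: censored in the last period.
    """
    if event:
        return math.log(hazards[-1]) + sum(math.log(1 - p) for p in hazards[:-1])
    return sum(math.log(1 - p) for p in hazards)


def loglik_person_period(hazards, d):
    """Same contribution written as a sum of binary-response terms over periods."""
    return sum(dk * math.log(p) + (1 - dk) * math.log(1 - p)
               for p, dk in zip(hazards, d))


hazards = [0.1, 0.2, 0.3]   # hypothetical p_i1, p_i2, p_i3

event_direct = loglik_individual(hazards, event=True)
event_binary = loglik_person_period(hazards, d=[0, 0, 1])

cens_direct = loglik_individual(hazards, event=False)
cens_binary = loglik_person_period(hazards, d=[0, 0, 0])

print(event_direct, event_binary)
print(cens_direct, cens_binary)
```

Summing these contributions over all individuals gives the combined log L.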


Example: dropping out of Ph.D. programs (large US university)

data: 6,964 individual histories spanning 20 years

dropout cannot be distinguished from other types of leaving (transfer to other program etc.)

model the logit hazard of leaving the originally-entered program as a function of the following:

time in program (the time-dependent baseline hazard)

female and percent female in program

race/ethnicity (black, Hispanic, Asian)

marital status

GRE score

also add a program-specific random effect (multilevel)


Example:

* read the person-period data
clear
set memory 512m
infile CID devnt I1-I5 female pctfem black hisp asian married gre using DT28432.dat

* m1: baseline hazard only (I1-I5 are the period indicators, so no constant)
logit devnt I1-I5, nocons or
est store m1

* m2-m4: add covariates in blocks; "or" reports odds ratios
logit devnt I1-I5 female pctfem, nocons or
est store m2
logit devnt I1-I5 female pctfem black hisp asian, nocons or
est store m3
logit devnt I1-I5 female pctfem black hisp asian married, nocons or
est store m4

* full model
logit devnt I1-I5 female pctfem black hisp asian married gre, nocons or