discrete choice modeling

Discrete Choice Modeling. William Greene, Stern School of Business. IFS at UCL, February 11-13, 2004. http://cemmap.ifs.org.uk/resources/files/resources_greene_discrete.shtml



Page 1: Discrete Choice Modeling

Discrete Choice Modeling

William Greene, Stern School of Business, IFS at UCL, February 11-13, 2004

http://cemmap.ifs.org.uk/resources/files/resources_greene_discrete.shtml

Page 2: Discrete Choice Modeling

Part 3

Modeling Binary Choice

Page 3: Discrete Choice Modeling

A Model for Binary Choice

• Yes or No decision (Buy/Not buy)
• Example: choose to fly or not to fly to a destination when there are alternatives.
• Model: Net utility of flying, $U_{fly} = \alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income} + \varepsilon$
• Choose to fly if net utility is positive
• Data: X = [1, cost, terminal time], Z = [income]; y = 1 if choose fly ($U_{fly} > 0$), 0 if not.

Page 4: Discrete Choice Modeling

What Can Be Learned from the Data? (A Sample of Consumers, i = 1,…,N)

• Are the attributes “relevant?”

• Predicting behavior

- Individual

- Aggregate

• Analyze changes in behavior when attributes change

Page 5: Discrete Choice Modeling

Application

• 210 Commuters Between Sydney and Melbourne
• Available modes = Air, Train, Bus, Car
• Observed:
  - Choice
  - Attributes: Cost, terminal time, other
  - Characteristics: Household income
• First application: Fly or other

Page 6: Discrete Choice Modeling

Binary Choice Data

Choose Air   Gen.Cost   Term Time   Income
1.0000       86.000     25.000      70.000
.00000       67.000     69.000      60.000
.00000       77.000     64.000      20.000
.00000       69.000     69.000      15.000
.00000       77.000     64.000      30.000
.00000       71.000     64.000      26.000
.00000       58.000     64.000      35.000
.00000       71.000     69.000      12.000
.00000       100.00     64.000      70.000
1.0000       158.00     30.000      50.000
1.0000       136.00     45.000      40.000
1.0000       103.00     30.000      70.000
.00000       77.000     69.000      10.000
1.0000       197.00     45.000      26.000
.00000       129.00     64.000      50.000
.00000       123.00     64.000      70.000

Page 7: Discrete Choice Modeling

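An Econometric Model

• Choose to fly iff $U_{fly} > 0$
  - $U_{fly} = \alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income} + \varepsilon$
  - $U_{fly} > 0 \iff \varepsilon > -(\alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income})$
• Probability model: For any person observed by the analyst, $\text{Prob(fly)} = \text{Prob}[\varepsilon > -(\alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income})]$
• Note the relationship between the unobserved $\varepsilon$ and the outcome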

Page 8: Discrete Choice Modeling

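A Regression-Like Model

[Figure: Pr[Fly] (vertical axis, 0 to 1.0) plotted against INDEX (horizontal axis, -3.0 to 3.0), where the index is $\alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income}$.]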

Page 9: Discrete Choice Modeling

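Econometrics

• How to estimate $\alpha$, $\beta_1$, $\beta_2$, $\gamma$?
  - It’s not regression
  - The technique of maximum likelihood: $L = \prod_{y=0} \text{Prob}[y=0] \times \prod_{y=1} \text{Prob}[y=1]$
  - $\text{Prob}[y=1] = \text{Prob}[\varepsilon > -(\alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income})]$
  - $\text{Prob}[y=0] = 1 - \text{Prob}[y=1]$
• Requires a model for the probability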
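A minimal Python sketch of the likelihood on this slide: the log of L is the sum of log Prob[y=1] over the observations with y=1 plus the sum of log Prob[y=0] over those with y=0. The slide has not yet chosen a distribution, so the standard normal (probit) form used here is an assumption. The sixteen rows are the excerpt listed on the Binary Choice Data slide, the coefficient values are the probit estimates reported later in the deck for the full sample, and the names `DATA`, `norm_cdf` and `log_likelihood` are illustrative only.

```python
import math

# (choose_air, gen_cost, terminal_time, income): the 16 rows listed on the
# "Binary Choice Data" slide, an excerpt of the 210 observations.
DATA = [
    (1, 86, 25, 70), (0, 67, 69, 60), (0, 77, 64, 20), (0, 69, 69, 15),
    (0, 77, 64, 30), (0, 71, 64, 26), (0, 58, 64, 35), (0, 71, 69, 12),
    (0, 100, 64, 70), (1, 158, 30, 50), (1, 136, 45, 40), (1, 103, 30, 70),
    (0, 77, 69, 10), (1, 197, 45, 26), (0, 129, 64, 50), (0, 123, 64, 70),
]

def norm_cdf(z):
    """Standard normal CDF (assumed probit form for the probability)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood(params, data, cdf=norm_cdf):
    """log L = sum over y=1 of log Prob[y=1] + sum over y=0 of log Prob[y=0]."""
    alpha, beta1, beta2, gamma = params
    total = 0.0
    for y, cost, time, income in data:
        index = alpha + beta1 * cost + beta2 * time + gamma * income
        p1 = cdf(index)  # Prob[eps > -index] when the distribution is symmetric
        total += math.log(p1) if y == 1 else math.log(1.0 - p1)
    return total

# Probit estimates reported for the full sample (constant, GC, TTME, HINC).
probit_estimates = (0.43877183, 0.01256304, -0.04778261, 0.01442242)
ll_zero = log_likelihood((0.0, 0.0, 0.0, 0.0), DATA)
ll_est = log_likelihood(probit_estimates, DATA)
print(f"log L at zero coefficients: {ll_zero:.4f}")
print(f"log L at probit estimates:  {ll_est:.4f}")
```

Maximum likelihood searches for the parameter values that make this function as large as possible. The estimates used here were fit to all 210 observations, so they are not the maximizer for the 16-row excerpt; they only illustrate that a sensible parameter vector gives a much higher log likelihood than all zeros.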

Page 10: Discrete Choice Modeling

Completing the Model: $F(\varepsilon)$

• The distribution
  - Normal: PROBIT, natural for behavior
  - Logistic: LOGIT, allows “thicker tails”
  - Gompertz: EXTREME VALUE, asymmetric, underlies the basic logit model for multiple choice
• Does it matter?
  - Yes, large difference in estimates
  - Not much, quantities of interest are more stable.

Page 11: Discrete Choice Modeling
Page 12: Discrete Choice Modeling

Estimated Binary Choice Model

+---------------------------------------------+
| Binomial Probit Model                       |
| Maximum Likelihood Estimates                |
| Model estimated: Jan 20, 2004 at 04:08:11PM.|
| Dependent variable                 MODE     |
| Weighting variable                 None     |
| Number of observations              210     |
| Iterations completed                  6     |
| Log likelihood function       -84.09172     |
| Restricted log likelihood     -123.7570     |
| Chi squared                    79.33066     |
| Degrees of freedom                    3     |
| Prob[ChiSqd > value] =         .0000000     |
| Hosmer-Lemeshow chi-squared =  46.96547     |
| P-value=  .00000 with deg.fr. =       8     |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Index function for probability
 Constant     .43877183       .62467004      .702   .4824
 GC           .01256304       .00368079     3.413   .0006   102.647619
 TTME        -.04778261       .00718440    -6.651   .0000   61.0095238
 HINC         .01442242       .00573994     2.513   .0120   34.5476190

Page 13: Discrete Choice Modeling

Estimated Binary Choice Models

Variable    LOGIT                  PROBIT                 EXTREME VALUE
            Estimate    t-ratio    Estimate    t-ratio    Estimate    t-ratio
Constant    1.78458     1.40591    0.438772    0.702406   1.45189     1.34775
GC          0.0214688   3.15342    0.012563    3.41314    0.0177719   3.14153
TTME       -0.098467   -5.9612    -0.0477826  -6.65089   -0.0868632  -5.91658
HINC        0.0223234   2.16781    0.0144224   2.51264    0.0176815   2.02876
Log-L      -80.9658               -84.0917               -76.5422
Log-L(0)   -123.757               -123.757               -123.757
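The estimates in this table can be used to check the earlier remark that the choice of distribution changes the coefficients a great deal but changes the quantities of interest much less. The Python sketch below evaluates the logit and probit models at the sample means reported in the probit output and compares the predicted probability of flying and the partial effect of terminal time. The extreme value column is left out because the slides do not state which parameterization was used; the function and variable names are illustrative assumptions.

```python
import math

# Sample means from the probit output (GC, TTME, HINC).
MEANS = {"GC": 102.647619, "TTME": 61.0095238, "HINC": 34.5476190}

# Estimates from the table: (constant, GC, TTME, HINC).
LOGIT = (1.78458, 0.0214688, -0.098467, 0.0223234)
PROBIT = (0.438772, 0.012563, -0.0477826, 0.0144224)

def index_at_means(b):
    """Index function evaluated at the sample means."""
    return b[0] + b[1] * MEANS["GC"] + b[2] * MEANS["TTME"] + b[3] * MEANS["HINC"]

def logit_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit_pdf(z):
    p = logit_cdf(z)
    return p * (1.0 - p)

def probit_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

z_logit, z_probit = index_at_means(LOGIT), index_at_means(PROBIT)
p_logit, p_probit = logit_cdf(z_logit), probit_cdf(z_probit)

# Partial effect of terminal time on Prob[fly]: density(index) * coefficient.
me_logit = logit_pdf(z_logit) * LOGIT[2]
me_probit = probit_pdf(z_probit) * PROBIT[2]

print(f"TTME coefficient ratio (logit/probit): {LOGIT[2] / PROBIT[2]:.2f}")
print(f"Prob[fly] at means: logit {p_logit:.3f}, probit {p_probit:.3f}")
print(f"TTME partial effect: logit {me_logit:.4f}, probit {me_probit:.4f}")
```

The logit coefficient on TTME is roughly twice the probit coefficient, while the implied probabilities and partial effects at the means are close to each other.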

Page 14: Discrete Choice Modeling

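A Regression-Like Model

[Figure: Pr[Fly] (vertical axis, 0 to 1.0) plotted against INDEX (horizontal axis, -3.0 to 3.0), with the index evaluated at $\alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma (\text{Income} + 1)$.]

Effect on predicted probability of an increase in income ($\gamma$ is positive)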
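The effect described on this slide can be illustrated numerically. The sketch below is an assumption-laden example rather than the author's calculation: it uses the probit estimates and sample means from the estimation output, holds cost and terminal time at their means, and raises income by one unit (the slides do not state the income units). All names are illustrative.

```python
import math

def probit_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Probit estimates and sample means from the estimation output.
a, b_cost, b_time, c_income = 0.43877183, 0.01256304, -0.04778261, 0.01442242
mean_cost, mean_time, mean_income = 102.647619, 61.0095238, 34.5476190

def prob_fly(income):
    """Pr[Fly] with cost and terminal time held at their sample means."""
    index = a + b_cost * mean_cost + b_time * mean_time + c_income * income
    return probit_cdf(index)

p_base = prob_fly(mean_income)
p_plus = prob_fly(mean_income + 1.0)
print(f"Prob[fly] at mean income:     {p_base:.4f}")
print(f"Prob[fly] at mean income + 1: {p_plus:.4f}")
print(f"Change in probability:        {p_plus - p_base:.4f}")
```

Because the income coefficient is positive, the index shifts to the right and the predicted probability rises; the size of the rise depends on where along the curve the index starts.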

Page 15: Discrete Choice Modeling

How Well Does the Model Fit?

• There is no R squared
• “Fit measures” computed from log L
  - “Pseudo R squared” = 1 - logL/logL0
  - Others… - these do not measure fit.
• Direct assessment of the effectiveness of the model at predicting the outcome

Page 16: Discrete Choice Modeling

Fit Measures for Binary Choice

• Likelihood Ratio Index
  - Bounded by 0 and 1
  - Rises when the model is expanded
• Cramer (and others)
  - $\hat{\lambda} = (\text{Mean } \hat{F} \mid y = 1) - (\text{Mean } \hat{F} \mid y = 0)$
  - = reward for correct predictions minus penalty for incorrect predictions

Page 17: Discrete Choice Modeling

Fit Measures for the Logit Model

+----------------------------------------+
| Fit Measures for Binomial Choice Model |
| Probit model for variable MODE         |
+----------------------------------------+
| Proportions P0= .723810   P1= .276190  |
| N =     210 N0=     152   N1=      58  |
| LogL = -84.09172 LogL0 = -123.7570     |
| Estrella = 1-(L/L0)^(-2L0/n) = .36583  |
+----------------------------------------+
|     Efron |  McFadden  |  Ben./Lerman  |
|    .45620 |    .32051  |       .75897  |
|    Cramer | Veall/Zim. |     Rsqrd_ML  |
|    .40834 |    .50682  |       .31461  |
+----------------------------------------+
| Information  Akaike I.C. Schwarz I.C.  |
| Criteria         .83897    189.57187   |
+----------------------------------------+

Pseudo – R-squared
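Several of the reported measures depend only on the two log likelihoods, the sample size, and the number of parameters, so they can be recomputed as a check. In the Python sketch below, the Estrella formula is the one printed in the output and the McFadden measure is the likelihood ratio index, 1 - logL/logL0. The forms used for Rsqrd_ML and the two information criteria are not stated on the slides; they are assumptions that reproduce the reported values for this model.

```python
import math

log_l = -84.09172    # log likelihood of the fitted probit model
log_l0 = -123.7570   # restricted log likelihood (constant only)
n, k = 210, 4        # observations; estimated parameters including the constant

mcfadden = 1.0 - log_l / log_l0                           # likelihood ratio index
chi_squared = 2.0 * (log_l - log_l0)                      # likelihood ratio statistic
estrella = 1.0 - (log_l / log_l0) ** (-2.0 * log_l0 / n)  # formula from the output
rsqrd_ml = 1.0 - math.exp(-chi_squared / n)               # assumed form
akaike = (-2.0 * log_l + 2.0 * k) / n                     # assumed form (per observation)
schwarz = -2.0 * log_l + k * math.log(n)                  # assumed form

for name, value in [("McFadden", mcfadden), ("Chi squared", chi_squared),
                    ("Estrella", estrella), ("Rsqrd_ML", rsqrd_ml),
                    ("Akaike I.C.", akaike), ("Schwarz I.C.", schwarz)]:
    print(f"{name:12s} {value:.5f}")
```

The Efron, Ben./Lerman, Cramer and Veall/Zim. measures are not recomputed here; Efron, Ben./Lerman and Cramer are built from the individual fitted probabilities, which the slides do not list in full.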

Page 18: Discrete Choice Modeling

Predicting the Outcome

• Predicted probabilities
  - P = F(a + b1Cost + b2Time + cIncome)
• Predicting outcomes
  - Predict y=1 if P is large
  - Use 0.5 for “large” (more likely than not)
• Count successes and failures

Page 19: Discrete Choice Modeling

Individual Predictions from a Logit Model

Observation Observed Y Predicted Y Residual x(i)b Pr[Y=1]

81 .00000 .00000 .0000 -3.3944 .0325

85 .00000 .00000 .0000 -2.1901 .1006

89 1.0000 .00000 1.0000 -2.6766 .0644

93 1.0000 1.0000 .0000 .8113 .6924

97 1.0000 1.0000 .0000 2.6845 .9361

101 1.0000 1.0000 .0000 2.4457 .9202

105 1.0000 .00000 1.0000 -3.2204 .0384

109 1.0000 1.0000 .0000 .0311 .5078

113 .00000 .00000 .0000 -2.1704 .1024

117 .00000 .00000 .0000 -3.3729 .0332

445 .00000 1.0000 -1.0000 .0295 .5074

Note two types of errors and two types of successes.
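The columns of this table are linked by the logit form: Pr[Y=1] is the logistic function of x(i)b, the predicted Y is 1 when that probability exceeds 0.5, and the residual is observed minus predicted. The Python sketch below recomputes those columns from the listed x(i)b values; the names `ROWS` and `logistic` are illustrative.

```python
import math

# (observation, observed y, x(i)b) for the rows listed on the slide.
ROWS = [
    (81, 0, -3.3944), (85, 0, -2.1901), (89, 1, -2.6766), (93, 1, 0.8113),
    (97, 1, 2.6845), (101, 1, 2.4457), (105, 1, -3.2204), (109, 1, 0.0311),
    (113, 0, -2.1704), (117, 0, -3.3729), (445, 0, 0.0295),
]

def logistic(z):
    """Logistic CDF, the F() of the logit model."""
    return 1.0 / (1.0 + math.exp(-z))

results = []
for obs, y, xb in ROWS:
    prob = logistic(xb)
    y_hat = 1 if prob > 0.5 else 0   # threshold of 0.5, as on the slides
    results.append((obs, y, y_hat, y - y_hat, prob))
    print(f"{obs:4d}  y={y}  predicted={y_hat}  residual={y - y_hat:+d}  Pr[Y=1]={prob:.4f}")
```

A residual of +1 marks an actual 1 predicted as 0, a residual of -1 marks an actual 0 predicted as 1, and the two kinds of zero residual are the two types of success.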

Page 20: Discrete Choice Modeling

Predictions in Binary Choice

Predict y = 1 if P > P*

Success depends on the assumed P*

Page 21: Discrete Choice Modeling

ROC Curve

• Plot % of actual 1s correctly predicted (sensitivity) vs. % of actual 0s incorrectly predicted as 1 (false positives), as P* varies
• A 45° line is no fit. Curvature implies fit.
• Area under the curve compares models
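A small Python sketch of how such a curve is built: sweep the threshold P* over the fitted probabilities, record the share of actual 1s and of actual 0s predicted as 1 at each threshold, and integrate with the trapezoid rule. The data are the eleven observations listed on the Individual Predictions slide, used purely as an illustration; they are not the full sample, so the area computed here says nothing about the fit of the estimated model. Function names are illustrative.

```python
# Observed outcomes and fitted Pr[Y=1] for the eleven listed observations.
Y = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]
P = [.0325, .1006, .0644, .6924, .9361, .9202, .0384, .5078, .1024, .0332, .5074]

def roc_points(y, p):
    """(false positive rate, true positive rate) at each distinct threshold."""
    n1 = sum(y)
    n0 = len(y) - n1
    points = [(0.0, 0.0)]
    for threshold in sorted(set(p), reverse=True):
        tp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 1)
        fp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 0)
        points.append((fp / n0, tp / n1))
    return points

def area_under_curve(points):
    """Trapezoid rule over consecutive ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

points = roc_points(Y, P)
auc = area_under_curve(points)
print(f"area under the ROC curve: {auc:.3f}")
```

An area of 0.5 corresponds to the 45° line; values closer to 1 indicate that the fitted probabilities rank the actual 1s above the actual 0s more often.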

Page 22: Discrete Choice Modeling
Page 23: Discrete Choice Modeling

Aggregate Predictions

Frequencies of actual & predicted outcomes
Predicted outcome has maximum probability.
Threshold value for predicting Y=1 = .5000

            Predicted
------  ----------  +  -----
Actual     0     1  |  Total
------  ----------  +  -----
  0      151     1  |    152
  1       20    38  |     58
------  ----------  +  -----
Total    171    39  |    210

Page 24: Discrete Choice Modeling

Analyzing Predictions

Frequencies of actual & predicted outcomes
Predicted outcome has maximum probability.
Threshold value for predicting Y=1 is P* = .5000.
(This table can be computed with any P*.)

                 Predicted
------  --------------------  +  -----
Actual  0          1          |  Total
------  --------------------  +  -----
0       N(a0,p0)   N(a0,p1)   |  N(a0)
1       N(a1,p0)   N(a1,p1)   |  N(a1)
------  --------------------  +  -----
Total   N(p0)      N(p1)      |  N

Page 25: Discrete Choice Modeling

Analyzing Predictions - Success

Sensitivity = % actual 1s correctly predicted = 100N(a1,p1)/N(a1) % [100(38/58)=65.5%]

Specificity = % actual 0s correctly predicted = 100N(a0,p0)/N(a0) % [100(151/152)=99.3%]

Positive predictive value = % predicted 1s that were actual 1s = 100N(a1,p1)/N(p1) % [100(38/39)=97.4%]

Negative predictive value = % predicted 0s that were actual 0s = 100N(a0,p0)/N(p0) % [100(151/171)=88.3%]

Correct prediction = %actual 1s and 0s correctly predicted = 100[N(a1,p1)+N(a0,p0)]/N [100(151+38)/210=90.0%]

Page 26: Discrete Choice Modeling

Analyzing Predictions - Failures

False positive for true negative = %actual 0s predicted as 1s = 100N(a0,p1)/N(a0) % [100(1/152)=0.658%]

False negative for true positive = %actual 1s predicted as 0s = 100N(a1,p0)/N(a1) % [100(20/58)=34.5%]

False positive for predicted positive = % predicted 1s that were actual 0s = 100N(a0,p1)/N(p1) % [100(1/39)=2.56%]

False negative for predicted negative = % predicted 0s that were actual 1s = 100N(a1,p0)/N(p0) % [100(20/171)=11.7%]

False predictions = %actual 1s and 0s incorrectly predicted = 100[N(a0,p1)+N(a1,p0)]/N [100(1+20)/210=10.0%]
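The success and failure measures on these two slides are all ratios of cells in the 2x2 table of actual by predicted outcomes. The Python sketch below collects them in one function and applies it to the counts from the Aggregate Predictions table (151, 1, 20, 38); the function name and dictionary keys are illustrative.

```python
def prediction_measures(n_a0p0, n_a0p1, n_a1p0, n_a1p1):
    """Percentages from a 2x2 table of actual (a) by predicted (p) counts."""
    n_a0, n_a1 = n_a0p0 + n_a0p1, n_a1p0 + n_a1p1   # actual 0s and 1s
    n_p0, n_p1 = n_a0p0 + n_a1p0, n_a0p1 + n_a1p1   # predicted 0s and 1s
    n = n_a0 + n_a1
    return {
        # successes
        "sensitivity": 100.0 * n_a1p1 / n_a1,
        "specificity": 100.0 * n_a0p0 / n_a0,
        "positive predictive value": 100.0 * n_a1p1 / n_p1,
        "negative predictive value": 100.0 * n_a0p0 / n_p0,
        "correct prediction": 100.0 * (n_a1p1 + n_a0p0) / n,
        # failures
        "false positive for true negative": 100.0 * n_a0p1 / n_a0,
        "false negative for true positive": 100.0 * n_a1p0 / n_a1,
        "false positive for predicted positive": 100.0 * n_a0p1 / n_p1,
        "false negative for predicted negative": 100.0 * n_a1p0 / n_p0,
        "false predictions": 100.0 * (n_a0p1 + n_a1p0) / n,
    }

measures = prediction_measures(n_a0p0=151, n_a0p1=1, n_a1p0=20, n_a1p1=38)
for name, value in measures.items():
    print(f"{name:40s} {value:7.3f}%")
```

Each failure measure is the complement of a success measure with the same denominator, so each such pair sums to 100%.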

Page 27: Discrete Choice Modeling

Aggregate Prediction is a Useful Way to Assess the Importance of a Variable

Frequencies of actual & predicted outcomes. Predicted outcome has maximum probability. Threshold value for predicting Y=1 = .5000

Model fit without TTME

            Predicted
------  ----------  +  -----
Actual     0     1  |  Total
------  ----------  +  -----
  0      145     7  |    152
  1       48    10  |     58
------  ----------  +  -----
Total    193    17  |    210

Model fit with TTME

            Predicted
------  ----------  +  -----
Actual     0     1  |  Total
------  ----------  +  -----
  0      151     1  |    152
  1       20    38  |     58
------  ----------  +  -----
Total    171    39  |    210
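To put numbers on the comparison, the Python sketch below computes the share of correct predictions and the sensitivity for both tables. It assumes that the first table (145, 7, 48, 10) is the fit without TTME and the second (151, 1, 20, 38) is the fit with TTME, which matches the order of the two labels; the variable names are illustrative.

```python
# Cells are (actual 0 predicted 0, actual 0 predicted 1,
#            actual 1 predicted 0, actual 1 predicted 1).
TABLES = {
    "without TTME": (145, 7, 48, 10),
    "with TTME": (151, 1, 20, 38),
}

summary = {}
for label, (a0p0, a0p1, a1p0, a1p1) in TABLES.items():
    n = a0p0 + a0p1 + a1p0 + a1p1
    correct = 100.0 * (a0p0 + a1p1) / n          # all correct predictions
    sensitivity = 100.0 * a1p1 / (a1p0 + a1p1)   # actual 1s predicted as 1
    summary[label] = (correct, sensitivity)
    print(f"{label:13s} correct = {correct:5.1f}%   sensitivity = {sensitivity:5.1f}%")
```

Under that assumption, adding terminal time mainly improves the prediction of the actual 1s, the travelers who chose to fly.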