data science for business - kaistkseworkshop.kaist.ac.kr/2014/material/2014kse-4.pdf · 2014. 2....

Data Science for Business: Decision Analytic Thinking. Mun Y. Yi (이문용), Department of Knowledge Service Engineering, KAIST


Page 1: Data Science for Business - KAISTkseworkshop.kaist.ac.kr/2014/material/2014KSE-4.pdf · 2014. 2. 28. · Data Science for Business. Data Science for Business. 의사결정분석적사고

Data Science for Business

Decision Analytic Thinking

Department of Knowledge Service Engineering, KAIST

Mun Y. Yi (이문용)


Agenda

• Decision Analytic Thinking

• Part I (Chapter 7): What Is a Good Model?

  • Evaluating Classifiers

  • Confusion Matrix

  • Expected Value

  • Expected Value Framework

• Part II (Chapter 11): Toward Analytical Engineering

  • Targeting the Best Prospect

  • Assessing the Influence of the Incentive


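About the Instructor

2013-present: Professor (tenured) and Department Head, Department of Knowledge Service Engineering, KAIST

Associate Editor, IJHCS

The First Class Leader (JoongAng Ilbo, 2013)

Korea New Intellectual Award (Weekly People magazine, 2013)

Culture Minister's Award (2011), Education Minister's Award (1988)

Best Paper Award (MWAIS, 2008)

Senior Editor, AIS Transactions on HCI

2009-2013: Associate Professor, Department of Knowledge Service Engineering, KAIST

1998-2009: Assistant Professor, then Associate Professor (tenured), University of South Carolina

1981-1988: Staff member, Korea Electric Power Corporation (nuclear power plant)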

University of Maryland, Ph.D., 1998

Major: Information Systems

Minor: Computer Sciences (Software Engineering)


Part I: What Is a Good Model?


Where are we?

• The Data Mining Process Revisited


Evaluating Classifiers

• Classification accuracy

  • Accuracy = 1 – error rate

• Accuracy is a common evaluation metric that is often used in data mining studies

  • Simple and easy to measure, but with known problems
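
The relationship between accuracy and error rate can be sketched in a few lines of Python. The function name `accuracy` and the toy labels are made up for illustration and are not taken from the slides.

```python
def accuracy(actual, predicted):
    """Fraction of instances whose predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one instance")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Toy labels (hypothetical): 1 = positive, 0 = negative.
actual = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]

acc = accuracy(actual, predicted)
error_rate = 1 - acc  # accuracy = 1 - error rate, rearranged
print(f"accuracy = {acc:.2f}, error rate = {error_rate:.2f}")
```

Note that the single number hides which kind of mistake was made: one of the two errors here is a missed positive and the other is a false alarm.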


The Confusion Matrix

• True classes

  • Positive (p), Negative (n)

• Predicted classes

  • Yes (Y), No (N)

• Confusion matrix

  • An n × n matrix (n here being the number of classes) with the true classes and the predicted classes cross-tabulated
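
A minimal sketch of how such a matrix could be tallied for the two-class case, using the p/n and Y/N labels from this slide. The helper `confusion_matrix` and the toy data are illustrative assumptions, and placing predicted classes in rows is only one possible layout.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (predicted, actual) pairs for a two-class problem.

    actual uses 'p'/'n' (true classes); predicted uses 'Y'/'N'.
    """
    counts = Counter(zip(predicted, actual))
    return {(h, a): counts.get((h, a), 0) for h in ("Y", "N") for a in ("p", "n")}


# Toy data (hypothetical).
actual = ["p", "p", "n", "n", "p", "n", "n", "p"]
predicted = ["Y", "N", "N", "Y", "Y", "N", "N", "Y"]

cm = confusion_matrix(actual, predicted)

# Rows are predicted classes, columns are true classes.
print("      p   n")
for h in ("Y", "N"):
    print(f"{h}   {cm[(h, 'p')]:3d} {cm[(h, 'n')]:3d}")
```

The diagonal cells (Y, p) and (N, n) hold the correct decisions; (Y, n) holds the false positives and (N, p) the false negatives.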


Problems with Unbalanced Classes

• Two different cases


Problems with Unequal Costs and Benefits

• Simple classification accuracy does not distinguish false positive from false negative errors

• In some application areas, they are not equally important.

  • Cancer detection

  • Spaceship design

  vs.

  • Spam filter


Generalizing beyond Classification

• Aligning modeling results with the business goal

• Expected Value

  • Weighted average of the values of the different possible outcomes, where the weight given to each value is its probability of occurrence.
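
Written out, the definition above takes the following general form. The symbols $o_i$ (a possible outcome), $p(o_i)$ (its probability) and $v(o_i)$ (its value) are notation chosen here for illustration.

```latex
EV \;=\; \sum_{i} p(o_i)\, v(o_i)
   \;=\; p(o_1)\, v(o_1) + p(o_2)\, v(o_2) + \cdots,
\qquad \text{with } \sum_{i} p(o_i) = 1 .
```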


Using Expected Value to Frame Classifier Use

• For target marketing, we would like to assign each consumer a class of likely responder vs. not likely responder, so that we can target the likely responders

• Expected value provides a framework for carrying out the analysis.

• The expected benefit of targeting consumer x:

• Example: A consumer buys the product for $200 and our product-related costs are $100. Mailing the marketing materials costs $1 overall. Which customers should we target?
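
The equation for the expected benefit did not survive in this transcript. A sketch under one plausible reading of the example: a responder is worth $200 - $100 - $1 = $99, a non-responder costs the $1 mailing, and we target whenever the expected benefit is positive. The function name and the response probabilities in the loop are hypothetical.

```python
def expected_benefit(p_response, value_response, value_no_response):
    """Expected value of targeting: p * v_R + (1 - p) * v_NR."""
    return p_response * value_response + (1 - p_response) * value_no_response


PRICE = 200.0         # from the slide
PRODUCT_COST = 100.0  # from the slide
MAILING_COST = 1.0    # from the slide

# Assumed reading of the example, not confirmed by the slide text:
v_r = PRICE - PRODUCT_COST - MAILING_COST  # responder: profit net of mailing
v_nr = -MAILING_COST                       # non-responder: mailing cost is lost

# Break-even probability: p * v_r + (1 - p) * v_nr = 0  =>  p = -v_nr / (v_r - v_nr)
break_even = -v_nr / (v_r - v_nr)
print(f"target consumers whose response probability exceeds {break_even:.2%}")

for p in (0.005, 0.02, 0.10):  # hypothetical response probabilities
    eb = expected_benefit(p, v_r, v_nr)
    print(f"p = {p:.3f}: expected benefit = {eb:+.2f}, target = {eb > 0}")
```

Under these assumptions the break-even point is a response probability of 1%, so even consumers who are quite unlikely to respond can be worth mailing.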


Using Expected Value to Frame Classifier Evaluation


Using Expected Value to Frame Classifier Evaluation: An Example

• A sample confusion matrix with counts

• From the confusion matrix, we can compute rates or estimated probabilities, p(h,a), where h is the predicted (hypothesized) class and a is the actual class.
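
The slide's matrix of counts is not captured in this transcript, so the sketch below uses made-up counts to show the conversion: each cell count is divided by the total number of instances to estimate p(h,a).

```python
# Hypothetical counts keyed by (predicted class h, actual class a).
counts = {
    ("Y", "p"): 56,  # true positives
    ("Y", "n"): 7,   # false positives
    ("N", "p"): 5,   # false negatives
    ("N", "n"): 42,  # true negatives
}

total = sum(counts.values())

# Estimated joint probability p(h, a) = count(h, a) / total.
rates = {cell: count / total for cell, count in counts.items()}

for (h, a), rate in rates.items():
    print(f"p({h},{a}) = {counts[(h, a)]}/{total} = {rate:.2f}")
```

Because every instance falls in exactly one cell, the four estimated probabilities sum to 1.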


Using Expected Value to Frame Classifier Evaluation: An Example

• A cost-benefit matrix

  • From the target marketing example


Using Expected Value to Frame Classifier Evaluation: An Example

• Expected profit
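
A sketch of the calculation, assuming expected profit is the sum over all four cells of p(h,a) times the cost or benefit b(h,a) of that cell. The counts are hypothetical. The benefits of 99 and -1 follow one reading of the target marketing example ($200 price, $100 product cost, $1 mailing), and the zeros for consumers who are not targeted are an assumption.

```python
def expected_profit(rates, benefits):
    """Sum of p(h, a) * b(h, a) over all (predicted, actual) cells."""
    return sum(rates[cell] * benefits[cell] for cell in rates)


# Hypothetical counts keyed by (predicted class h, actual class a).
counts = {("Y", "p"): 56, ("Y", "n"): 7, ("N", "p"): 5, ("N", "n"): 42}
total = sum(counts.values())
rates = {cell: count / total for cell, count in counts.items()}

# Cost-benefit matrix b(h, a) in dollars (assumed values, see lead-in).
benefits = {
    ("Y", "p"): 99.0,  # targeted and responds: 200 - 100 - 1
    ("Y", "n"): -1.0,  # targeted but does not respond: mailing cost lost
    ("N", "p"): 0.0,   # not targeted: nothing spent, nothing earned
    ("N", "n"): 0.0,
}

profit = expected_profit(rates, benefits)
print(f"expected profit per consumer = ${profit:.2f}")
```

Two classifiers with the same accuracy can produce different expected profits here, because the matrix weights each kind of error by what it actually costs.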


Using Expected Value to Frame Classifier Evaluation: An Example


Baseline Performance

• It is important to consider carefully what would be a reasonable baseline against which to compare model performance.

  • Shows performance improvement

  • Demonstrates that the modeling process has added value

• General guidelines for good baselines

  • Majority classifier

  • Multiple simple averages

  • Simple conditional model

  • Single-feature predictive model
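
As an illustration of the first guideline, a majority classifier can be sketched as below. The helper name `majority_baseline` and the 95/5 class split are hypothetical.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return a classifier that always predicts the most common training label."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    return lambda instance: majority_label


# Hypothetical, heavily unbalanced labels: 95% negative, 5% positive.
train_labels = ["n"] * 95 + ["p"] * 5
test_labels = ["n"] * 19 + ["p"] * 1

predict = majority_baseline(train_labels)

# The baseline ignores its input, so any placeholder instance will do.
baseline_accuracy = sum(predict(None) == y for y in test_labels) / len(test_labels)
print(f"majority-class baseline accuracy = {baseline_accuracy:.2f}")
```

A model that reports 95% accuracy on this data has added nothing over the baseline, which is why the comparison matters.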


Part II: Toward Analytical Engineering


Targeting the Best Prospects for a Charity Mailing

• Goal: Maximize the donation profit (net income)

• To solve this problem, we can use the expected value framework

• Expected benefit of targeting

• When the value varies from consumer to consumer:
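
The equations for the last two bullets are missing from this transcript. Applying the expected value definition directly, they would plausibly take the form below. The symbols $p(R \mid x)$ (probability that consumer $x$ responds), $v_R$ (value of a response) and $v_{NR}$ (value of no response) are assumed notation.

```latex
% Same value for every consumer:
\text{Expected benefit of targeting}
  = p(R \mid x)\, v_R + \bigl[1 - p(R \mid x)\bigr]\, v_{NR}

% Value varies from consumer to consumer:
\text{Expected benefit of targeting}
  = p(R \mid x)\, v_R(x) + \bigl[1 - p(R \mid x)\bigr]\, v_{NR}(x)
```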


Targeting the Best Prospects for a Charity Mailing

• Assume that the benefit from no-response is zero:

• As we want the benefit to be greater than zero:
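
The slide's equations are not in the transcript, so the following is a sketch under stated assumptions: a non-responder donates nothing, a responder donates an estimated amount d_R(x), and every mailing costs c. The expected net benefit is then p(R|x) · d_R(x) - c, which is positive exactly when p(R|x) · d_R(x) > c. The `Prospect` class, the prospect data and the $1 cost are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    name: str
    p_response: float         # estimated p(R|x), e.g. from a classifier
    expected_donation: float  # estimated d_R(x) if the prospect responds


def expected_net_benefit(prospect, mailing_cost):
    """p(R|x) * d_R(x) - c: non-responders give 0, the cost is always paid."""
    return prospect.p_response * prospect.expected_donation - mailing_cost


MAILING_COST = 1.0  # hypothetical cost c per mailing

prospects = [
    Prospect("A", p_response=0.20, expected_donation=10.0),
    Prospect("B", p_response=0.01, expected_donation=50.0),
    Prospect("C", p_response=0.05, expected_donation=100.0),
]

# Target only prospects whose expected donation exceeds the mailing cost.
targets = [p.name for p in prospects if expected_net_benefit(p, MAILING_COST) > 0]
print("mail to:", targets)
```

Prospect C is worth mailing despite a low response probability because the estimated donation is large, which is why a response model and a donation-size model are both needed.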


Example Revisited With More Sophistication

• Assessing the influence of the incentive: Expected benefit of targeting vs. not targeting
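
One way to sketch this comparison, with the caveat that the slide's own equations are not in the transcript: estimate the purchase probability with and without the incentive, compute the expected benefit in each case, and target only when the difference is positive. The function name, the probabilities and the choice of $100 profit per purchase ($200 price minus $100 product cost) with a $1 targeting cost are assumptions.

```python
def value_of_targeting(p_buy_if_targeted, p_buy_if_not_targeted,
                       profit_per_purchase, targeting_cost):
    """Expected benefit of targeting minus expected benefit of not targeting."""
    eb_targeted = p_buy_if_targeted * profit_per_purchase - targeting_cost
    eb_not_targeted = p_buy_if_not_targeted * profit_per_purchase
    return eb_targeted - eb_not_targeted


PROFIT = 100.0  # assumed: $200 price - $100 product cost
COST = 1.0      # assumed: $1 per mailing

# Hypothetical consumers.
loyal = value_of_targeting(0.90, 0.90, PROFIT, COST)        # buys anyway
persuadable = value_of_targeting(0.10, 0.02, PROFIT, COST)  # incentive matters

print(f"loyal customer:       {loyal:+.2f}")
print(f"persuadable customer: {persuadable:+.2f}")
```

The likely buyer is not the best prospect here: the incentive is wasted on someone who would buy regardless, while the less likely buyer is worth targeting because the offer changes the outcome.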


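Department of Knowledge Service Engineering, KAIST (Korea Advanced Institute of Science and Technology)

Mun Y. Yi (이문용)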

[email protected] (Office)

010-2781-1613 (Mobile)

Thank you