商業智慧實務 (practices of business...

159
商業智慧實務 Practices of Business Intelligence 1 1032BI05 MI4 Wed, 9,10 (16:10-18:00) (B130) 商業智慧的資料探勘 (Data Mining for Business Intelligence) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University 淡江大學 資訊管理學系 http://mail. tku.edu.tw/myday/ 2015-03-25 Tamkang University

Upload: others

Post on 09-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

商業智慧實務 Practices of Business Intelligence

1

1032BI05 MI4

Wed, 9,10 (16:10-18:00) (B130)

商業智慧的資料探勘 (Data Mining for Business Intelligence)

Min-Yuh Day

戴敏育 Assistant Professor

專任助理教授 Dept. of Information Management, Tamkang University

淡江大學 資訊管理學系

http://mail. tku.edu.tw/myday/

2015-03-25

Tamkang University

Page 2: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

週次 (Week) 日期 (Date) 內容 (Subject/Topics)

1 2015/02/25 商業智慧導論 (Introduction to Business Intelligence)

2 2015/03/04 管理決策支援系統與商業智慧 (Management Decision Support System and Business Intelligence)

3 2015/03/11 企業績效管理 (Business Performance Management)

4 2015/03/18 資料倉儲 (Data Warehousing)

5 2015/03/25 商業智慧的資料探勘 (Data Mining for Business Intelligence)

6 2015/04/01 教學行政觀摩日 (Off-campus study)

7 2015/04/08 商業智慧的資料探勘 (Data Mining for Business Intelligence)

8 2015/04/15 資料科學與巨量資料分析 (Data Science and Big Data Analytics)

課程大綱 (Syllabus)

2

Page 3: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

週次 日期 內容(Subject/Topics)

9 2015/04/22 期中報告 (Midterm Project Presentation)

10 2015/04/29 期中考試週 (Midterm Exam)

11 2015/05/06 文字探勘與網路探勘 (Text and Web Mining)

12 2015/05/13 意見探勘與情感分析 (Opinion Mining and Sentiment Analysis)

13 2015/05/20 社會網路分析 (Social Network Analysis)

14 2015/05/27 期末報告 (Final Project Presentation)

15 2015/06/03 畢業考試週 (Final Exam)

課程大綱 (Syllabus)

3

Page 4: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Business Intelligence Data Mining, Data Warehouses

Increasing potential

to support

business decisions End User

Business

Analyst

Data

Analyst

DBA

Decision Making

Data Presentation

Visualization Techniques

Data Mining Information Discovery

Data Exploration

Statistical Summary, Querying, and Reporting

Data Preprocessing/Integration, Data Warehouses

Data Sources

Paper, Files, Web documents, Scientific experiments, Database Systems

4 Source: Han & Kamber (2006)

Page 5: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Data Mining at the Intersection of Many Disciplines

Sta

tistic

s

Management Science &

Information Systems

Artificia

l Inte

lligence

Databases

Pattern

Recognition

Machine

Learning

Mathematical

Modeling

DATA

MINING

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 5

Page 6: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

A Taxonomy for Data Mining Tasks Data Mining

Prediction

Classification

Regression

Clustering

Association

Link analysis

Sequence analysis

Learning Method Popular Algorithms

Supervised

Supervised

Supervised

Unsupervised

Unsupervised

Unsupervised

Unsupervised

Decision trees, ANN/MLP, SVM, Rough

sets, Genetic Algorithms

Linear/Nonlinear Regression, Regression

trees, ANN/MLP, SVM

Expectation Maximization, Apriory

Algorithm, Graph-based Matching

Apriory Algorithm, FP-Growth technique

K-means, ANN/SOM

Outlier analysis Unsupervised K-means, Expectation Maximization (EM)

Apriory, OneR, ZeroR, Eclat

Classification and Regression Trees,

ANN, SVM, Genetic Algorithms

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 6

Page 7: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Data Mining Software

• Commercial – SPSS - PASW (formerly

Clementine)

– SAS - Enterprise Miner

– IBM - Intelligent Miner

– StatSoft – Statistical Data Miner

– … many more

• Free and/or Open Source – Weka

– RapidMiner…

0 20 40 60 80 100 120

Thinkanalytics

Miner3D

Clario Analytics

Viscovery

Megaputer

Insightful Miner/S-Plus (now TIBCO)

Bayesia

C4.5, C5.0, See5

Angoss

Orange

Salford CART, Mars, other

Statsoft Statistica

Oracle DM

Zementis

Other free tools

Microsoft SQL Server

KNIME

Other commercial tools

MATLAB

KXEN

Weka (now Pentaho)

Your own code

R

Microsoft Excel

SAS / SAS Enterprise Miner

RapidMiner

SPSS PASW Modeler (formerly Clementine)

Total (w/ others) Alone

Source: KDNuggets.com, May 2009

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems

Page 8: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Data Mining Process

• A manifestation of best practices

• A systematic way to conduct DM projects

• Different groups has different versions

• Most common standard processes:

– CRISP-DM (Cross-Industry Standard Process for Data Mining)

– SEMMA (Sample, Explore, Modify, Model, and Assess)

– KDD (Knowledge Discovery in Databases)

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 8

Page 9: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Data Mining Process: CRISP-DM

Data Sources

Business

Understanding

Data

Preparation

Model

Building

Testing and

Evaluation

Deployment

Data

Understanding

6

1 2

3

5

4

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 9

Page 10: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Data Mining Process: CRISP-DM

Step 1: Business Understanding

Step 2: Data Understanding

Step 3: Data Preparation (!)

Step 4: Model Building

Step 5: Testing and Evaluation

Step 6: Deployment

• The process is highly repetitive and experimental (DM: art versus science?)

Accounts for

~85% of total

project time

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 10

Page 11: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Data Preparation – A Critical DM Task

Data Consolidation

Data Cleaning

Data Transformation

Data Reduction

Well-formed

Data

Real-world

Data

· Collect data

· Select data

· Integrate data

· Impute missing values

· Reduce noise in data

· Eliminate inconsistencies

· Normalize data

· Discretize/aggregate data

· Construct new attributes

· Reduce number of variables

· Reduce number of cases

· Balance skewed data

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 11

Page 12: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Data Mining Process: SEMMA

Sample

(Generate a representative

sample of the data)

Modify(Select variables, transform

variable representations)

Explore(Visualization and basic

description of the data)

Model(Use variety of statistical and

machine learning models )

Assess(Evaluate the accuracy and

usefulness of the models)

SEMMA

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 12

Page 13: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Evaluation

(Accuracy of Classification Model)

13

Page 14: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Estimation Methodologies for Classification

• Simple split (or holdout or test sample estimation)

– Split the data into 2 mutually exclusive sets training (~70%) and testing (30%)

– For ANN, the data is split into three sub-sets (training [~60%], validation [~20%], testing [~20%])

Preprocessed

Data

Training Data

Testing Data

Model

Development

Model

Assessment

(scoring)

2/3

1/3

Classifier

Prediction

Accuracy

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 14

Page 15: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Estimation Methodologies for Classification

• k-Fold Cross Validation (rotation estimation)

– Split the data into k mutually exclusive subsets

– Use each subset as testing while using the rest of the subsets as training

– Repeat the experimentation for k times

– Aggregate the test results for true estimation of prediction accuracy training

• Other estimation methodologies

– Leave-one-out, bootstrapping, jackknifing

– Area under the ROC curve

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 15

Page 16: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Assessment Methods for Classification

• Predictive accuracy

– Hit rate

• Speed

– Model building; predicting

• Robustness

• Scalability

• Interpretability

– Transparency, explainability

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 16

Page 17: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Accuracy

Precision

17

Validity

Reliability

Page 18: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

18

Page 19: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Accuracy vs. Precision

19

High Accuracy

High Precision

High Accuracy

Low Precision

Low Accuracy

High Precision

Low Accuracy

Low Precision

A B

C D

Page 20: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Accuracy vs. Precision

20

High Accuracy

High Precision

High Accuracy

Low Precision

Low Accuracy

High Precision

Low Accuracy

Low Precision

A B

C D

High Validity

High Reliability

High Validity

Low Reliability Low Validity

Low Reliability

Low Validity

High Reliability

Page 21: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Accuracy vs. Precision

21

High Accuracy

High Precision

High Accuracy

Low Precision

Low Accuracy

High Precision

Low Accuracy

Low Precision

A B

C D

High Validity

High Reliability

High Validity

Low Reliability Low Validity

Low Reliability

Low Validity

High Reliability

Page 22: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Accuracy of Classification Models • In classification problems, the primary source for

accuracy estimation is the confusion matrix

True

Positive

Count (TP)

False

Positive

Count (FP)

True

Negative

Count (TN)

False

Negative

Count (FN)

True Class

Positive Negative

Po

sitiv

eN

eg

ative

Pre

dic

ted

Cla

ss FNTP

TPRatePositiveTrue

FPTN

TNRateNegativeTrue

FNFPTNTP

TNTPAccuracy

FPTP

TPrecision

P

FNTP

TPcallRe

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 22

Page 23: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Estimation Methodologies for Classification – ROC Curve

10.90.80.70.60.50.40.30.20.10

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1

0.9

0.8

False Positive Rate (1 - Specificity)

Tru

e P

osi

tive

Ra

te (

Se

nsi

tivity

) A

B

C

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 23

Page 24: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Sensitivity

Specificity

24

=True Positive Rate

=True Negative Rate

Page 25: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

True Positive

(TP)

False Negative

(FN)

False Positive

(FP)

True Negative

(TN)

True Class

(actual value)

Pre

dic

tive C

lass

(pre

dic

tio

n o

utc

om

e)

Positiv

e

Nega

tive

Positive Negative

total P

total

N

N’

P’

25

FNTP

TPRatePositiveTrue

FPTN

TNRateNegativeTrue

FNFPTNTP

TNTPAccuracy

FPTP

TPrecision

P

FNTP

TPcallRe

FNTP

TPRatePositiveTrue

ty)(Sensitivi

10.90.80.70.60.50.40.30.20.10

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1

0.9

0.8

False Positive Rate (1 - Specificity)

Tru

e P

osi

tive

Ra

te (

Se

nsi

tivity

) A

B

C

TNFP

FPRatePositiveF

alse

FPTN

TNRateNegativeTrue

ty)(Specifici

TNFP

FPRatePositiveF

y)Specificit-(1 alse

Source: http://en.wikipedia.org/wiki/Receiver_operating_characteristic

Page 26: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

True Positive

(TP)

False Negative

(FN)

False Positive

(FP)

True Negative

(TN)

True Class

(actual value)

Pre

dic

tive C

lass

(pre

dic

tio

n o

utc

om

e)

Positiv

e

Nega

tive

Positive Negative

total P

total

N

N’

P’

26

FNTP

TPRatePositiveTrue

FNTP

TPcallRe

FNTP

TPRatePositiveTrue

ty)(Sensitivi

10.90.80.70.60.50.40.30.20.10

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1

0.9

0.8

False Positive Rate (1 - Specificity)

Tru

e P

osi

tive

Ra

te (

Se

nsi

tivity

) A

B

C

Sensitivity

= True Positive Rate

= Recall

= Hit rate

= TP / (TP + FN) Source: http://en.wikipedia.org/wiki/Receiver_operating_characteristic

Page 27: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

True Positive

(TP)

False Negative

(FN)

False Positive

(FP)

True Negative

(TN)

True Class

(actual value)

Pre

dic

tive C

lass

(pre

dic

tio

n o

utc

om

e)

Positiv

e

Nega

tive

Positive Negative

total P

total

N

N’

P’

27

FPTN

TNRateNegativeTrue

10.90.80.70.60.50.40.30.20.10

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1

0.9

0.8

False Positive Rate (1 - Specificity)

Tru

e P

osi

tive

Ra

te (

Se

nsi

tivity

) A

B

C

FPTN

TNRateNegativeTrue

ty)(Specifici

TNFP

FPRatePositiveF

y)Specificit-(1 alse

Specificity

= True Negative Rate

= TN / N

= TN / (TN+ FP)

Source: http://en.wikipedia.org/wiki/Receiver_operating_characteristic

Page 28: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

True Positive

(TP)

False Negative

(FN)

False Positive

(FP)

True Negative

(TN)

True Class

(actual value)

Pre

dic

tive C

lass

(pre

dic

tio

n o

utc

om

e)

Positiv

e

Nega

tive

Positive Negative

total P

total

N

N’

P’

28

FPTP

TPrecision

P

FNTP

TPcallRe

F1 score (F-score)(F-measure)

is the harmonic mean of

precision and recall

= 2TP / (P + P’)

= 2TP / (2TP + FP + FN)

Precision

= Positive Predictive Value (PPV)

Recall

= True Positive Rate (TPR)

= Sensitivity

= Hit Rate

recallprecision

recallprecisionF

**2

Source: http://en.wikipedia.org/wiki/Receiver_operating_characteristic

Page 29: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

29 Source: http://en.wikipedia.org/wiki/Receiver_operating_characteristic

A

63 (TP)

37 (FN)

28 (FP)

72 (TN)

100 100

109

91

200

TPR = 0.63

FPR = 0.28

PPV = 0.69 =63/(63+28)

=63/91

F1 = 0.66

= 2*(0.63*0.69)/(0.63+0.69)

= (2 * 63) /(100 + 91)

= (0.63 + 0.69) / 2 =1.32 / 2 =0.66

ACC = 0.68 = (63 + 72) / 200

= 135/200 = 0.675

FPTP

TPrecision

P

FNTP

TPcallRe

F1 score (F-score)

(F-measure)

is the harmonic mean of

precision and recall

= 2TP / (P + P’)

= 2TP / (2TP + FP + FN)

Precision

= Positive Predictive Value (PPV)

Recall

= True Positive Rate (TPR)

= Sensitivity

= Hit Rate

= TP / (TP + FN)

recallprecision

recallprecisionF

**2

FNFPTNTP

TNTPAccuracy

FPTN

TNRateNegativeTrue

ty)(Specifici

TNFP

FPRatePositiveF

y)Specificit-(1 alse

Specificity

= True Negative Rate

= TN / N

= TN / (TN + FP)

Page 30: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

30 Source: http://en.wikipedia.org/wiki/Receiver_operating_characteristic

A

63 (TP)

37 (FN)

28 (FP)

72 (TN)

100 100

109

91

200

TPR = 0.63

FPR = 0.28

PPV = 0.69 =63/(63+28)

=63/91

F1 = 0.66

= 2*(0.63*0.69)/(0.63+0.69)

= (2 * 63) /(100 + 91)

= (0.63 + 0.69) / 2 =1.32 / 2 =0.66

ACC = 0.68 = (63 + 72) / 200

= 135/200 = 0.675

B

77 (TP)

23 (FN)

77 (FP)

23 (TN)

100 100

46

154

200

TPR = 0.77

FPR = 0.77

PPV = 0.50

F1 = 0.61

ACC = 0.50

FNTP

TPcallRe

Recall

= True Positive Rate (TPR)

= Sensitivity

= Hit Rate

Precision

= Positive Predictive Value (PPV) FPTP

TPrecision

P

Page 31: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

31

C’

76 (TP)

24 (FN)

12 (FP)

88 (TN)

100 100

112

88

200

TPR = 0.76

FPR = 0.12

PPV = 0.86

F1 = 0.81

ACC = 0.82

C

24 (TP)

76 (FN)

88 (FP)

12 (TN)

100 100

88

112

200

TPR = 0.24

FPR = 0.88

PPV = 0.21

F1 = 0.22

ACC = 0.18

Source: http://en.wikipedia.org/wiki/Receiver_operating_characteristic

FNTP

TPcallRe

Recall

= True Positive Rate (TPR)

= Sensitivity

= Hit Rate

Precision

= Positive Predictive Value (PPV) FPTP

TPrecision

P

Page 32: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Cluster Analysis

• Used for automatic identification of natural groupings of things

• Part of the machine-learning family

• Employ unsupervised learning

• Learns the clusters of things from past data, then assigns new instances

• There is not an output variable

• Also known as segmentation

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 32

Page 33: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Cluster Analysis

33

Clustering of a set of objects based on the k-means method.

(The mean of each cluster is marked by a “+”.)

Source: Han & Kamber (2006)

Page 34: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Cluster Analysis

• Clustering results may be used to

– Identify natural groupings of customers

– Identify rules for assigning new cases to classes for targeting/diagnostic purposes

– Provide characterization, definition, labeling of populations

– Decrease the size and complexity of problems for other data mining methods

– Identify outliers in a specific domain (e.g., rare-event detection)

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 34

Page 35: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

35

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

Point P P(x,y)

p01 a (3, 4)

p02 b (3, 6)

p03 c (3, 8)

p04 d (4, 5)

p05 e (4, 7)

p06 f (5, 1)

p07 g (5, 5)

p08 h (7, 3)

p09 i (7, 5)

p10 j (8, 5)

Example of Cluster Analysis

Page 36: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Cluster Analysis for Data Mining

• How many clusters?

– There is not a “truly optimal” way to calculate it

– Heuristics are often used 1. Look at the sparseness of clusters

2. Number of clusters = (n/2)1/2 (n: no of data points)

3. Use Akaike information criterion (AIC)

4. Use Bayesian information criterion (BIC)

• Most cluster analysis methods involve the use of a distance measure to calculate the closeness between pairs of items

– Euclidian versus Manhattan (rectilinear) distance

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 36

Page 37: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

k-Means Clustering Algorithm

• k : pre-determined number of clusters

• Algorithm (Step 0: determine value of k)

Step 1: Randomly generate k random points as initial cluster centers

Step 2: Assign each point to the nearest cluster center

Step 3: Re-compute the new cluster centers

Repetition step: Repeat steps 2 and 3 until some convergence criterion is met (usually that the assignment of points to clusters becomes stable)

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 37

Page 38: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Cluster Analysis for Data Mining - k-Means Clustering Algorithm

Step 1 Step 2 Step 3

Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 38

Page 39: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Similarity and Dissimilarity Between Objects

• Distances are normally used to measure the similarity or

dissimilarity between two data objects

• Some popular ones include: Minkowski distance:

where i = (xi1, xi2, …, xip) and j = (xj1, xj2, …, xjp) are two p-

dimensional data objects, and q is a positive integer

• If q = 1, d is Manhattan distance

qq

pp

qq

jx

ix

jx

ix

jx

ixjid )||...|||(|),(

2211

||...||||),(2211 pp j

xi

xj

xi

xj

xi

xjid

39 Source: Han & Kamber (2006)

Page 40: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Similarity and Dissimilarity Between Objects (Cont.)

• If q = 2, d is Euclidean distance:

– Properties

• d(i,j) 0

• d(i,i) = 0

• d(i,j) = d(j,i)

• d(i,j) d(i,k) + d(k,j)

• Also, one can use weighted distance, parametric Pearson

product moment correlation, or other disimilarity measures

)||...|||(|),(22

22

2

11 pp jx

ix

jx

ix

jx

ixjid

40 Source: Han & Kamber (2006)

Page 41: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Euclidean distance vs Manhattan distance

• Distance of two point x1 = (1, 2) and x2 (3, 5)

41

1 2 3

1

2

3

4

5 x2 (3, 5)

2 x1 = (1, 2)

3 3.61

Euclidean distance:

= ((3-1)2 + (5-2)2 )1/2

= (22 + 32)1/2

= (4 + 9)1/2

= (13)1/2

= 3.61

Manhattan distance:

= (3-1) + (5-2)

= 2 + 3

= 5

Page 42: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

The K-Means Clustering Method

• Example

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10 0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

K=2

Arbitrarily choose K

object as initial

cluster center

Assign

each

objects

to most

similar

center

Update

the

cluster

means

Update

the

cluster

means

reassign reassign

42 Source: Han & Kamber (2006)

Page 43: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

43

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

Point P P(x,y)

p01 a (3, 4)

p02 b (3, 6)

p03 c (3, 8)

p04 d (4, 5)

p05 e (4, 7)

p06 f (5, 1)

p07 g (5, 5)

p08 h (7, 3)

p09 i (7, 5)

p10 j (8, 5)

K-Means Clustering Step by Step

Page 44: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

44

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

Point P P(x,y)

p01 a (3, 4)

p02 b (3, 6)

p03 c (3, 8)

p04 d (4, 5)

p05 e (4, 7)

p06 f (5, 1)

p07 g (5, 5)

p08 h (7, 3)

p09 i (7, 5)

p10 j (8, 5)

Initial m1 (3, 4)

Initial m2 (8, 5)

m1 = (3, 4)

M2 = (8, 5)

K-Means Clustering Step 1: K=2, Arbitrarily choose K object as initial cluster center

Page 45: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

45

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

M2 = (8, 5)

Step 2: Compute seed points as the centroids of the clusters of the current partition

Step 3: Assign each objects to most similar center

m1 = (3, 4)

K-Means Clustering

Point P P(x,y) m1

distance

m2

distance Cluster

p01 a (3, 4) 0.00 5.10 Cluster1

p02 b (3, 6) 2.00 5.10 Cluster1

p03 c (3, 8) 4.00 5.83 Cluster1

p04 d (4, 5) 1.41 4.00 Cluster1

p05 e (4, 7) 3.16 4.47 Cluster1

p06 f (5, 1) 3.61 5.00 Cluster1

p07 g (5, 5) 2.24 3.00 Cluster1

p08 h (7, 3) 4.12 2.24 Cluster2

p09 i (7, 5) 4.12 1.00 Cluster2

p10 j (8, 5) 5.10 0.00 Cluster2

Initial m1 (3, 4)

Initial m2 (8, 5)

Page 46: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

46

Point P P(x,y) m1

distance

m2

distance Cluster

p01 a (3, 4) 0.00 5.10 Cluster1

p02 b (3, 6) 2.00 5.10 Cluster1

p03 c (3, 8) 4.00 5.83 Cluster1

p04 d (4, 5) 1.41 4.00 Cluster1

p05 e (4, 7) 3.16 4.47 Cluster1

p06 f (5, 1) 3.61 5.00 Cluster1

p07 g (5, 5) 2.24 3.00 Cluster1

p08 h (7, 3) 4.12 2.24 Cluster2

p09 i (7, 5) 4.12 1.00 Cluster2

p10 j (8, 5) 5.10 0.00 Cluster2

Initial m1 (3, 4)

Initial m2 (8, 5)

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

M2 = (8, 5)

Step 2: Compute seed points as the centroids of the clusters of the current partition

Step 3: Assign each objects to most similar center

m1 = (3, 4)

K-Means Clustering

Euclidean distance

b(3,6) m2(8,5)

= ((8-3)2 + (5-6)2 )1/2

= (52 + (-1)2)1/2

= (25 + 1)1/2

= (26)1/2

= 5.10

Euclidean distance

b(3,6) m1(3,4)

= ((3-3)2 + (4-6)2 )1/2

= (02 + (-2)2)1/2

= (0 + 4)1/2

= (4)1/2

= 2.00

Page 47: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

47

Point P P(x,y) m1

distance

m2

distance Cluster

p01 a (3, 4) 1.43 4.34 Cluster1

p02 b (3, 6) 1.22 4.64 Cluster1

p03 c (3, 8) 2.99 5.68 Cluster1

p04 d (4, 5) 0.20 3.40 Cluster1

p05 e (4, 7) 1.87 4.27 Cluster1

p06 f (5, 1) 4.29 4.06 Cluster2

p07 g (5, 5) 1.15 2.42 Cluster1

p08 h (7, 3) 3.80 1.37 Cluster2

p09 i (7, 5) 3.14 0.75 Cluster2

p10 j (8, 5) 4.14 0.95 Cluster2

m1 (3.86, 5.14)

m2 (7.33, 4.33)

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

m1 = (3.86, 5.14)

M2 = (7.33, 4.33)

Step 4: Update the cluster means,

Repeat Step 2, 3,

stop when no more new assignment

K-Means Clustering

Page 48: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

48

Point P P(x,y) m1

distance

m2

distance Cluster

p01 a (3, 4) 1.95 3.78 Cluster1

p02 b (3, 6) 0.69 4.51 Cluster1

p03 c (3, 8) 2.27 5.86 Cluster1

p04 d (4, 5) 0.89 3.13 Cluster1

p05 e (4, 7) 1.22 4.45 Cluster1

p06 f (5, 1) 5.01 3.05 Cluster2

p07 g (5, 5) 1.57 2.30 Cluster1

p08 h (7, 3) 4.37 0.56 Cluster2

p09 i (7, 5) 3.43 1.52 Cluster2

p10 j (8, 5) 4.41 1.95 Cluster2

m1 (3.67, 5.83)

m2 (6.75, 3.50)

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

M2 = (6.75., 3.50)

m1 = (3.67, 5.83)

Step 4: Update the cluster means,

Repeat Step 2, 3,

stop when no more new assignment

K-Means Clustering

Page 49: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

49

Point P P(x,y) m1

distance

m2

distance Cluster

p01 a (3, 4) 1.95 3.78 Cluster1

p02 b (3, 6) 0.69 4.51 Cluster1

p03 c (3, 8) 2.27 5.86 Cluster1

p04 d (4, 5) 0.89 3.13 Cluster1

p05 e (4, 7) 1.22 4.45 Cluster1

p06 f (5, 1) 5.01 3.05 Cluster2

p07 g (5, 5) 1.57 2.30 Cluster1

p08 h (7, 3) 4.37 0.56 Cluster2

p09 i (7, 5) 3.43 1.52 Cluster2

p10 j (8, 5) 4.41 1.95 Cluster2

m1 (3.67, 5.83)

m2 (6.75, 3.50)

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

stop when no more new assignment

K-Means Clustering

Page 50: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

K-Means Clustering (K=2, two clusters)

50

Point P P(x,y) m1

distance

m2

distance Cluster

p01 a (3, 4) 1.95 3.78 Cluster1

p02 b (3, 6) 0.69 4.51 Cluster1

p03 c (3, 8) 2.27 5.86 Cluster1

p04 d (4, 5) 0.89 3.13 Cluster1

p05 e (4, 7) 1.22 4.45 Cluster1

p06 f (5, 1) 5.01 3.05 Cluster2

p07 g (5, 5) 1.57 2.30 Cluster1

p08 h (7, 3) 4.37 0.56 Cluster2

p09 i (7, 5) 3.43 1.52 Cluster2

p10 j (8, 5) 4.41 1.95 Cluster2

m1 (3.67, 5.83)

m2 (6.75, 3.50)

0

1

2

3

4

5

6

7

8

9

10

0 1 2 3 4 5 6 7 8 9 10

stop when no more new assignment

K-Means Clustering

Page 51: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

個案分析與實作一 (SAS EM 分群分析): Case Study 1 (Cluster Analysis – K-Means using SAS EM)

Banking Segmentation

51

Page 52: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

行銷客戶分群

52

Page 53: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

53

案例情境 • ABC銀行的行銷部門想要針對該銀行客戶的使用行為,進行分群分析,以了解現行客戶對本行的往來方式,並進一步提供適宜的行銷接觸模式。

• 該銀行從有效戶(近三個月有交易者), 取出10萬筆樣本資料。 依下列四種交易管道計算交易次數:

–傳統臨櫃交易(TBM)

–自動櫃員機交易(ATM)

–銀行專員服務(POS)

–電話客服(CSC) Source: SAS Enterprise Miner Course Notes, 2014, SAS

Page 54: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

54

資料欄位說明 • 資料集名稱: profile.sas7bdat

Source: SAS Enterprise Miner Course Notes, 2014, SAS

Page 55: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

55

• 分析目的

依據各往來交易管道TBM、ATM、POS、CSC進行客戶分群分析。

演練重點:

• 極端值資料處理 • 分群變數選擇 • 衍生變數產出 • 分群參數調整與分群結果解釋

行銷客戶分群實機演練

Source: SAS Enterprise Miner Course Notes, 2014, SAS

Page 56: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Enterprise Miner (SAS EM) Case Study

• SAS EM 資料匯入4步驟

– Step 1. 新增專案 (New Project)

– Step 2. 新增資料館 (New / Library)

– Step 3. 建立資料來源 (Create Data Source)

– Step 4. 建立流程圖 (Create Diagram)

• SAS EM SEMMA 建模流程

56

Page 57: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

57

Download EM_Data.zip (SAS EM Datasets)

http://mail.tku.edu.tw/myday/teaching/1022/DM/Data/EM_Data.zip

Page 58: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Upzip EM_Data.zip to C:\DATA\EM_Data

58

Page 59: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Upzip EM_Data.zip to C:\DATA\EM_Data

59

Page 60: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

VMware Horizon View Client softcloud.tku.edu.tw SAS Enterprise Miner

60

Page 61: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Locale Setup Manager English UI

61

Page 62: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Enterprise Guide 5.1 (SAS EG)

62

Page 63: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Enterprise Miner 12.1 (SAS EM)

63

Page 64: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Locale Setup Manager 3.1

64

Page 65: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

65 Source: http://www.sasresource.com/faq11.html

Page 66: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

66

C:\Program Files\SASHome\SASEnterpriseMinerWorkstationConfiguration\12.1\em.exe

Page 67: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

67

Page 68: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

68

Page 69: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

69

Page 70: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Enterprise Miner 12.1 (SAS EM)

70

Page 71: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Enterprise Miner (SAS EM)

71

SAS Enterprise Miner (SAS EM) English UI

Page 72: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Enterprise Guide 5.1 (SAS EG)

Open SAS .sas7bdat File Export to Excel .xlsx File

Import Excel .xlsx File Export to SAS .sas7bdat File

72

Page 73: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Enterprise Guide 5.1 (SAS EG)

73

Page 74: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

New Project

74

Page 75: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Open SAS Data File: Profile.sas7bdat

75

Page 76: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

76

Page 77: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

77

Page 78: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

78

Page 79: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

79

Page 80: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

80

Page 81: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

81

Page 82: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

82

Page 83: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

83

Page 84: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

84

Page 85: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

85

Page 86: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

86

Export SAS .sas7bdat to to Excel .xlsx File

Page 87: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Import Excel File to SAS EG

87

Page 88: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

88

Import Excel File to SAS EG

Page 89: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

89

Page 90: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

90

Page 91: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

91

Page 92: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

92

Page 93: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

93

Page 94: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

94

Page 95: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

95

Page 96: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Export Excel .xlsx File to SAS .sas7bdat File

96

Page 97: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

97

Page 98: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

98

Page 99: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

99

Page 100: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Profile_Excel.xlsx

100

Page 101: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Profile_SAS.sas7bdat

101

Page 102: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Enterprise Miner 12.1 (SAS EM)

102

Page 103: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS EM 資料匯入4步驟

• Step 1. 新增專案 (New Project)

• Step 2. 新增資料館 (New / Library)

• Step 3. 建立資料來源 (Create Data Source)

• Step 4. 建立流程圖 (Create Diagram)

103

Page 104: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Step 1. 新增專案 (New Project)

104

Page 105: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

105

Step 1. 新增專案 (New Project)

Page 106: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

106

Step 1. 新增專案 (New Project)

Page 107: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Enterprise Miner (EM_Project1)

107

Page 108: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Step 2. 新增資料館 (New / Library)

108

Page 109: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

109

Step 2. 新增資料館 (New / Library)

Page 110: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

110

Step 2. 新增資料館 (New / Library)

Page 111: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

111

Step 2. 新增資料館 (New / Library)

Page 112: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

112

Step 2. 新增資料館 (New / Library)

Page 113: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

Step 3. 建立資料來源 (Create Data Source)

113

Page 114: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

114

Step 3. 建立資料來源 (Create Data Source)

Page 115: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

115

Step 3. 建立資料來源 (Create Data Source)

Page 116: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

116

Step 3. 建立資料來源 (Create Data Source)

Page 117: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

117

Step 3. 建立資料來源 (Create Data Source)

Page 118: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

118

EM_LIB.PROFILE

Step 3. 建立資料來源 (Create Data Source)

Page 119: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

119

Step 3. 建立資料來源 (Create Data Source)

Page 120: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

120

Step 3. 建立資料來源 (Create Data Source)

Page 121: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

121

Step 3. 建立資料來源 (Create Data Source)

Page 122: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

122

Step 3. 建立資料來源 (Create Data Source)

Page 123: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

123

Step 3. 建立資料來源 (Create Data Source)

Page 124: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

124

Step 3. 建立資料來源 (Create Data Source)

Page 125: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

125

Step 3. 建立資料來源 (Create Data Source)

Page 126: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

126

Step 3. 建立資料來源 (Create Data Source)

Page 127: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

127

Step 3. 建立資料來源 (Create Data Source)

Page 128: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

128

Step 4. 建立流程圖 (Create Diagram)

Page 129: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

129

Step 4. 建立流程圖 (Create Diagram)

Page 130: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

130

Step 4. 建立流程圖 (Create Diagram)

Page 131: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

SAS Enterprise Miner (SAS EM) Case Study

• SAS EM 資料匯入4步驟

– Step 1. 新增專案 (New Project)

– Step 2. 新增資料館 (New / Library)

– Step 3. 建立資料來源 (Create Data Source)

– Step 4. 建立流程圖 (Create Diagram)

• SAS EM SEMMA 建模流程

131

Page 132: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

132

案例情境模型流程

Page 133: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

133

Page 134: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

EM_Lib.Profile

134

Page 135: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

135

Page 136: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

136

Page 137: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

137

Page 138: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

138

Page 139: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

139

Page 140: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

篩選-間隔變數 …

140

Page 141: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

141

141

Page 142: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

篩選-匯入的資料 … [勘查]

142

Page 143: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

143

篩選-匯入的資料

Page 144: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

144

篩選-匯出的資料 … [勘查]

Page 145: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

145

篩選-匯出的資料

Page 146: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

勘查 (Explore)-群集 (Cluster)

146

Page 147: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

147

Page 148: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

148

Page 149: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

149

Page 150: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

150

Page 151: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

群集 (Cluster) 結果

151

Page 152: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

評估 (Assess)-區段特徵描繪(Segment Profile)

152

區段特徵描繪 (Segment Profile)

Page 153: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

153

評估 (Assess)-區段特徵描繪(Segment Profile)

Page 154: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

154

評估 (Assess)-區段特徵描繪(Segment Profile)

Page 155: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

155

評估 (Assess)-區段特徵描繪(Segment Profile)

Page 156: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

156

評估 (Assess)-區段特徵描繪(Segment Profile)

Page 157: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

區段特徵描繪 (Segment Profile)

157

Page 158: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

158

區段特徵描繪 (Segment Profile)

Page 159: 商業智慧實務 (Practices of Business Intelligence)mail.tku.edu.tw/myday/teaching/1032/BI/1032BI05_Business_Intelligence.pdf · 商業智慧實務 Practices of Business Intelligence

References • Efraim Turban, Ramesh Sharda, Dursun Delen,

Decision Support and Business Intelligence Systems, Ninth Edition, 2011, Pearson.

• Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Second Edition, 2006, Elsevier

• Jim Georges, Jeff Thompson and Chip Wells, Applied Analytics Using SAS Enterprise Miner, SAS, 2010

• SAS Enterprise Miner Course Notes, 2014, SAS

• SAS Enterprise Miner Training Course, 2014, SAS

• SAS Enterprise Guide Training Course, 2014, SAS

159