2003_12

Install and Use Analysis Manager to Create Decision Trees

Course: Data Mining

Professor: 용환승

Author: 김지영

Contents

Data

Process

Result

1. Data

http://kdd.ics.uci.edu/ - Cover Type

The forest cover type for 30 x 30 meter cells, obtained from US Forest Service (USFS) Region 2 Resource Information System (RIS) data.

Data Characteristics

The actual forest cover type for a given observation (30 x 30 meter cell) was determined from US Forest Service (USFS) Region 2 Resource Information System (RIS) data. Independent variables were derived from data originally obtained from US Geological Survey (USGS) and USFS data. Data is in raw form (not scaled) and contains binary (0 or 1) columns of data for qualitative independent variables (wilderness areas and soil types).

Summary Statistics

Number of instances (observations): 9999
Number of attributes: 54
Attribute breakdown: 12 measures, but 54 columns of data (10 quantitative variables, 4 binary wilderness areas and 40 binary soil type variables)
Missing attribute values: None

1. Data

Variable Information

| Original Name | Column Name | Data Type | Measurement | Description |
|---|---|---|---|---|
| seq | seq | integer | | Unique key |
| Elevation | Elevation | quantitative | meters | Elevation in meters |
| Aspect | Aspect | quantitative | azimuth | Aspect in degrees azimuth |
| Slope | Slope | quantitative | degrees | Slope in degrees |
| Horizontal_Distance_To_Hydrology | Hor_Dist_To_Hydro | quantitative | meters | Horizontal distance to nearest surface water features |
| Vertical_Distance_To_Hydrology | Ver_Dist_To_Hydro | quantitative | meters | Vertical distance to nearest surface water features |
| Horizontal_Distance_To_Roadways | Hor_Dist_To_Road | quantitative | meters | Horizontal distance to nearest roadway |
| Hillshade_9am | Hillshade_9am | quantitative | 0 to 255 index | Hillshade index at 9am, summer solstice |
| Hillshade_Noon | Hillshade_Noon | quantitative | 0 to 255 index | Hillshade index at noon, summer solstice |
| Hillshade_3pm | Hillshade_3pm | quantitative | 0 to 255 index | Hillshade index at 3pm, summer solstice |
| Horizontal_Distance_To_Fire_Points | Hor_Dist_To_FP | quantitative | meters | Horizontal distance to nearest wildfire ignition points |
| Wilderness_Area (4 binary columns) | Wilder ( * 4) | qualitative | 0 (absence) or 1 (presence) | Wilderness area designation |
| Soil_Type (40 binary columns) | Soil ( * 40) | qualitative | 0 (absence) or 1 (presence) | Soil type designation |
| Cover_Type (7 types) | Cover | integer | 1 to 7 | Forest cover type designation |
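The variable list can be turned into a small sanity check for imported records. The sketch below is illustrative only: the names `Wilder1`..`Wilder4` and `Soil1`..`Soil40` assume 1-based numbering of the abbreviated column names, and `check_row` is a hypothetical helper, not part of Analysis Manager.

```python
# Assumed column layout: names follow the abbreviated column names in the table;
# the 1-based numbering of the Wilder and Soil columns is an assumption.
QUANTITATIVE = [
    "Elevation", "Aspect", "Slope",
    "Hor_Dist_To_Hydro", "Ver_Dist_To_Hydro", "Hor_Dist_To_Road",
    "Hillshade_9am", "Hillshade_Noon", "Hillshade_3pm",
    "Hor_Dist_To_FP",
]
WILDERNESS = [f"Wilder{i}" for i in range(1, 5)]
SOIL = [f"Soil{i}" for i in range(1, 41)]
ATTRIBUTES = QUANTITATIVE + WILDERNESS + SOIL  # 10 + 4 + 40 columns


def check_row(row):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name in ATTRIBUTES:
        if name not in row:
            problems.append(f"missing column {name}")
    # Hillshade values are documented as a 0 to 255 index.
    for name in ("Hillshade_9am", "Hillshade_Noon", "Hillshade_3pm"):
        if name in row and not 0 <= row[name] <= 255:
            problems.append(f"{name} outside 0..255")
    # Wilderness and soil columns are documented as 0 (absence) or 1 (presence).
    for name in WILDERNESS + SOIL:
        if name in row and row[name] not in (0, 1):
            problems.append(f"{name} is not binary")
    # Cover type is documented as an integer from 1 to 7.
    if row.get("Cover") not in range(1, 8):
        problems.append("Cover outside 1..7")
    return problems
```

Running `check_row` over each record before building a model is one way to catch import mistakes such as shifted columns.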

1. Data

Code Designations

<Wilderness Areas>
1 -- Rawah Wilderness Area
2 -- Neota Wilderness Area
3 -- Comanche Peak Wilderness Area
4 -- Cache la Poudre Wilderness Area

<Soil Types>
1 to 40: based on the USFS Ecological Landtype Units for this study area.

<Forest Cover Types>
1 -- Spruce/Fir
2 -- Lodgepole Pine
3 -- Ponderosa Pine
4 -- Cottonwood/Willow
5 -- Aspen
6 -- Douglas-fir
7 -- Krummholz
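Because wilderness area and soil type are stored as binary columns, reading a record usually means mapping the one column set to 1 back to its code. A minimal sketch, assuming the binary columns are named `Wilder1`..`Wilder4` (the numbering is an assumption) and using a hypothetical helper `decode_one_hot`:

```python
# Code tables copied from the designations listed in the deck.
WILDERNESS_NAMES = {
    1: "Rawah Wilderness Area",
    2: "Neota Wilderness Area",
    3: "Comanche Peak Wilderness Area",
    4: "Cache la Poudre Wilderness Area",
}
COVER_NAMES = {
    1: "Spruce/Fir",
    2: "Lodgepole Pine",
    3: "Ponderosa Pine",
    4: "Cottonwood/Willow",
    5: "Aspen",
    6: "Douglas-fir",
    7: "Krummholz",
}


def decode_one_hot(row, prefix, count):
    """Return the 1-based code whose binary column equals 1.

    Returns None when no column or more than one column is set,
    since such a record does not identify a single code.
    """
    hits = [i for i in range(1, count + 1) if row.get(f"{prefix}{i}") == 1]
    return hits[0] if len(hits) == 1 else None
```

For example, a record with only `Wilder3` set decodes to code 3, which `WILDERNESS_NAMES` maps to Comanche Peak Wilderness Area. The same helper would apply to the 40 soil columns with `prefix="Soil"` and `count=40`.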

1. Data

Class Distribution

Number of records of Spruce/Fir: 1375
Number of records of Lodgepole Pine: 1462
Number of records of Ponderosa Pine: 1262
Number of records of Cottonwood/Willow: 1620
Number of records of Aspen: 1582
Number of records of Douglas-fir: 1349
Number of records of Krummholz: 1349
Number of records of other: 0

Total records: 9999
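The class counts can be checked against the stated total, and they also give a simple reference point: a model that always predicts the most frequent class would be right on about 16% of these records. A short sketch of that arithmetic, using the counts from the slide (the variable names are illustrative):

```python
# Class counts as listed on the slide.
CLASS_COUNTS = {
    "Spruce/Fir": 1375,
    "Lodgepole Pine": 1462,
    "Ponderosa Pine": 1262,
    "Cottonwood/Willow": 1620,
    "Aspen": 1582,
    "Douglas-fir": 1349,
    "Krummholz": 1349,
}

total = sum(CLASS_COUNTS.values())

# Share of each class in the sample.
shares = {name: count / total for name, count in CLASS_COUNTS.items()}

# Most frequent class and the accuracy of always predicting it.
majority = max(CLASS_COUNTS, key=CLASS_COUNTS.get)
baseline_accuracy = CLASS_COUNTS[majority] / total

print(total, majority, round(baseline_accuracy, 3))
```

Since the seven classes are roughly balanced here, any decision tree worth keeping should do much better than this baseline.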

2. Process Import 2 Tables

<Attr> Seq and the 53 measure attributes

<Covtype> Seq and cover, the value to predict
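The two tables share the key Seq, so each attribute record lines up with exactly one cover value. The sketch below illustrates that relationship with Python's built-in `sqlite3` module. It is not the import procedure used in the deck: the in-memory database, the reduced column set, and the two sample rows are all made up for illustration.

```python
import sqlite3

# In-memory database standing in for wherever the two tables were imported.
conn = sqlite3.connect(":memory:")

# Attr is cut down to two measure columns here; the real table has many more.
conn.execute(
    "CREATE TABLE Attr (seq INTEGER PRIMARY KEY, Elevation REAL, Hillshade_9am REAL)"
)
conn.execute("CREATE TABLE Covtype (seq INTEGER PRIMARY KEY, cover INTEGER)")

# Made-up rows, for illustration only (not taken from the data set).
conn.executemany(
    "INSERT INTO Attr VALUES (?, ?, ?)",
    [(1, 2350.0, 150.0), (2, 3100.0, 220.0)],
)
conn.executemany("INSERT INTO Covtype VALUES (?, ?)", [(1, 4), (2, 1)])

# Join the measure attributes to the value to predict on the shared key.
rows = conn.execute(
    "SELECT a.seq, a.Elevation, a.Hillshade_9am, c.cover "
    "FROM Attr AS a JOIN Covtype AS c ON a.seq = c.seq "
    "ORDER BY a.seq"
).fetchall()

for row in rows:
    print(row)
```

Keeping the predicted column in its own table, keyed by Seq, keeps the input attributes and the target separate until the model is defined.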

2. Process Create TEST@ Mining Model

3. Result Node Path: All

3. Result Node Path: Elevation <= 2400.25

3. Result Node Path: Elevation <= 2400.25 and Hillshade 9am <= 156.25

3. Result Node Path: Elevation <= 2400.25 and Hillshade 9am <= 156.25 and Hor Dist To Fp > 579.25 and <= 1112.25

3. Result Node Path: Elevation <= 2400.25 and Hillshade 9am <= 156.25 and Hor Dist To Fp <= 579.25 or > 1112.25
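A node path is a conjunction of conditions, one per attribute, so each node can be read as a test on a record. The sketch below writes the two sibling nodes under `Hillshade 9am <= 156.25` as Python predicates. Two assumptions apply: the "or" in the second path is read as belonging only to the Hor Dist To Fp condition, and the dictionary keys follow the abbreviated column names from the variable table. The function names are hypothetical.

```python
def fp_inside_range(row):
    """Elevation <= 2400.25, Hillshade 9am <= 156.25, 579.25 < Hor Dist To Fp <= 1112.25."""
    return (
        row["Elevation"] <= 2400.25
        and row["Hillshade_9am"] <= 156.25
        and 579.25 < row["Hor_Dist_To_FP"] <= 1112.25
    )


def fp_outside_range(row):
    """Elevation <= 2400.25, Hillshade 9am <= 156.25, Hor Dist To Fp <= 579.25 or > 1112.25."""
    return (
        row["Elevation"] <= 2400.25
        and row["Hillshade_9am"] <= 156.25
        # The "or" is grouped with the Hor Dist To Fp condition only (assumed reading).
        and (row["Hor_Dist_To_FP"] <= 579.25 or row["Hor_Dist_To_FP"] > 1112.25)
    )
```

Under this reading the two nodes are complementary: a record that satisfies the shared Elevation and Hillshade conditions falls into exactly one of them, and a record that fails the shared conditions falls into neither.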

3. Result Node Path: Elevation <= 2400.25 and Hillshade 9am > 219.25

3. Result Node Path: Elevation <= 2400.25 and Hillshade 9am > 219.25

3. Result Node Path: Elevation <= 2400.25 and Hillshade 9am > 219.25 and Soil3 = 0

3. Result Node Path: Elevation <= 2400.25 and Ver Dist To Hydro <= 1.75 or > 131.25 and Hillshade 9am > 219.25 and Soil3 = 0

3. Result Node Path: Elevation <= 2400.25 and Hor Dist To Hydro > 35.75 and <= 304.25 and Ver Dist To Hydro <= 1.75 or > 131.25 and Hillshade 9am > 219.25 and Soil3 = 0

3. Result Node Path: Elevation <= 2400.25 and Hor Dist To Hydro <= 35.75 or > 304.25 and Ver Dist To Hydro <= 1.75 or > 131.25 and Hillshade 9am > 219.25 and Soil3 = 0

3. Result Dependency Network Browse

3. Result Dependency Network Browse