a classification approach for effective noninvasive diagnosis of coronary artery disease advisor:...

25
A Classification Approach for Effective Noninvasive Diagnosis of Coronary Artery Disease Advisor: 黃黃黃 黃黃 Student: 黃黃黃 D95402001 黃黃黃 D95402005 黃黃黃 D95402007

Post on 19-Dec-2015

223 views

Category:

Documents


0 download

TRANSCRIPT

A Classification Approach for Effective Noninvasive Diagnosis

of Coronary Artery Disease

Advisor: 黃三益 教授Student: 李建祥 D95402001 楊宗憲 D95402005 張珀銀 D95402007

Outline

BackgroundMotivationThe Data Mining ProcessConclusionLimitation

Background

Heart disease is the leading cause of death in Taiwan. The third on the rank of the number of people

died The number of people died is 12,970 The death rate is 57.1% per one hundred

thousand people

Coronary artery disease (CAD) is the most common type of heart disease.

The illustration of CAD

Source: http://images.medicinenet.com/images/illustrations/heart_attack.jpg

The Diagnosis of CAD

Noninvasive approaches Laboratory tests Electrocardiogram (ECG) Ultrasound tests

Invasive approach Coronary angiography

The Important Risk Factors of CAD

SmokingHigh blood pressureHigh blood cholesterolDiabetesBeing overweight or obesePhysical inactivity

Motivation

Invasive approach is higher risk and cost than noninvasive approach

Is noninvasive approach is sufficient to diagnose the possibility of occurring CAD?

What’s the performance of noninvasive approaches?

Motivation

Invasive approach is higher risk and cost than noninvasive approach

Is noninvasive approach is sufficient to diagnose the possibility of occurring CAD?

What’s the performance of noninvasive approaches?

The Data Mining Procedure: Step 1

Use some medically examinations to predict whether some people have heart disease.

A Classification problem

The Data Mining Procedure: Step 2

UCI KDD archive webThose row data come from three hospitals

in United States

The Data Mining Procedure: Step 3

Attributes Range

age Min= 28Max= 77Average= 54

Sex Male=1Female=0

Chest pain type 1:Typical angina2:Atypical angina3:Non- angina pain4:Asymptomatic

resting blood pressure* [0,120] =0[121, ∞) =0

cholestoral * 0: >2001: [200,240)2: >240

fasting blood sugar 0: <=1201: >120

electrocardiographic 0:Normal1:Having ST-T wave abnormality2:Left ventricular hypertrophy

maximum heart rate [60,138]

Attributes Range

exercise induced

angina Yes = 1No = 0

ST depression induced by exercise

The slope of peak exercise ST segment

Upsloping = 1flat= 2Downsloping =3

resting blood pressure

>120 =1<120 =0

Number of major vessels

[0,1,2,3]

thal Normal = 3Fixed defect = 6Reversable defect = 7

diagonsis < 50% diameter narrowing = 0> 50% diameter narrowing = 1

The Data Mining Procedure: Step 4

Source: UCI KDD ArchiveTraining set: Cleveland Clinic Foundation,

303 recordsTesting set: Hungarian Institute of

Cardiology, 294 records

The Data Mining Procedure: Step 5

The Problem of data Missing Value

Approach Discard the records containing missing values

The Data Mining Procedure: Step 6

Skip this step due to the uselessness of aggregating records or combining original attributes

The Data Mining Procedure : Step 7

In order to obtain rules from models to support medical decision Decision tree C4.5 and Bays Network are used

as our data mining approaches WEKA is used as our data mining tool

The Data Mining Procedure : Step 7 (Cont.)

WEKA tools screenshot

The Data Mining Procedure : Step 8

In order to assess the models, we conduct two phrases experiments with comprehensive measures which include sensitivity, specificity and accuracy.

The Data Mining Procedure : Step 8 (Cont.)

First Phrase The diverse combinations of fields were used

as input variables. The fields are divided into four groups

The Data Mining Procedure : Step 8Model Assessment

• We use sensitivity, specificity and accuracy as our model measures.• Sensitivity could represent the probability of

mistake in diagnosis• Specificity could represent the probability of

unnecessary medical resource wasting

The Data Mining Procedure : Step 8The Result of First Phase

The Data Mining Procedure : Step 8 The Result of First Phase (Cont.)

The Data Mining Procedure : Step 8 The Result of Second Phrase

The Data Mining Procedure : Step 9

• In this step, due to we have not enough medical resource to support our project, it is difficult to deploy our models in practical. Although we cannot fulfill our models in the real business environment, we still obtain copious experience and knowledge throughout data mining process.

Conclusion

Two classification approaches: decision tree and Bayesian network.

Using noninvasive and invasive approaches step by step.

Limitation

Limited noninvasive approaches Apply other non-invasive examinations to

increase performance of data mining model, such as ultrasound tests.

Lack of explanation of rules with domain knowledge