a classification approach for effective noninvasive diagnosis of coronary artery disease advisor:...
Post on 19-Dec-2015
223 views
TRANSCRIPT
A Classification Approach for Effective Noninvasive Diagnosis
of Coronary Artery Disease
Advisor: 黃三益 教授Student: 李建祥 D95402001 楊宗憲 D95402005 張珀銀 D95402007
Background
Heart disease is the leading cause of death in Taiwan. The third on the rank of the number of people
died The number of people died is 12,970 The death rate is 57.1% per one hundred
thousand people
Coronary artery disease (CAD) is the most common type of heart disease.
The Diagnosis of CAD
Noninvasive approaches Laboratory tests Electrocardiogram (ECG) Ultrasound tests
Invasive approach Coronary angiography
The Important Risk Factors of CAD
SmokingHigh blood pressureHigh blood cholesterolDiabetesBeing overweight or obesePhysical inactivity
Motivation
Invasive approach is higher risk and cost than noninvasive approach
Is noninvasive approach is sufficient to diagnose the possibility of occurring CAD?
What’s the performance of noninvasive approaches?
Motivation
Invasive approach is higher risk and cost than noninvasive approach
Is noninvasive approach is sufficient to diagnose the possibility of occurring CAD?
What’s the performance of noninvasive approaches?
The Data Mining Procedure: Step 1
Use some medically examinations to predict whether some people have heart disease.
A Classification problem
The Data Mining Procedure: Step 2
UCI KDD archive webThose row data come from three hospitals
in United States
The Data Mining Procedure: Step 3
Attributes Range
age Min= 28Max= 77Average= 54
Sex Male=1Female=0
Chest pain type 1:Typical angina2:Atypical angina3:Non- angina pain4:Asymptomatic
resting blood pressure* [0,120] =0[121, ∞) =0
cholestoral * 0: >2001: [200,240)2: >240
fasting blood sugar 0: <=1201: >120
electrocardiographic 0:Normal1:Having ST-T wave abnormality2:Left ventricular hypertrophy
maximum heart rate [60,138]
Attributes Range
exercise induced
angina Yes = 1No = 0
ST depression induced by exercise
The slope of peak exercise ST segment
Upsloping = 1flat= 2Downsloping =3
resting blood pressure
>120 =1<120 =0
Number of major vessels
[0,1,2,3]
thal Normal = 3Fixed defect = 6Reversable defect = 7
diagonsis < 50% diameter narrowing = 0> 50% diameter narrowing = 1
The Data Mining Procedure: Step 4
Source: UCI KDD ArchiveTraining set: Cleveland Clinic Foundation,
303 recordsTesting set: Hungarian Institute of
Cardiology, 294 records
The Data Mining Procedure: Step 5
The Problem of data Missing Value
Approach Discard the records containing missing values
The Data Mining Procedure: Step 6
Skip this step due to the uselessness of aggregating records or combining original attributes
The Data Mining Procedure : Step 7
In order to obtain rules from models to support medical decision Decision tree C4.5 and Bays Network are used
as our data mining approaches WEKA is used as our data mining tool
The Data Mining Procedure : Step 8
In order to assess the models, we conduct two phrases experiments with comprehensive measures which include sensitivity, specificity and accuracy.
The Data Mining Procedure : Step 8 (Cont.)
First Phrase The diverse combinations of fields were used
as input variables. The fields are divided into four groups
The Data Mining Procedure : Step 8Model Assessment
• We use sensitivity, specificity and accuracy as our model measures.• Sensitivity could represent the probability of
mistake in diagnosis• Specificity could represent the probability of
unnecessary medical resource wasting
The Data Mining Procedure : Step 9
• In this step, due to we have not enough medical resource to support our project, it is difficult to deploy our models in practical. Although we cannot fulfill our models in the real business environment, we still obtain copious experience and knowledge throughout data mining process.
Conclusion
Two classification approaches: decision tree and Bayesian network.
Using noninvasive and invasive approaches step by step.