頁尾文字 2015/9/18 頁尾文字 1 data mining for healthcare documents 陳啟煌...
![Page 1: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/1.jpg)
112/04/19頁尾文字1
Data Mining for Healthcare Documents
陳啟煌, Programming Division, Computer and Information Networking Center, National Taiwan University
2011.10.27
![Page 3: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/3.jpg)
Outline
Introduction
Biomedical Semantic Similarity Measure
Semantic-driven Keyword Matching Extractor
Web-based Discharge Summary System
Healthcare Mining Project with Mongolia
Conclusions and Future Works
![Page 4: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/4.jpg)
Clinical Mining
Clinical Database
Clinical Pathways
![Page 5: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/5.jpg)
Introduction
In the IOM 2000 report, 44,000 to 98,000 unnecessary deaths per year
– Death rate equivalent to three jumbo jets crashing every two days
– Motor vehicle accidents: 43,458
– Breast cancer: 42,297
– AIDS: 16,516
![Page 6: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/6.jpg)
Suggested Solutions
Development of IT infrastructures
– Computerized Physician Order Entry (CPOE)
• Order Sets: make the right thing easier to do.
• Alerts / reminders
• Clinical guidelines
Restriction on working hours
Greater staff-to-patient ratios
![Page 7: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/7.jpg)
![Page 8: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/8.jpg)
Motivation
Clinical Pathway
– a way of treating a patient with a standardized procedure in order to
• Enhance the efficiency,
• Increase the quality,
• Lower the costs,
• Shorten the length of stay in hospital.
Usually represented in a script book and/or flow chart diagram
![Page 9: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/9.jpg)
Order Sets System Evolution
Paper Order Sets
– Predefined orders written on paper.
Electronic Order Sets
– Just a UI to create and look up order sets
Knowledge-based Order Sets
– Machine Learning
– Interactive UI for the user.
![Page 10: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/10.jpg)
How to Create Order Sets
Committee
– Traditional method, time-consuming
Feedback system
– Interaction with users, suggestions
Data mining
– Find patterns in existing clinical data
![Page 11: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/11.jpg)
![Page 12: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/12.jpg)
Raw Data
![Page 13: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/13.jpg)
Introduction
![Page 14: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/14.jpg)
Motivation
Free-Text Reports
– Discharge summaries
– Radiology reports
– Pathology reports
– The treatments they enclose can be extracted and learned from to gain knowledge
![Page 15: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/15.jpg)
Motivation
Semantically similar biomedical terms exist in medical reports.
– "congestive heart failure", "cardiac decompensation", and "volume overload"
![Page 16: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/16.jpg)
Approaches
Biomedical Semantic Similarity Measure
– Calculate semantic similarity between terms
A Powerful Extractor
– To view, verify, and extract data items from reports
Structuralized
– Providing a Highly Interactive Editor
• Auto-complete
• Model essay
• User phrases
![Page 17: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/17.jpg)
Biomedical Semantic Similarity Measure
![Page 18: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/18.jpg)
Introduction(1/4)
Ontology-based techniques
– Ontology Tree
• Single ontology
• Cross ontology
– Path length, Edge counting
Corpus-based techniques
– Context vector measure, Latent semantic analysis (LSA)
![Page 19: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/19.jpg)
Introduction(2/4)
The Web Corpus
– The Web provides unprecedented access to information and interacts with people's daily lives.
– The idea of using the Web as a corpus for NLP research is becoming popular.
![Page 20: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/20.jpg)
Introduction(3/4)
How can each document on the Web be analyzed directly?
![Page 21: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/21.jpg)
Introduction(4/4)
Web search engines
– Efficient interface
– Numerous documents & high growth rate
– Google page count
![Page 22: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/22.jpg)
Background and Related Work
Ontology-based techniques
– Single ontology
• Edge counting
• Information content
• Feature based
• Hybrid
– Cross ontology
• Hliaoutakis et al.
![Page 23: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/23.jpg)
Methodologies
Sample Construction
Feature Definitions
Feature Selection Strategy
Machine Learning Model
– Support Vector Machine Model
![Page 24: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/24.jpg)
Sample Construction(1/3)
![Page 25: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/25.jpg)
Sample Construction(2/3)
![Page 26: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/26.jpg)
Sample Construction(3/3)
In our study, we collect
– 1500 synonymous term pairs
– 1500 non-synonymous term pairs
![Page 27: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/27.jpg)
Feature Definitions(1/4)
Features
– Co-occurrence
• (three co-occurrence formulas; not recoverable from the extracted text)
![Page 28: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/28.jpg)
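Feature Definitions(2/4)
Features
– Co-occurrence
• (one further co-occurrence formula; not recoverable from the extracted text)
– Semantic distance
• (formula not recoverable from the extracted text)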
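The feature ranking later in the deck names four co-occurrence features (WebJaccard, WebOverlap, WebDice, WebPMI) and one semantic distance (NGD). The following is a minimal sketch of those measures, assuming their commonly published page-count definitions; the slides' own formulas are not available, so the exact variants used by the author may differ. `h_p`, `h_q`, and `h_pq` stand for the page counts H(P), H(Q), and H(P ∩ Q), and `n_pages` is an assumed size of the search engine index.

```python
import math


def web_jaccard(h_p, h_q, h_pq):
    """Jaccard coefficient on page counts: H(P∩Q) / (H(P) + H(Q) - H(P∩Q))."""
    union = h_p + h_q - h_pq
    return h_pq / union if union > 0 else 0.0


def web_overlap(h_p, h_q, h_pq):
    """Overlap coefficient: H(P∩Q) / min(H(P), H(Q))."""
    smaller = min(h_p, h_q)
    return h_pq / smaller if smaller > 0 else 0.0


def web_dice(h_p, h_q, h_pq):
    """Dice coefficient: 2 * H(P∩Q) / (H(P) + H(Q))."""
    total = h_p + h_q
    return 2 * h_pq / total if total > 0 else 0.0


def web_pmi(h_p, h_q, h_pq, n_pages):
    """Pointwise mutual information (base 2) estimated from page counts."""
    if min(h_p, h_q, h_pq) <= 0:
        return 0.0
    joint = h_pq / n_pages
    return math.log2(joint / ((h_p / n_pages) * (h_q / n_pages)))


def ngd(h_p, h_q, h_pq, n_pages):
    """Normalized Google Distance; smaller values mean more closely related terms."""
    if min(h_p, h_q, h_pq) <= 0:
        return float("inf")
    log_p, log_q, log_pq = math.log(h_p), math.log(h_q), math.log(h_pq)
    return (max(log_p, log_q) - log_pq) / (math.log(n_pages) - min(log_p, log_q))
```

The zero-count guards are a design choice of this sketch: a pair that never co-occurs gets similarity 0 (or infinite distance) instead of raising an error.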
![Page 29: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/29.jpg)
Feature Definitions(3/4)
"Apoptosis known as programmed cell death"
The phrase "known as" indicates a synonymous relationship between apoptosis and programmed cell death.
"Apoptosis known as programmed cell death"
– Google page count: 141
"Isoflavone known as Cyclooxygenase"
– Google page count: 0
![Page 30: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/30.jpg)
Feature Definitions(4/4)
Features
– Lexico-syntactic pattern
• P known as Q: H( P known as Q ) / H( P ∩ Q )
• of P (Q)
• P (Q)
• and P (Q
• , P (Q
• against P (Q
• prevalence of P Q
• patients with P Q
• P/Q
• P, Q
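As a rough illustration of the pattern feature H( P known as Q ) / H( P ∩ Q ), the sketch below fills a pattern template with two terms and divides the phrase page count by the co-occurrence page count. The `page_count` callable, the query syntax, and the co-occurrence count of 20000 are hypothetical stand-ins for a real search engine; only the count 141 comes from the slides.

```python
def pattern_feature(template, p, q, page_count):
    """Return H(pattern with P, Q filled in) / H(P ∩ Q), or 0.0 if the pair never co-occurs.

    template   -- pattern with {p} and {q} placeholders, e.g. "{p} known as {q}"
    page_count -- hypothetical callable mapping a query string to a hit count
    """
    phrase_hits = page_count('"' + template.format(p=p, q=q) + '"')
    pair_hits = page_count(f'"{p}" "{q}"')
    return phrase_hits / pair_hits if pair_hits else 0.0


# Stand-in for a search engine: 141 is the count quoted on the slide,
# 20000 is an invented co-occurrence count used only for illustration.
FAKE_COUNTS = {
    '"apoptosis known as programmed cell death"': 141,
    '"apoptosis" "programmed cell death"': 20000,
}


def fake_page_count(query):
    return FAKE_COUNTS.get(query, 0)


synonym_score = pattern_feature(
    "{p} known as {q}", "apoptosis", "programmed cell death", fake_page_count
)
unrelated_score = pattern_feature(
    "{p} known as {q}", "isoflavone", "cyclooxygenase", fake_page_count
)
```

Using `{p}`/`{q}` placeholders instead of replacing the letters P and Q avoids corrupting terms that themselves contain those letters.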
![Page 31: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/31.jpg)
Feature Selection Strategy
Rank the features according to their ability to express synonymy by F-score:
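The formula itself is not present in the transcript. A minimal sketch follows, assuming the F-score meant here is the feature-discrimination score often used alongside LIBSVM (between-class separation of the feature means divided by the sum of within-class sample variances); that choice is an assumption, and the function and variable names are illustrative.

```python
from statistics import mean, variance


def f_score(pos_values, neg_values):
    """F-score of one feature from its values on positive and negative samples."""
    overall_mean = mean(list(pos_values) + list(neg_values))
    pos_mean, neg_mean = mean(pos_values), mean(neg_values)
    between = (pos_mean - overall_mean) ** 2 + (neg_mean - overall_mean) ** 2
    # statistics.variance is the sample variance (divides by n - 1).
    within = variance(pos_values) + variance(neg_values)
    return between / within if within > 0 else float("inf")


def rank_features(pos_samples, neg_samples, names):
    """Rank features (columns) by descending F-score.

    pos_samples / neg_samples -- lists of feature vectors for synonymous and
    non-synonymous term pairs; names -- one label per column.
    """
    scores = []
    for i, name in enumerate(names):
        pos_column = [row[i] for row in pos_samples]
        neg_column = [row[i] for row in neg_samples]
        scores.append((name, f_score(pos_column, neg_column)))
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

A larger score means the feature separates synonymous from non-synonymous pairs more clearly, which is why the ranking is in descending order.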
![Page 32: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/32.jpg)
Support Vector Machine Model(1/2)
![Page 33: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/33.jpg)
Support Vector Machine Model(2/2)
LIBSVM 2.89
– C-SVC
• Linear
• Polynomial degree=2
• Polynomial degree=3
• RBF
– nu-SVC
• Linear
• Polynomial degree=2
• Polynomial degree=3
• RBF
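The eight model configurations above can be enumerated as LIBSVM option strings. This sketch only builds the strings, using the standard `svm-train` flags (`-s` for the SVM type, `-t` for the kernel, `-d` for the polynomial degree); all other parameters (cost, nu, gamma) are left at LIBSVM defaults here, which is an assumption, since the slides do not state them.

```python
from itertools import product

# LIBSVM -s values for the two SVM formulations on the slide.
SVM_TYPES = {"C-SVC": 0, "nu-SVC": 1}

# LIBSVM kernel flags: -t 0 linear, -t 1 polynomial (degree via -d), -t 2 RBF.
KERNELS = {
    "Linear": "-t 0",
    "Poly=2": "-t 1 -d 2",
    "Poly=3": "-t 1 -d 3",
    "RBF": "-t 2",
}


def libsvm_configurations():
    """Map a model label such as 'nu-SVC(Linear)' to its svm-train option string."""
    configs = {}
    for (svm_name, svm_code), (kernel_name, kernel_opts) in product(
        SVM_TYPES.items(), KERNELS.items()
    ):
        configs[f"{svm_name}({kernel_name})"] = f"-s {svm_code} {kernel_opts}"
    return configs
```

The labels match the model names used in the result tables, so each reported row can be traced back to one option string.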
![Page 34: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/34.jpg)
Datasets(1/5)

| Concept 1 | Concept 2 | Human |
|---|---|---|
| Anemia | Appendicitis | 0.031 |
| Dementia | Atopic Dermatitis | 0.062 |
| Bacterial Pneumonia | Malaria | 0.156 |
| Osteoporosis | Patent Ductus Arteriosus | 0.156 |
| Amino Acid Sequence | Anti-Bacterial Agents | 0.156 |
| Acquired Immunodeficiency Syndrome | Congenital Heart Defects | 0.062 |
| Otitis Media | Infantile Colic | 0.156 |
| Meningitis | Tricuspid Atresia | 0.031 |
| Sinusitis | Mental Retardation | 0.031 |
| Hypertension | Kidney Failure | 0.5 |
| Hyperlipidemia | Hyperkalemia | 0.156 |
| Hypothyroidism | Hyperthyroidism | 0.406 |
| Sarcoidosis | Tuberculosis | 0.406 |
| Vaccines | Immunity | 0.593 |
| Asthma | Pneumonia | 0.375 |

Table 1: Dataset 1 of 36 medical term pairs
![Page 35: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/35.jpg)
Datasets(2/5)

| Concept 1 | Concept 2 | Human |
|---|---|---|
| Diabetic Nephropathy | Diabetes Mellitus | 0.5 |
| Lactose Intolerance | Irritable Bowel Syndrome | 0.468 |
| Urinary Tract Infection | Pyelonephritis | 0.656 |
| Neonatal Jaundice | Sepsis | 0.187 |
| Sickle Cell Anemia | Iron Deficiency Anemia | 0.437 |
| Psychology | Cognitive Science | 0.593 |
| Adenovirus | Rotavirus | 0.437 |
| Migraine | Headache | 0.718 |
| Myocardial Ischemia | Myocardial Infarction | 0.75 |
| Hepatitis B | Hepatitis C | 0.562 |
| Carcinoma | Neoplasm | 0.75 |
| Pulmonary Valve Stenosis | Aortic Valve Stenosis | 0.531 |
| Failure To Thrive | Malnutrition | 0.625 |
| Breast Feeding | Lactation | 0.843 |
| Antibiotics | Antibacterial Agents | 0.937 |

Table 1: Dataset 1 of 36 medical term pairs
![Page 36: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/36.jpg)
Datasets(3/5)

| Concept 1 | Concept 2 | Human |
|---|---|---|
| Seizures | Convulsions | 0.843 |
| Pain | Ache | 0.875 |
| Malnutrition | Nutritional Deficiency | 0.875 |
| Measles | Rubeola | 0.906 |
| Chicken Pox | Varicella | 0.968 |
| Down Syndrome | Trisomy 21 | 0.875 |

Table 1: Dataset 1 of 36 medical term pairs
![Page 37: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/37.jpg)
Datasets(4/5)

| Concept 1 | Concept 2 | Physician | Expert |
|---|---|---|---|
| Renal Failure | Kidney Failure | 4 | 4 |
| Heart | Myocardium | 3.3 | 3 |
| Stroke | Infarct | 3 | 2.8 |
| Abortion | Miscarriage | 3 | 3.3 |
| Delusion | Schizophrenia | 3 | 2.2 |
| Congestive Heart Failure | Pulmonary Edema | 3 | 1.4 |
| Metastasis | Adenocarcinoma | 2.7 | 1.8 |
| Calcification | Stenosis | 2.7 | 2 |
| Diarrhea | Stomach Cramps | 2.3 | 1.3 |
| Mitral Stenosis | Atrial Fibrillation | 2.3 | 1.3 |
| Chronic Obstructive Pulmonary Disease | Lung Infiltrates | 2.3 | 1.9 |
| Rheumatoid Arthritis | Lupus | 2 | 1.1 |
| Brain Tumor | Intracranial Hemorrhage | 2 | 1.3 |
| Carpal Tunnel Syndrome | Osteoarthritis | 2 | 1.1 |
| Diabetes mellitus | Hypertension | 2 | 1 |

Table 2: Dataset 2 of 30 medical term pairs
![Page 38: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/38.jpg)
Datasets(5/5)

| Concept 1 | Concept 2 | Physician | Expert |
|---|---|---|---|
| Acne | Syringe | 1.7 | 1.2 |
| Antibiotic | Allergy | 1.7 | 1 |
| Cortisone | Total Knee Replacement | 1.7 | 1.2 |
| Pulmonary Embolus | Myocardial Infarction | 1.7 | 1.4 |
| Pulmonary Fibrosis | Lung Cancer | 1.3 | 1 |
| Cholangiocarcinoma | Colonoscopy | 1.3 | 1 |
| Lymphoid Hyperplasia | Laryngeal Cancer | 1 | 1 |
| Multiple sclerosis | Psychosis | 1 | 1 |
| Appendicitis | Osteoporosis | 1 | 1 |
| Rectal Polyp | Aorta | 1 | 1 |
| Xerostomia | Alcoholic Cirrhosis | 1 | 1 |
| Peptic Ulcer Disease | Myopia | 1 | 1 |
| Depression | Cellulitis | 1 | 1 |
| Varicose Vein | Entire Knee Meniscus | 1 | 1 |
| Hyperlipidemia | Metastasis | 1 | 1 |

Table 2: Dataset 2 of 30 medical term pairs
![Page 39: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/39.jpg)
Experiment Results

| Rank | Feature | F(i) |
|---|---|---|
| 1 | NGD | 0.2751 |
| 2 | WebPMI | 0.237 |
| 3 | , X (Y | 0.1648 |
| 4 | X/Y | 0.1632 |
| 5 | X(Y) | 0.1606 |
| 6 | X, Y | 0.1585 |
| 7 | WebOverlap | 0.1173 |
| 8 | WebDice | 0.0555 |
| 9 | WebJaccard | 0.0347 |
| 10 | of X (Y) | 0.0185 |
| 11 | and X (Y | 0.0093 |
| 12 | against X (Y | 0.0027 |
| 13 | patients with X Y | 0.0017 |
| 14 | X known as Y | 0.0014 |
| 15 | prevalence of X Y | 0.0011 |
![Page 40: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/40.jpg)
Experiment Results
Figure 3.4(a): Correlation vs. No of features and training samples using C-SVC with linear kernel
![Page 41: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/41.jpg)
Experiment Results
Figure 3.4(b): Correlation vs. No of features and training samples using C-SVC with polynomial degree=2 kernel
![Page 42: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/42.jpg)
Experiment Results
Figure 3.4(c): Correlation vs. No of features and training samples using C-SVC with polynomial degree=3 kernel
![Page 43: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/43.jpg)
Experiment Results
Figure 3.4(d): Correlation vs. No of features and training samples using C-SVC with RBF kernel
![Page 44: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/44.jpg)
Experiment Results
Figure 3.5(a): Correlation vs. No of features and training samples using nu-SVC with linear kernel
![Page 45: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/45.jpg)
Experiment Results
Figure 3.5(b): Correlation vs. No of features and training samples using nu-SVC with polynomial degree=2 kernel
![Page 46: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/46.jpg)
Experiment Results
Figure 3.5(c): Correlation vs. No of features and training samples using nu-SVC with polynomial degree=3 kernel
![Page 47: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/47.jpg)
Experiment Results
Figure 3.5(d): Correlation vs. No of features and training samples using nu-SVC with RBF kernel
![Page 48: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/48.jpg)
Experiment Results

| Model | Maximum correlation | Number of samples | Number of features |
|---|---|---|---|
| C-SVC(Linear) | 0.758 | 1500 | 9 |
| C-SVC(Poly=2) | 0.776 | 1200 | 7 |
| C-SVC(Poly=3) | 0.759 | 300 | 13 |
| C-SVC(RBF) | 0.612 | 1100 | 10 |
| nu-SVC(Linear) | 0.798 | 900 | 7 |
| nu-SVC(Poly=2) | 0.766 | 300 | 11 |
| nu-SVC(Poly=3) | 0.736 | 300 | 12 |
| nu-SVC(RBF) | 0.743 | 100 | 11 |
![Page 49: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/49.jpg)
Experiment Results
Table 5: Correlation vs. Dataset 1 and Dataset 2 with physician scores and expert scores of different models

| Model | Dataset 1 | Dataset 2 (Phy) | Dataset 2 (Exp) |
|---|---|---|---|
| C-SVC(Linear) | 0.758 | 0.689 | 0.482 |
| C-SVC(Poly=2) | 0.776 | 0.698 | 0.479 |
| C-SVC(Poly=3) | 0.759 | 0.649 | 0.395 |
| C-SVC(RBF) | 0.612 | 0.388 | 0.171 |
| nu-SVC(Linear) | 0.798 | 0.705 | 0.496 |
| nu-SVC(Poly=2) | 0.766 | 0.671 | 0.424 |
| nu-SVC(Poly=3) | 0.736 | 0.641 | 0.384 |
| nu-SVC(RBF) | 0.743 | 0.632 | 0.373 |
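The slides report correlations between model output and human ratings without naming the coefficient. The sketch below assumes Pearson's correlation; the model scores in the example are invented for illustration, while the human scores are four values taken from Dataset 1.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return covariance / (spread_x * spread_y)


# Human scores for four Dataset 1 pairs (Anemia/Appendicitis, Hypertension/Kidney Failure,
# Carcinoma/Neoplasm, Chicken Pox/Varicella) and hypothetical model outputs.
human_scores = [0.031, 0.5, 0.75, 0.968]
model_scores = [0.10, 0.45, 0.70, 0.90]
agreement = pearson(human_scores, model_scores)
```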
![Page 50: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/50.jpg)
Result comparison
Table 3.4: Result comparison for Dataset 1

| Measure | Dataset 1 (rank) |
|---|---|
| SemDist | 0.726 (2) |
| Path length | 0.422 (5) |
| Leacock & Chodorow | 0.600 (3) |
| Wu & Palmer | 0.498 (4) |
| Proposed | 0.798 (1) |
![Page 51: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/51.jpg)
Result comparison
Table 3.5: Results comparison for Dataset 2

| Measure | Dataset 2 (Physician) | Dataset 2 (Expert) |
|---|---|---|
| Path length | 0.512 (4) | 0.731 (2) |
| Leacock & Chodorow | 0.358 (7) | 0.497 (5) |
| Lin | 0.522 (3) | 0.565 (4) |
| Resnik | 0.534 (2) | 0.61 (3) |
| Jiang & Conrath | 0.506 (5) | 0.741 (1) |
| Vector (All sect, 1M notes) | 0.436 (6) | 0.497 (5) |
| Proposed | 0.705 (1) | 0.496 (6) |
![Page 52: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/52.jpg)
Semantic-driven Keyword Matching Extractor
![Page 53: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/53.jpg)
Introduction
For Structuralized Clinical Data
– Data can be directly exported for further analysis and mining
For Non-structuralized Clinical Data
– Data need to be further processed to extract the relevant information
![Page 54: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/54.jpg)
Background and Related Works
Marking concepts and related semantics
– Cancer Text Information Extraction System (caTIES)
Extracting data items and filling the outcomes into the predefined template
– IBM Watson Research Center & Mayo Clinic
Providing the verification user interface
– Commercial natural language processing (NLP) engines
![Page 55: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/55.jpg)
Architecture
Apply match pattern on textual reports
Send matching profile
Review and verify matched information
Clinical data warehouse
Textual clinical reports
Matching metadata
Retrieve keyword list
Select keyword
Retrieve matching profile
Store structuralized data
Case-oriented template schema
Keyword selection interface
Information matching modules
Textual documents viewer
Extraction verification editor
![Page 56: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/56.jpg)
Methodology
The default common keyword lists for each type of textual document
The personal keyword lists
– matching the keyword and the keywords with related semantics
– mapping the corresponding matching rules using the retrieved matching pattern and applying the matching rules on the textual reports
– Date: 2009/01/01, 12/01
– Size: "4.9 x 1 x 1.8" (length x width x height)
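The two example matching rules (dates and three-dimensional sizes) could be expressed as regular expressions. This is only a sketch under the assumption that the rules are pattern-based; the actual rule format of the extractor is not shown in the slides, and the function names are illustrative.

```python
import re

# Dates such as 2009/01/01 (year/month/day) or 12/01 (month/day).
DATE_RE = re.compile(r"(?<![\d/])(?:\d{4}/)?\d{1,2}/\d{1,2}(?![\d/])")

# Sizes such as "4.9 x 1 x 1.8", read as length x width x height.
NUMBER = r"(\d+(?:\.\d+)?)"
SIZE_RE = re.compile(NUMBER + r"\s*x\s*" + NUMBER + r"\s*x\s*" + NUMBER, re.IGNORECASE)


def extract_dates(text):
    """Return every date-like token found in a report, in order of appearance."""
    return DATE_RE.findall(text)


def extract_sizes(text):
    """Return each size as a dict with length, width, and height as floats."""
    sizes = []
    for match in SIZE_RE.finditer(text):
        length, width, height = (float(value) for value in match.groups())
        sizes.append({"length": length, "width": width, "height": height})
    return sizes
```

For example, `extract_dates("Admitted 2009/01/01, follow-up on 12/01.")` yields both date forms from the slide, and `extract_sizes("Tumor size 4.9 x 1 x 1.8 cm")` yields one length/width/height record.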
![Page 57: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/57.jpg)
Result
![Page 58: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/58.jpg)
Result
![Page 59: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/59.jpg)
Discharge Summary System
![Page 60: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/60.jpg)
Background
Old Discharge summary system (Dis32)
– Client/Server Architecture
– Install/upgrade client applications
Web Discharge summary system
– Service-Oriented Architecture
– Online since 2009.10
![Page 61: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/61.jpg)
![Page 62: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/62.jpg)
Motivation
Discharge summary user interface
– Chief Complaint, Brief History
– Free-Text field
– How to generate a list of suggested phrases
![Page 63: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/63.jpg)
Motivation
Auto-Complete
![Page 64: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/64.jpg)
Language Modeling
We want to compute P(w1,w2,w3,w4,w5…wn), the probability of a sequence
Alternatively we want to compute P(w5|w1,w2,w3,w4): the probability of a word given some previous words
The model that computes P(W) or P(wn|w1,w2…wn-1) is called the language model.
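A bigram model is the simplest case of the conditional probability above: P(wn | w1 … wn-1) is approximated by P(wn | wn-1) and estimated from counts. The sketch below uses unsmoothed maximum-likelihood estimates on a tiny invented corpus; it illustrates the idea only and is not the model built with SRILM in this work.

```python
from collections import Counter, defaultdict


class BigramModel:
    """Unsmoothed bigram language model with sentence boundary markers."""

    def __init__(self, sentences):
        self.context_counts = Counter()          # how often each word appears as history
        self.bigram_counts = defaultdict(Counter)  # history -> next word -> count
        for sentence in sentences:
            tokens = self._tokens(sentence)
            for prev, word in zip(tokens, tokens[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[prev][word] += 1

    @staticmethod
    def _tokens(sentence):
        return ["<s>"] + sentence.lower().split() + ["</s>"]

    def prob(self, word, prev):
        """Maximum-likelihood estimate of P(word | prev); 0.0 for unseen histories."""
        total = self.context_counts[prev]
        return self.bigram_counts[prev][word] / total if total else 0.0

    def sentence_prob(self, sentence):
        """P(w1 ... wn) as the product of the bigram probabilities (chain rule)."""
        tokens = self._tokens(sentence)
        probability = 1.0
        for prev, word in zip(tokens, tokens[1:]):
            probability *= self.prob(word, prev)
        return probability


# Invented chief-complaint-style sentences, for illustration only.
corpus = [
    "chest pain for two days",
    "chest pain and fever",
    "abdominal pain for one week",
]
model = BigramModel(corpus)
```

With this corpus, "pain" is followed by "for" in two of its three occurrences, so `model.prob("for", "pain")` is 2/3. A production model would add smoothing so that unseen word pairs do not get probability zero.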
![Page 65: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/65.jpg)
SRILM
SRILM
– The SRI Language Modeling Toolkit
– SRILM is a toolkit for building and applying statistical language models (LMs)
– http://www.speech.sri.com/projects/srilm/
![Page 66: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/66.jpg)
SRILM
Three Main Functionalities
– Generate the n-gram count file from the corpus
– Train the language model from the n-gram count file
– Calculate the test data perplexity using the trained language model
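The three steps map onto the SRILM command-line tools `ngram-count` and `ngram`. The helper below only assembles the argument lists; the file names and the trigram order are assumptions for illustration, and nothing is executed.

```python
def srilm_commands(corpus, counts, model, test_text, order=3):
    """Build the three SRILM invocations as argument lists (not executed here).

    corpus, counts, model, test_text -- illustrative file paths
    order -- n-gram order; 3 (trigram) is an assumed default
    """
    n = str(order)
    return [
        # 1. Generate the n-gram count file from the corpus.
        ["ngram-count", "-text", corpus, "-order", n, "-write", counts],
        # 2. Train the language model from the n-gram count file.
        ["ngram-count", "-read", counts, "-order", n, "-lm", model],
        # 3. Compute the perplexity of test data with the trained model.
        ["ngram", "-lm", model, "-order", n, "-ppl", test_text],
    ]


commands = srilm_commands("notes.txt", "notes.count", "notes.lm", "test.txt")
```

On a machine with SRILM installed, each list could be handed to `subprocess.run`; smoothing options are omitted because the slides do not mention them.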
![Page 67: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/67.jpg)
Implementation
N-gram Count File
– Chief Complaint, Brief History
Static
– Phrase lists
Dynamic
– AJAX + AutoComplete toolkit
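One plausible way to turn n-gram counts into auto-complete suggestions is to rank the phrases that start with the typed prefix by frequency. The sketch below assumes that approach; the server-side logic behind the AJAX calls is not described in the slides, and the sample notes are invented.

```python
from collections import Counter


def build_phrase_counts(notes, max_n=3):
    """Count every 1- to max_n-word phrase in a list of free-text notes."""
    counts = Counter()
    for note in notes:
        tokens = note.lower().split()
        for n in range(1, max_n + 1):
            for start in range(len(tokens) - n + 1):
                counts[" ".join(tokens[start:start + n])] += 1
    return counts


def suggest(prefix, counts, limit=5):
    """Return up to `limit` phrases starting with `prefix`, most frequent first.

    Ties are broken alphabetically so the output order is deterministic.
    """
    prefix = prefix.lower()
    matches = [(phrase, count) for phrase, count in counts.items()
               if phrase.startswith(prefix)]
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [phrase for phrase, _ in matches[:limit]]


sample_notes = ["chest pain for two days", "chest pain and fever", "chest tightness"]
phrase_counts = build_phrase_counts(sample_notes)
```

A static phrase list corresponds to precomputing `phrase_counts` once; the dynamic variant would call `suggest` on each keystroke.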
![Page 68: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/68.jpg)
Discharge notes
![Page 69: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/69.jpg)
Results

| System Name | Time Spent |
|---|---|
| Client-server system | 652 seconds (00:10:52) |
| Web-based system | 372 seconds (00:06:12) |

Average time consumed, in seconds (hh:mm:ss)
7 intern participants
![Page 70: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/70.jpg)
Healthcare Mining Project with Mongolia
![Page 71: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/71.jpg)
Background
Taiwan - Mongolia
– National Science Council
– Mongolian Ministry of Education, Culture and Sciences
NTU - MUST
– Mongolian University of Science and Technology
3-Year Project
– 2009/8/1 – 2012/7/31
![Page 72: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/72.jpg)
Motivation
Reduce cost
– Length of stay in hospital
– Early detection of disease
Improve quality and patient safety
– SOP, Clinical Pathways
![Page 73: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/73.jpg)
Motivation
Clinical Pathway
– a way of treating a patient with a standardized procedure in order to
• Enhance the efficiency,
• Increase the quality,
• Lower the costs,
• Shorten the length of stay in hospital.
Usually represented in a script book and/or flow chart diagram
![Page 74: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/74.jpg)
Project Goal
Build a data mining framework for
– Early detection of disease
• Find out the sequential patterns between different diseases
– Standardized therapeutic procedure
• Discover clinical pathways and clinical guide
![Page 75: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/75.jpg)
Mining Clinical Pathway
Clinical Database
Clinical Pathways
![Page 76: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/76.jpg)
Clinical Data
The clinical data include
– Patient information
– Diagnosis
– Sequences of physicians' orders taken at different points in time
![Page 77: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/77.jpg)
![Page 78: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/78.jpg)
Clinical Sequence Mining system diagram
Data Preparation
Data Pre-Processing
Mining Model
Historical Diagnosis Database
Orders Sequence Knowledge base
Alert and Reminding System
Clinical Pathway Creation System
![Page 79: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/79.jpg)
Data Preparation
Inpatient Department raw data
– From 2007/1/1 to 2007/5/26
Discharge notes
– with admission/discharge diagnosis, chief complaint. 22,000 records
Diagnosis records in IPD
– with ICD9 code
Related orders in IPD
![Page 80: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/80.jpg)
Data Preparation
Chief complaint
– For scheduled chemotherapy
– Total
• 791 cases
• 33,771 physician orders
![Page 81: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/81.jpg)
112/04/19頁尾文字81
Data Pre-processing
Select relevant data according to the order type attribute
– Drop non-meaningful orders such as nursing care and routine administration orders.
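The selection step above can be sketched as a simple filter on the order type attribute. This is a minimal sketch, not the author's implementation: the record layout, the sample order codes, and the set of excluded type codes are all hypothetical, since the slides do not say which type codes denote nursing care or administration orders.

```python
from typing import Iterable, List, NamedTuple, Set


class Order(NamedTuple):
    case_id: str     # hypothetical case identifier
    order_type: str  # one-letter order type code
    order_code: str  # order item code


def select_relevant(orders: Iterable[Order], excluded_types: Set[str]) -> List[Order]:
    """Keep only the orders whose type is not in the excluded set."""
    return [o for o in orders if o.order_type not in excluded_types]


# Placeholder choice: which codes mean nursing care or administration
# routine orders is an assumption, not something the slides state.
EXCLUDED = {"N", "A"}

sample = [
    Order("c1", "L", "lab-order-1"),
    Order("c1", "N", "nursing-order-1"),
    Order("c2", "A", "admin-order-1"),
    Order("c2", "L", "lab-order-2"),
]
kept = select_relevant(sample, EXCLUDED)
```

Passing the excluded set as a parameter keeps the filter reusable once the real meaning of each order type code is known.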
![Page 82: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/82.jpg)
Order Type Statistics
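| ordertypecode | cnt | ordercnt |
|---|---|---|
| R | 10309 | 368 |
| T | 6180 | 135 |
| A | 6063 | 20 |
| L | 5569 | 175 |
| M | 4026 | 84 |
| D | 814 | 25 |
| X | 360 | 41 |
| B | 168 | 5 |
| O | 106 | 58 |
| J | 47 | 6 |
| E | 40 | 14 |
| P | 12 | 4 |
| I | 11 | 3 |
| N | 6 | 2 |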
![Page 83: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/83.jpg)
Mining Model
Sequence Clustering Algorithm
Mining Tool
– Microsoft SQL Server 2005
– Sequence Clustering Model
– Visualize Data Analysis
Parameter
– Support
– Confidence
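For readers unfamiliar with the two parameters, the sketch below uses the standard association-rule definitions of support and confidence. It is an illustration under assumptions: the slides do not say how SQL Server 2005 exposes these values, and the function names and toy data are made up for this example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Toy data (hypothetical): each set is a group of orders placed together.
orders = [
    {"CBC", "AST", "ALT"},
    {"CBC", "AST"},
    {"CBC", "Na"},
    {"AST", "ALT"},
]

print(support(orders, {"CBC"}))            # 0.75
print(confidence(orders, {"AST"}, {"ALT"}))  # 2 of the 3 AST groups also hold ALT
```

A minimum support filters out rare order combinations, and a minimum confidence filters out weak "if X then Y" relationships.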
![Page 84: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/84.jpg)
Sequence Clustering Mining
Sequence Clustering algorithm finds clusters of cases that contain similar paths in a sequence.
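Microsoft's documentation describes Sequence Clustering as a combination of Markov chain analysis and clustering. As a rough illustration of the Markov chain part only (the clustering step is omitted), the sketch below estimates first-order transition probabilities between consecutive events. The function name and the toy sequences are assumptions made for this example, not the tool's internals.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(next | current) from consecutive pairs in each sequence."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }


# Toy order sequences (hypothetical labels, one list per case).
paths = [
    ["CBC", "AST", "ALT"],
    ["CBC", "AST", "Albumin"],
    ["CBC", "Na", "K"],
]
probs = transition_probabilities(paths)
```

Cases whose sequences are well explained by the same transition table follow "similar paths", which is the intuition behind grouping them into one cluster.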
![Page 85: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/85.jpg)
Sequence Clustering Sample
| Customer ID | Sequence Data |
|---|---|
| 1 | (30) (60 90) |
| 2 | (10 20) (30) (40 60 70) |
| 3 | (30 50 70) |
| 4 | (30) (40 70) (90) |
| 5 | (90) |

Sequential patterns: (30) (90) and (30) (40 70)
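The two patterns can be checked against the five sequences by counting how many customers contain each one. The sketch below is an illustration, assuming a pattern counts as frequent when at least two of the five customers support it (the slide does not state the threshold); the function names are made up for this example.

```python
def contains(sequence, pattern):
    """True if `pattern` is a subsequence of `sequence`.

    Both are lists of itemsets. Each pattern itemset must be a subset of a
    distinct sequence itemset, and the matches must appear in the same order.
    """
    pos = 0
    for wanted in pattern:
        while pos < len(sequence) and not set(wanted) <= set(sequence[pos]):
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern itemset must match a later itemset
    return True


def support_count(database, pattern):
    """Number of customers whose sequence contains the pattern."""
    return sum(1 for seq in database.values() if contains(seq, pattern))


# The five customer sequences from the slide.
database = {
    1: [{30}, {60, 90}],
    2: [{10, 20}, {30}, {40, 60, 70}],
    3: [{30, 50, 70}],
    4: [{30}, {40, 70}, {90}],
    5: [{90}],
}

print(support_count(database, [{30}, {90}]))      # 2 (customers 1 and 4)
print(support_count(database, [{30}, {40, 70}]))  # 2 (customers 2 and 4)
```

Customer 3 supports neither pattern: items 30 and 70 occur in the same itemset there, so there is no later itemset for the second pattern element to match.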
![Page 86: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/86.jpg)
Mapping
Customer → Patient
Item → Order
Shopping Cart → Concurrent Orders
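Under this mapping, raw order records become per-patient sequences of itemsets. The sketch below is one possible reading, assuming that "concurrent" means orders sharing the same timestamp; the row layout, patient identifiers, and timestamps are hypothetical.

```python
from collections import defaultdict


def build_sequences(records):
    """Turn (patient_id, timestamp, order_code) rows into per-patient sequences.

    Orders that share a timestamp form one itemset (the "shopping cart"),
    and each patient's itemsets are ordered by time.
    """
    carts = defaultdict(lambda: defaultdict(set))
    for patient_id, timestamp, order_code in records:
        carts[patient_id][timestamp].add(order_code)
    return {
        patient: [carts[patient][t] for t in sorted(carts[patient])]
        for patient in carts
    }


# Hypothetical rows; timestamps are ISO strings so they sort chronologically.
rows = [
    ("p1", "2007-01-03T08:30", "09015CZP"),
    ("p1", "2007-01-02T09:00", "08011CZP"),
    ("p1", "2007-01-02T09:00", "08013CZP"),
    ("p2", "2007-01-05T10:00", "09021CZP"),
]
sequences = build_sequences(rows)
```

The result has the same shape as the customer sequence data in the market-basket setting, so a sequence mining method built for that setting can be applied to it.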
![Page 87: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/87.jpg)
Result
![Page 88: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/88.jpg)
![Page 89: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/89.jpg)
![Page 90: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/90.jpg)
![Page 91: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/91.jpg)
![Page 92: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/92.jpg)
Sequence Sample
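| Order code | Order name | Slide annotation (translated) |
|---|---|---|
| 09029CZP | Bilirubin, total | Bilirubin |
| 08011CZP | CBC & platelet | Platelets |
| 08013CZP | WBC differential count | White blood cells |
| 09015CZP | (Blood) Creatinine | Creatinine |
| 09002CZP | (Blood) UN | |
| 09025CZP | AST (GOT) | Liver function index |
| 09026CZP | ALT (GPT) | Liver function index |
| 09038CZP | Albumin (Blood) | Albumin |
| 09021CZP | Sodium, Na | Sodium |
| 09022CZP | Potassium, K | Potassium |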
![Page 93: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/93.jpg)
The SAGE Guideline Model
Standards-Based Sharable Active Guideline Environment
– Developed by
• Stanford Medical Informatics, IDX Systems Corporation, Apelon Inc., Intermountain Health Care, Mayo Clinic and University of Nebraska Medical Center
![Page 94: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/94.jpg)
The Protégé
![Page 95: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/95.jpg)
Activity Graphs
Aspirin Therapy for diabetic patients
![Page 96: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/96.jpg)
![Page 97: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/97.jpg)
Cooperation Architecture
Diagram labels: Hospital in Mongolia (VM-DB, VM-Web), Hospital in Taiwan (VM-DB, VM-Web), VM Images, Model Feedback
![Page 98: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/98.jpg)
Cloud Architecture
Health Mining Server
Hospital in Taiwan
Hospital in Mongolia
Hospital in Canada
![Page 99: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/99.jpg)
Conclusions
A measure that uses page counts to calculate the semantic similarity between two given concepts.
A semantic-driven keyword matching extractor that helps extract data items from reports.
![Page 100: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/100.jpg)
Conclusions
A highly interactive free-text editor with an auto-complete feature speeds up the composition of discharge summaries.
A data mining framework is proposed.
![Page 101: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/101.jpg)
Future Works
Find out why corpus-based methods correlate more closely with physicians' scores than with experts' scores
Structure the healthcare documents
Prove the robustness of the data mining models
– Variation analysis across hospitals/regions
– Taiwan and Mongolia
– Canada, Taiwan and Mongolia
![Page 102: 頁尾文字 2015/9/18 頁尾文字 1 Data Mining for Healthcare Documents 陳啟煌 臺灣大學計資中心程式組 2011.10.27](https://reader031.vdocuments.pub/reader031/viewer/2022013115/56649e595503460f94b52d43/html5/thumbnails/102.jpg)
Q&A