Chapter 6
Knowledge Acquisition
Expert Systems sstseng 2
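6.1 Introduction
• The main purpose of knowledge acquisition is to extract the expertise of domain experts.
[Figure: Expertise transfer from the expert into the knowledge base, a computerized representation]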
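Advantages of systems that adopt knowledge acquisition techniques:
1. They do not depend on training cases.
2. They can analyze in real time.
3. They can perform consistency checks in real time.
4. They can be integrated with other KE tools.
5. The knowledge base can be generated automatically.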
Review of Previous Research
Substantive knowledge: identifies the current state
  "Am I currently in danger of being attacked?"
Strategic knowledge: decides what to do next
  "Climb to 3000 feet"
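[Figure: Taxonomy of knowledge acquisition systems. Systems are grouped by the kind of knowledge they acquire, substantive knowledge (classification, decision making) or strategic knowledge (control, planning), and by method, the repertory grid approach or other approaches. Systems named: ETS, NeoETS, KSSO, AQUINAS, KITTEN, KNACK, RuleCon, KRITON, TEIRESIAS, MORE, SALT, MOLE, ASK]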
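The Acquisition of Substantive Knowledge
• Repertory Grid-Oriented Methods:
Step 1: Elicit the elements to be classified.
Step 2: Elicit constructs (pairs of opposing attributes) from the expert.
  Each time, three elements are picked, and the expert must give a construct that distinguishes two of them from the third.
Step 3: Fill in a rating from 1 to 5 for each [element, construct] entry of the grid.
Step 4: Generate the implication graph from the repertory grid.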
Step 1: Elicit the elements to be classified.
    Measles, German Measles, Dengue Fever, Chickenpox, Smallpox

Step 2: Elicit constructs from the expert.
                  Measles  German Measles  Dengue Fever  Chickenpox  Smallpox
    high fever       1           5              1                                no high fever
    red                          5              1            1                   not red
    purple                                      2            5          5        no purple
    headache                                                 2          4        no headache
Step 3: Fill in the rating for each [element, construct] entry of the grid.
                  Measles  German Measles  Dengue Fever  Chickenpox  Smallpox
    high fever       1           5              1            2          3        no high fever
    red              1           5              1            1          2        not red
    purple           2           5              2            5          5        no purple
    headache         5           4              2            2          4        no headache

Step 4: Generate the implication graph from the repertory grid.
    [Figure: Implication graph over the constructs headache, purple, high fever, and red]
Rules generated from the grid:
First column:
  IF high_fever and red and purple and (not headache)
  THEN Disease = Measles   CF = MIN(0.8, 1.0, 0.8, 0.8) = 0.8
Second column:
  IF (not high_fever) and (not red) and (not purple) and (not headache)
  THEN Disease = German Measles
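The column-to-rule step can be sketched in a few lines of Python. This is only an illustration: the mapping from a rating to a certainty value (`CERTAINTY`) and the decision to skip the neutral rating 3 are assumptions, since the slides do not state them, and the sketch does not try to reproduce the individual operands of the MIN shown above.

```python
# Ratings from the measles grid: 1 = the construct holds, 5 = its opposite pole holds.
ELEMENTS = ["Measles", "German Measles", "Dengue Fever", "Chickenpox", "Smallpox"]
GRID = {
    "high_fever": [1, 5, 1, 2, 3],
    "red":        [1, 5, 1, 1, 2],
    "purple":     [2, 5, 2, 5, 5],
    "headache":   [5, 4, 2, 2, 4],
}
# Assumed mapping from a rating to a certainty value (not given in the slides).
CERTAINTY = {1: 1.0, 2: 0.8, 4: 0.8, 5: 1.0}


def rule_for(element):
    """Build the rule for one column: a list of conditions and a certainty factor."""
    col = ELEMENTS.index(element)
    conditions, certainties = [], []
    for construct, ratings in GRID.items():
        rating = ratings[col]
        if rating == 3:  # neutral rating: no condition is generated (assumption)
            continue
        conditions.append(construct if rating < 3 else f"not {construct}")
        certainties.append(CERTAINTY[rating])
    # The rule is only as certain as its weakest condition.
    return conditions, min(certainties)


for element in ELEMENTS:
    conditions, cf = rule_for(element)
    print(f"IF {' and '.join(conditions)} THEN Disease = {element} (CF = {cf})")
```

Taking the minimum mirrors the MIN used in the slide: a conjunction is treated as no more certain than its least certain predicate.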
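Benefits of Using Repertory Grids
The elicited knowledge is easy to analyze:
1. Similarity analysis of constructs
2. Similarity analysis of elements
3. Analysis of the relationships between different constructs
4. Detection of missing elements
5. Detection of logical errors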
6.2 Elicitation of Substantive Knowledge
Knowledge Representation

Repertory grid:
                 dog   bird   fish
    4-legs        1     5      5     not 4-legs
    2-legs        5     1      5     not 2-legs
    no-legs       5     5      1     has-legs

Acquisition table:
                 dog   bird   fish
    # of legs    4,2   2,2    0,2

The entry "4,2" reads: a dog has 4 legs, and the expert is very sure.
An acquisition table is a repertory grid of multiple data types:
Boolean: true or false.
Single value: an integer, a real number, or a symbol.
Set of values: a set of integers, real numbers, or symbols.
Range of values: an interval of integers or real numbers.
'X': no relation.
'U': unknown or undecidable.
Ratings:
  2: very likely to be.
  1: maybe.
6.3 Possible Problems with Repertory Grids
The problem of element selection
           E1  E2  E3  E4  E5
    C1      1   5   5   4   2    C'1
    C2      1   5   1   1   5    C'2
    C3      1   5   1   2   2    C'3
    C4      1   5   1   1   4    C'4
Problem of Multi-Level Knowledge and Acquirability
[Figure: Multi-level decision tree in which INPUT DATA nodes feed SUBGOAL nodes, and SUBGOAL nodes together with further INPUT DATA lead to the GOAL]
• The Concept of Acquirability:
The value of a terminal attribute of a decision tree must either be a constant or be acquirable from users. For example:
  IF (leaf-shape = scale) and
     (class = Gymnosperm)
  THEN
     family = Cypress.
Class is not an acquirable attribute.
[Figure: Decision tree with Family at the root and Leaf Shape and Class as its attributes; the level below is marked "? ? ?"]
Domain basis and classification knowledge:
Domain basis:
  Diseases
    Acute Exanthemas
    Other diseases
Classification knowledge (within Acute Exanthemas):
  Measles, German measles, Dengue fever, …
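The Problem of Embedded Meanings
• When a diagnostician describes a cold as having the features
  "headache, fatigue, cough, sneezing, …"
  what he really means is "when a person really has a cold, he may show several of the symptoms above."
• We usually represent this with a rule such as:
  (Headache = yes) and (Feel_tired = yes) and
  (cough = yes) and …
  --> Disease = Catch_cold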
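• The diagnostician's embedded knowledge:
  "Even if one or several of the cold symptoms are absent, the patient may still have a cold."
  This embedded knowledge is ignored by the rule.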
6.4 EMCUD: A New Technique for Eliciting Embedded Knowledge
Knowledge Representation:
  Conventional Repertory Grid or Acquisition Table
  +
  Attribute Ordering Table (AOT)
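Eliciting Embedded Knowledge with the Attribute Ordering Table
• Each AOT entry may be:
  'D': the attribute dominates the goal
  'X': the attribute is irrelevant to the goal
  Integer: the relative order of importance of the attribute to the goal (the smaller the value, the less important)
           Obj1  Obj2  Obj3  Obj4  Obj5
    A1      D     D     2     1     D
    A2      1     1     1     D     D
    A3      X     X     D     1     D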
An Example Repertory Grid (Acquisition Table)
           Obj1          Obj2   Obj3        Obj4    Obj5
    A1     {9,10,12},2   20,2   (13-16],2   17,2    3,2
    A2     YES,1         NO,2   YES,1       YES,2   NO,2
    A3     X             X      4.3,2       2.1,2   6.0,2

Rule generated from the first column:
RULE1: (A1 ∈ {9,10,12}) ∧ (A2 = YES) → GOAL = Obj1
where
  F(confidence) = 1.0 if confidence = 2
                = 0.8 if confidence = 1
and
  Certainty Factor CF = MIN(F(2), F(1)) = 0.8
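As a rough illustration of how a column of the acquisition table turns into a rule with a certainty factor, the sketch below encodes two columns of the table and applies the mapping F and the MIN from the slide. The dictionary layout and the function name `generate_rule` are hypothetical choices made for this example, and the range (13-16] is kept as a plain string.

```python
# F maps the expert's confidence rating to a certainty value, as defined in the slide.
F = {2: 1.0, 1: 0.8}

# Two columns of the acquisition table; each entry is (value, confidence) or "X".
ACQUISITION_TABLE = {
    "Obj1": {"A1": ({9, 10, 12}, 2), "A2": ("YES", 1), "A3": "X"},
    "Obj3": {"A1": ("(13-16]", 2), "A2": ("YES", 1), "A3": (4.3, 2)},
}


def generate_rule(goal):
    """Return the predicates of the rule for `goal` and its certainty factor."""
    predicates, factors = [], []
    for attribute, entry in ACQUISITION_TABLE[goal].items():
        if entry == "X":  # 'X' means the attribute is unrelated to this goal
            continue
        value, confidence = entry
        predicates.append((attribute, value))
        factors.append(F[confidence])
    return predicates, min(factors)


for goal in ACQUISITION_TABLE:
    predicates, cf = generate_rule(goal)
    print(goal, predicates, cf)
```

For Obj1 this gives the two predicates of RULE1 with CF = MIN(1.0, 0.8) = 0.8.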
An Example of Generating the AOT
EMCUD : If A1 ∉ {9,10,12}, is it possible that GOAL = Obj1?
EXPERT: No.  /* This implies that A1 dominates Obj1 and AOT<Obj1,A1> = 'D' */
EMCUD : If A2 ≠ YES, is it possible that GOAL = Obj1?
EXPERT: Yes. /* A2 does not dominate Obj1 */
EMCUD : If A1 > 16 or A1 ≤ 13, is it possible that GOAL = Obj3?
EXPERT: Yes. /* A1 does not dominate Obj3 */
EMCUD : If A2 ≠ YES, is it possible that GOAL = Obj3?
EXPERT: Yes. /* A2 does not dominate Obj3 */
EMCUD : If A3 ≠ 4.3, is it possible that GOAL = Obj3?
EXPERT: No.  /* A3 does dominate Obj3 */
EMCUD : Please rank A1 and A2 in the order of importance to Obj3 by choosing one of the following expressions:
        1) A1 is more important than A2
        2) A1 is less important than A2
        3) A1 is as important as A2
EXPERT: 1  /* A1 is more important to Obj3 than A2, hence AOT<Obj3,A1> = 2 and AOT<Obj3,A2> = 1 */

           Obj1  Obj2  Obj3  Obj4  Obj5
    A1      D     D     2     1     D
    A2      1     1     1     D     D
    A3      X     X     D     1     D
Eliciting Embedded Knowledge
From RULE3, the following embedded rules will be generated by negating the predicates of A1 and A2:
RULE3,1: NOT(13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3)
         → GOAL = Obj3
RULE3,2: (13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3)
         → GOAL = Obj3
RULE3,3: NOT(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3)
         → GOAL = Obj3
Certainty Sequence (CS): represents the degree of certainty degradation.
  CS(RULEi,j) = SUM(AOT<Obji,Ak>)
  for each Ak in the negated predicates of RULEi,j
For example:
  CS(RULE3,3) = AOT<Obj3,A1> + AOT<Obj3,A2>
              = 2 + 1 = 3
The embedded rules generated from RULE3:
RULE3,1: NOT(13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3)
         → GOAL = Obj3   CS = 2
RULE3,2: (13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3)
         → GOAL = Obj3   CS = 1
RULE3,3: NOT(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3)
         → GOAL = Obj3   CS = 3
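The enumeration of embedded rules and their CS values can be sketched as follows, using the Obj3 column of the AOT (A1 = 2, A2 = 1, A3 = 'D'). The function name `embedded_rules` and the tuple representation are illustrative assumptions; only attributes with an integer AOT entry are negated, since 'D' attributes dominate the goal and 'X' attributes are unrelated to it.

```python
from itertools import combinations

# AOT column for Obj3: integers are importance ranks, 'D' means the attribute dominates.
AOT_OBJ3 = {"A1": 2, "A2": 1, "A3": "D"}


def embedded_rules(aot):
    """Enumerate every non-empty set of negatable attributes with its CS value."""
    negatable = [attr for attr, entry in aot.items() if isinstance(entry, int)]
    rules = []
    for size in range(1, len(negatable) + 1):
        for negated in combinations(negatable, size):
            cs = sum(aot[attr] for attr in negated)  # CS = sum of the AOT entries
            rules.append((negated, cs))
    # Sort by CS so the most certain embedded rules come first.
    return sorted(rules, key=lambda rule: rule[1])


for negated, cs in embedded_rules(AOT_OBJ3):
    print(f"negate {', '.join(negated)}: CS = {cs}")
```

The three results correspond to RULE3,2 (CS = 1), RULE3,1 (CS = 2), and RULE3,3 (CS = 3).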
Constructing the Constraint List
1. Sort the embedded rules according to their CS values:
   RULE3,2   CS = 1
   RULE3,1   CS = 2
   RULE3,3   CS = 3
2. A prune-and-search algorithm:
   EMCUD : Do you think RULE3,1 is acceptable?
   Expert: Yes. /* then RULE3,2 is also accepted */
   EMCUD : Do you think RULE3,3 is acceptable?
   Expert: No.  /* then CS = 3 is recorded in the constraint list */
Computing Certainty Factors
  Confirm: 1.0
  Strongly support: 0.8
  Support: 0.6
  May support: 0.4

CFi,j = Upper-Boundi - (CSi,j / MAX(CSi)) * (Upper-Boundi - Lower-Boundi)

MAX(CSi): maximum CS value of the embedded rules generated from RULEi.
Upper-Boundi: certainty factor of the original rule RULEi.
Lower-Boundi: certainty factor of the embedded rule with MAX(CSi) /* the rule with the least confidence */
An Example of Computing Certainty Factors
For the embedded rules derived from RULE3:
1. Upper-Bound = CF(RULE3) = 0.8
2. Because RULE3,3 was not accepted, the rule with the maximum CS value (MAX(CS)) is RULE3,1:
   EMCUD : If RULE3 strongly supports GOAL = Obj3, what about RULE3,1?
   Expert: 1. /* The Lower-Bound = 0.6 */
CF3,1 = 0.8 - (2/2) * (0.8 - 0.6) = 0.6
CF3,2 = 0.8 - (1/2) * (0.8 - 0.6) = 0.7
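The interpolation between the upper and lower bounds is easy to check numerically. The helper name `embedded_cf` below is hypothetical; the numbers are the ones from the RULE3 example.

```python
def embedded_cf(upper_bound, lower_bound, cs, max_cs):
    """Interpolate linearly between the bounds according to the CS value."""
    return upper_bound - (cs / max_cs) * (upper_bound - lower_bound)


# RULE3 example: Upper-Bound = 0.8, Lower-Bound = 0.6, MAX(CS) = 2.
cf_3_1 = embedded_cf(0.8, 0.6, cs=2, max_cs=2)
cf_3_2 = embedded_cf(0.8, 0.6, cs=1, max_cs=2)
print(round(cf_3_1, 2), round(cf_3_2, 2))
```

An embedded rule with CS = 0 would keep the upper bound, and the rule with CS = MAX(CS) lands exactly on the lower bound.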
• The flow of eliciting embedded knowledge:
[Figure: The repertory grid is mapped to the original rules. With the Attribute-Ordering Table, the possible embedded rules are elicited from them. Thresholding against the Constraint List leaves the accepted embedded rules, and a mapping function gives the certainty factors of the embedded rules.]
ACQUISITION TABLE
                Pneumonia
    Cough       YES
    Fatigue     YES
    Headache    YES

                Pneumonia
    Cough       YES,2
    Fatigue     YES,2
    Headache    YES,1
AOT
Conventional repertory grid:
  IF (Cough = YES) & (Fatigue = YES) & (Headache = YES)
  THEN DISEASE = Pneumonia
EMCUD:
  IF (Cough = YES) & (Fatigue <> YES) & (Headache = YES)
  THEN DISEASE = Pneumonia   CF = 0.67
  IF (Cough = YES) & (Fatigue = YES) & (Headache <> YES)
  THEN DISEASE = Pneumonia   CF = 0.73
  IF (Cough = YES) & (Fatigue <> YES) & (Headache <> YES)
  THEN DISEASE = Pneumonia   CF = 0.6
OBJECT CHAIN: A Method for Question Selection
• For a grid with 50 elements (or objects), there are 19600 possible choices of questions to elicit constructs (or attributes).
• Initial repertory grid and the object chains:
  OBJECT CHAIN
    Obj1 --> 2,3,4,5
    Obj2 --> 1,3,4,5
    Obj3 --> 1,2,4,5
    Obj4 --> 1,2,3,5
    Obj5 --> 1,2,3,4
  Repertory grid (no attributes yet):
          Obj1  Obj2  Obj3  Obj4  Obj5
• The expert gives attribute P1 to distinguish Obj1 and Obj2 from Obj3.
  OBJECT CHAIN
    Obj1 --> 2,5
    Obj2 --> 1,5
    Obj3 --> 4
    Obj4 --> 3
    Obj5 --> 1,2
          Obj1  Obj2  Obj3  Obj4  Obj5
    P1     T     T     F     F     T
• The expert gives attribute P2 to distinguish Obj2 and Obj5 from Obj1.
  OBJECT CHAIN
    Obj1 --> NULL
    Obj2 --> 5
    Obj3 --> NULL
    Obj4 --> NULL
    Obj5 --> 2
          Obj1  Obj2  Obj3  Obj4  Obj5
    P1     T     T     F     F     T
    P2     T     F     T     F     F
• The expert gives attribute P3 to distinguish Obj2 from Obj5.
  OBJECT CHAIN
    Obj1 --> NULL
    Obj2 --> NULL
    Obj3 --> NULL
    Obj4 --> NULL
    Obj5 --> NULL
          Obj1  Obj2  Obj3  Obj4  Obj5
    P1     T     T     F     F     T
    P2     T     F     T     F     F
    P3     F     T     T     F     F
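The object chains in this walkthrough can be recomputed from the grid: the chain of an object lists every other object that still has identical values on all attributes given so far. The sketch below is one way to express that, with a hypothetical function name `object_chains`; it also confirms the count of 19600 triads for 50 elements.

```python
from math import comb

NAMES = ["Obj1", "Obj2", "Obj3", "Obj4", "Obj5"]
P1 = ["T", "T", "F", "F", "T"]
P2 = ["T", "F", "T", "F", "F"]
P3 = ["F", "T", "T", "F", "F"]


def object_chains(names, grid):
    """Map each object to the objects not yet distinguished from it by any attribute row."""
    chains = {}
    for i, name in enumerate(names):
        chains[name] = [
            other
            for j, other in enumerate(names)
            if j != i and all(row[i] == row[j] for row in grid)
        ]
    return chains


# Choosing 3 elements out of 50 gives the number of possible triad questions.
print(comb(50, 3))

for grid in ([], [P1], [P1, P2], [P1, P2, P3]):
    print(len(grid), "attributes:", object_chains(NAMES, grid))
```

Elicitation stops when every chain is empty, which here happens after three questions instead of a search through all possible triads.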
• Advantages:
  1. Fewer questions are asked (log2 n to n-1 questions).
  2. All of the objects are classified.
  3. Every question matches the current requirement of classifying objects.
• Disadvantages:
  1. It may force the expert to think in a specific direction.
  2. Some important attributes may be ignored.
Eliciting a Hierarchy of Grids:
• For the expert system of classifying families of plants:
                   Cypress      Pine                Bald Cypress   Magnolia
    Leaf shape     scale        needle              needle         scale
    Needle pat.    X            {random,evenline}   evenline       X
    Class          Gymnosperm   Gymnosperm          Gymnosperm     Magnolia
    Silver band    X            T                   F              X
Goal is FAMILY
• Since class is not acquirable, it becomes the goal of a new grid.
              Gymnosperm   Magnolia   Angiosperm
    type      Tree         Herb       Tree
    flate     F            T          T
Goal is CLASS
• Since type is not acquirable, it becomes the goal of a new grid.
                 Herb    Vine       Tree      Shrub
    stem         green   woody      woody     woody
    position     X       creeping   upright   upright
    one trunk    F       T          T         F
Goal is TYPE
Decision tree of the hierarchy of grids:
FAMILY OF PLANT
  LEAF SHAPE
  NEEDLE PATTERN
  CLASS
    TYPE
      STEM
      POSITION
      ONE TRUNK
    FLATE
6.5 Applications and Performance Evaluation of EMCUD
Application domain: diagnosis of acute exanthemas
Hardware: personal computer
Software: Personal Consultant Easy
The codes of diseases and their translations:
   1 - Measles              8 - Meningococcemia
   2 - German measles       9 - Rocky Mt. spotted fever
   3 - Chickenpox          10 - Typhus fevers
   4 - Smallpox            11 - Infectious mononucleosis
   5 - Scarlet             12 - Enterovirus infections
   6 - Exanthem subitum    13 - Drug eruptions
   7 - Fifth disease       14 - Eczema herpeticum

Table 6.3: Testing results of the old and new prototypes.
    case number      1   2   3   4   5   6   7   8   9  10  11  12  13
    physician       12   3   3   1   2   1  14   2   6   5   5   3   1
    old prototype   12   X   X   X   X   X  14   X   6   X   X   3   1
    new prototype   12   3   3   1   2   1  14   2   6   5   5   3   1

    case number     14  15  16  17  18  19  20  21  22  23  24  25
    physician        6   6  12   5   8   9  14  13   4   1   2  14
    old prototype    X   X  12   5   X   9  14  13   4   1   2  14
    new prototype    6   6  12   5   8   9  14  13   4   1   2  14
6.6 Knowledge Integration from Multiple Experts
To build a reliable expert system, we usually need several experts to work together.
Difficulties:
• Synonyms of elements (possible solutions)
• Synonyms of traits (attributes to classify the solutions)
• Conflicts of ratings
Integrated Knowledge
Use more attributes to make choices from more possible decisions.
Each expert has his or her own way of doing things.
[Figure: The habitual domain of Expert 1 and the habitual domain of Expert 2, combined into the integrated knowledge]
[Figure: Expert 1, Expert 2, …, Expert N, each busy or far away from the knowledge engineer]
It is difficult to have all of the experts work together.
Phase 1 interview: Expert 1, Expert 2, …, Expert N
  → Repertory Grid 1, Repertory Grid 2, …, Repertory Grid N
  → The unions of element sets and construct sets
  → Common Repertory Grid
Phase 2 interview: Expert 1, Expert 2, …, Expert N
  → Eliminate some redundant vocabularies
  → Common Repertory Grid
Phase 3 interview: Expert 1, Expert 2, …, Expert N
  → Rated Common Repertory Grid 1, Rated Common Repertory Grid 2, …, Rated Common Repertory Grid N
  → Knowledge Integration
  → Integrated Repertory Grid
  → Rule Generation
Repertory Grid 1, Repertory Grid 2, …, Repertory Grid N
  → The unions of element sets and construct sets
  → Common Repertory Grid
Phase 2 interview: Expert 1, Expert 2, …, Expert N
  → Eliminate some redundant vocabularies
  → Common Repertory Grid
Phase 3 interview: Expert 1, Expert 2, …, Expert N
  → Rated Common Repertory Grid 1, Rated Common Repertory Grid 2, …, Rated Common Repertory Grid N
  → Knowledge Integration
  → Integrated Repertory Grid
  → Generate AOT (Flat Repertory Grid, AOT)
  → Filled AOT 1, Filled AOT 2, …, Filled AOT N
  → Integration of AOTs
  → Integrated AOT
  → Rule Generation
Knowledge Integration
Constructs: Eye pain, Pupil size, Headache, Cornea, Inflame of Eye, Tears, Redness, Vision, Pupillary light response, Both Side

    Expert 1               Expert 2
    E1  E2  E3  E4  E5     E1  E2  E3  E4  E5
     5   4   1   4   5      5   3   1   5   4
     1   1   5   1   1      1   2   4   1   1
     4   4   5   3   1      3   4   5   2   1
     5   5   5   4   3      5   5   5   3   2
     4   1   1   5   4      5   1   1   5   4
     4   1   1   5   5      4   1   1   4   5
     5   1   1   5   4      5   1   1   5   5
     1   4   5   1   1      1   3   4   1   1
     5   2   2   5   5      5   2   1   5   5
     5   1   4   1   1      5   1   3   1   1
Constructs: Eye pain, Pupil size, Headache, Cornea, Inflame of Eye, Tears, Redness, Vision, Pupillary light response, Both Side

    Expert 3
    E1  E2  E3  E4  E5
     5   4   1   5   5
     1   1   5   1   1
     4   4   5   2   1
     5   5   5   4   2
     5   1   1   5   4
     4   1   1   5   5
     5   1   1   5   5
     1   4   5   1   1
     5   2   1   5   5
     5   1   4   1   1
Results of the first experiment
Differential Diagnosis for Common Causes of Inflamed Eyes.
60 test cases are used to evaluate the knowledge base from
Expert 1, the knowledge base from Expert 2, and the
integrated knowledge base.
    Knowledge base    Ratio of Correct Diagnosis
    Expert 1          0.67
    Expert 2          0.64
    Integrated        0.8
Results of the first experiment
Differential Diagnosis for Common Causes of Inflamed Eyes.
336 test cases are used to evaluate the knowledge base from
Expert 1, the knowledge base from Expert 2, and the
integrated knowledge base.
    Knowledge base    Number of Correct Diagnosis    Ratio of Correct Diagnosis
    Expert 1          255                            0.759
    Expert 2          243                            0.723
    Integrated        306                            0.911
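The ratios in this table follow directly from the counts and the 336 test cases, as a short check confirms. The variable names are illustrative only.

```python
TEST_CASES = 336
CORRECT = {"Expert 1": 255, "Expert 2": 243, "Integrated": 306}

# Ratio of correct diagnoses, rounded to three decimals as in the table.
ratios = {name: round(count / TEST_CASES, 3) for name, count in CORRECT.items()}
print(ratios)
```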
6.7 Machine Learning
Build a computer program that can acquire new knowledge, or improve existing knowledge, from training examples.
Applications:
  Expert Systems
  Cognitive Simulation
  Problem Solving
  Control …
Examples:
  Perceptron [Rosenblatt, 1961]
  Meta-Dendral [Buchanan, Feigenbaum, Sridharan, 1972]
  AM [Lenat, 1976]
  LEX… [Mitchell, Utgoff, Banerji, 1983]
[Michalski, 1983]
Learning
  Rote Learning
  Learning by Instruction
  Learning by Analogy
  Learning by Induction
    Learning from Examples
    Learning from Observation and Discovery
Machine learning is central to A.I.
Learning from Examples.
[Figure: Scatter of training examples labeled with the classes 1, 2, and 3]
Learning Strategies
  Neural Learning (e.g. Perceptron)
  Symbolic Learning
    Incremental Learning (e.g. Version Space)
    Batch Learning (e.g. ID3, PRISM)
Review of Data-Driven Learning Strategies [T.M. Mitchell 1979]
1. Depth-first search
2. Specific-to-general breadth-first search
3. Version space
Example
Instance description: an unordered pair of simple objects, each characterized by three attributes (size, color, shape).
Three instances:
  1. {(Large,Red,Triangle) (Small,Blue,Circle)}      +
  2. {(Large,Blue,Circle) (Small,Red,Triangle)}      +
  3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}   -
Depth-first search
Training instances:
  1. {(Large,Red,Triangle) (Small,Blue,Circle)}
  2. {(Large,Blue,Circle) (Small,Red,Triangle)}
  3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}
After instance 1:
  {(Large,Red,Triangle) (Small,Blue,Circle)}
After instance 2:
  {(Large,Red,Triangle) (Small,Blue,Circle)}
    → {(Large,?,?) (Small,?,?)}
After instance 3:
  {(Large,Red,Triangle) (Small,Blue,Circle)}
    → {(Large,?,?) (Small,?,?)}
    → {(?,Red,Triangle) (?,Blue,Circle)}
Drawbacks:
1. Backtracking is required.
2. Extra cost is needed to maintain consistency with past instances.
Specific-to-general breadth-first search
Training instances:
  1. {(Large,Red,Triangle) (Small,Blue,Circle)}
  2. {(Large,Blue,Circle) (Small,Red,Triangle)}
  3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}
After instance 1:
  {(Large,Red,Triangle) (Small,Blue,Circle)}
After instance 2:
  {(Large,Red,Triangle) (Small,Blue,Circle)}
    → {(Large,?,?) (Small,?,?)}
    → {(?,Red,Triangle) (?,Blue,Circle)}
After instance 3:
  {(Large,Red,Triangle) (Small,Blue,Circle)}
    → {(Large,?,?) (Small,?,?)}
    → {(?,Red,Triangle) (?,Blue,Circle)}
Drawback:
  Past negative instances must be checked to ensure that the revised generalization is not overly general.
[Figure: A Learning Unit that takes Attributes, Matching Predicates, a Hypothesis Space, and Training Instances as input]
Symbolic learning: determine one or several hypotheses, each of which is consistent with the presented training instances.
Example: assume only one attribute exists.
Instance space: terminal nodes
Hypothesis space: all nodes
Predicates: predecessor-successor relations
Positive training instances: sin and cos
Negative training instance: ln
→ Concept: trig

transc
  trig
    sin
    cos
    tan
  explog
    ln
    exp
Terminology
• An Instance Space: a set of instances which can be legally described by a given instance language.
  . Attribute-based Instance Space
  . Structured Instance Space
• A Hypothesis Space: a set of hypotheses which can be legally described by a generalization language.
  . Conjunctive Form (most prevalent form), e.g. Color = red and shape = convex
  . Disjunctive Form, e.g. C1 or C2 or C3 …, where each Ci is a conjunctive form
5 kinds of expressions
Terminology
Predicates: required for testing whether a given instance is contained in the instance set corresponding to a given hypothesis.
• A powerful basis for organizing a search
• Two partial ordering relations exist:
  A is more specific than B, and B is more general than A, if each instance contained in A is also contained in B.
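Incremental Learning for Conjunctive Hypotheses
Idea: a version space can be represented by two boundary sets, S and G.
  S: the set of most specific rules
  G: the set of most general rules
[Figure: The version space lies between the boundary G (more general) and the boundary S (more specific), with + and - marking positive and negative instances]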
Example
  ( sin + )   S: sin    G: transc
  ( ln  - )   S: sin    G: trig
  ( cos + )   S: trig   G: trig
Concept: trig
Lemma: for all a ∈ S and b ∈ G, a is more specific than b.

transc
  trig
    sin
    cos
    tan
  explog
    ln
    exp
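The S and G updates in this example can be traced with a small sketch of the version-space idea restricted to a single tree-structured attribute. It is a simplification written for this hierarchy only: S is kept as one node, all function names are hypothetical, and noisy or inconsistent training data is not handled.

```python
# Child -> parent links of the hierarchy transc / trig, explog / sin, cos, tan, ln, exp.
PARENT = {
    "trig": "transc", "explog": "transc",
    "sin": "trig", "cos": "trig", "tan": "trig",
    "ln": "explog", "exp": "explog",
}
ROOT = "transc"


def path_to_root(node):
    """Return the node followed by all of its ancestors."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def covers(hypothesis, node):
    """A hypothesis covers a node if it is the node itself or one of its ancestors."""
    return hypothesis in path_to_root(node)


def generalize(s, instance):
    """Minimal generalization of s that also covers the positive instance."""
    if s is None:
        return instance
    instance_path = path_to_root(instance)
    return next(node for node in path_to_root(s) if node in instance_path)


def specialize(g, negative, s):
    """Minimal specializations of g that exclude the negative instance and still cover s."""
    kept = []
    for child in (c for c, p in PARENT.items() if p == g):
        if covers(child, negative):
            kept.extend(specialize(child, negative, s))
        elif s is None or covers(child, s):
            kept.append(child)
    return kept


def learn(examples):
    s, g_set, trace = None, [ROOT], []
    for instance, is_positive in examples:
        if is_positive:
            s = generalize(s, instance)
            g_set = [g for g in g_set if covers(g, instance)]
        else:
            g_set = [h for g in g_set
                     for h in (specialize(g, instance, s) if covers(g, instance) else [g])]
        trace.append((s, list(g_set)))
    return trace


trace = learn([("sin", True), ("ln", False), ("cos", True)])
for s, g_set in trace:
    print("S:", s, " G:", g_set)
```

Once S and G meet at the same node, the concept is determined, which is the situation reached after the third instance.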
Training instances:
  1. {(Large,Red,Triangle) (Small,Blue,Circle)}
  2. {(Large,Blue,Circle) (Small,Red,Triangle)}
  3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}
After instance 1:
  S: {(Large,Red,Triangle) (Small,Blue,Circle)}
  G: {(?,?,?) (?,?,?)}
After instance 2:
  S: {(Large,?,?) (Small,?,?)}
     {(?,Red,Triangle) (?,Blue,Circle)}
  G: {(?,?,?) (?,?,?)}
After instance 3:
  S: {(?,Red,Triangle) (?,Blue,Circle)}
  G: {(?,?,Circle) (?,?,?)}
     {(?,Red,?) (?,?,?)}
Checking for contradictions between S and G
• Step 1: Take a generalization s in S and a generalization g in G. Compare s with g; if g is not more general than s, mark both s and g.
• Step 2: Repeat Step 1 until every pair of generalizations in S and G has been processed.
• Step 3: Discard those generalizations in S with |G| marks and those in G with |S| marks.
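The three steps can be sketched as a marking procedure. To keep the example short, a hypothesis here describes a single object as a (size, color, shape) tuple with "?" as a wildcard, which is a simplifying assumption; the slides use pairs of objects. The function names are hypothetical.

```python
def more_general(g, s):
    """g is more general than (or equal to) s if every field of g is '?' or equals s's field."""
    return all(gv == "?" or gv == sv for gv, sv in zip(g, s))


def prune_contradictions(S, G):
    """Apply Steps 1-3: mark incompatible pairs, then drop fully marked hypotheses."""
    s_marks = {s: 0 for s in S}
    g_marks = {g: 0 for g in G}
    for s in S:                      # Steps 1 and 2: compare every pair (s, g)
        for g in G:
            if not more_general(g, s):
                s_marks[s] += 1
                g_marks[g] += 1
    # Step 3: s is dropped if no g covers it; g is dropped if it covers no s.
    kept_S = [s for s in S if s_marks[s] < len(G)]
    kept_G = [g for g in G if g_marks[g] < len(S)]
    return kept_S, kept_G


S = [("Large", "Red", "Triangle"), ("Small", "Blue", "?")]
G = [("?", "Red", "?"), ("Large", "?", "?")]
kept_S, kept_G = prune_contradictions(S, G)
print(kept_S, kept_G)
```

In this made-up example the second member of S is covered by no member of G, so it collects |G| marks and is discarded.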
Advantage:
  Past instances need not be checked, which is the reason for applying it in our parallel learning algorithm.
ID3
[Figure: Decision tree with Attribute 1 at the root; its branches Value 1 and Value 2 lead to Attribute 2 and Attribute 2']
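Entropy: for each attribute, calculate the entropy

E = \sum_{i=1}^{m} \left( -n_i^{+} \log_2 \frac{n_i^{+}}{n_i^{+} + n_i^{-}} - n_i^{-} \log_2 \frac{n_i^{-}}{n_i^{+} + n_i^{-}} \right)

where the attribute has m values and n_i^{+} and n_i^{-} are the numbers of positive and negative examples with the i-th value.
Among all the feasible attributes, the one which causes the minimum entropy will be chosen as the next attribute.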
Example
    COLOR   SIZE     COAT     CLASS
    black   large    shaggy   +
    brown   large    smooth   +
    brown   medium   shaggy   -
    black   small    shaggy   -
    brown   medium   smooth   +
    black   large    smooth   +
    brown   small    shaggy   +
    brown   small    smooth   -
    brown   large    shaggy   +
    black   medium   shaggy   -
    black   medium   smooth   -
    black   small    smooth   -
• For attribute COLOR: black n+ = 2, n- = 4
• brown n+ = 4, n- = 2
• E(color) = -2 log2(2/6) - 4 log2(4/6) - 4 log2(4/6) - 2 log2(2/6)
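The same calculation can be run for all three attributes of the example. The sketch below uses the entropy measure as written in these slides (counts multiplied by logarithms, without normalizing by the number of examples) and takes 0 * log2(0) as 0; the function and variable names are illustrative.

```python
from math import log2

# The twelve training examples of the slide, one list per column.
COLOR = "black brown brown black brown black brown brown brown black black black".split()
SIZE = "large large medium small medium large small small large medium medium small".split()
COAT = "shaggy smooth shaggy shaggy smooth smooth shaggy smooth shaggy shaggy smooth smooth".split()
CLASS = list("++--+++-+---")


def entropy(values, labels):
    """Sum -n+ log2(n+/n) - n- log2(n-/n) over the values of one attribute."""
    total = 0.0
    for value in set(values):
        pos = sum(1 for v, c in zip(values, labels) if v == value and c == "+")
        neg = sum(1 for v, c in zip(values, labels) if v == value and c == "-")
        for count in (pos, neg):
            if count:  # a zero count contributes nothing
                total -= count * log2(count / (pos + neg))
    return total


scores = {name: entropy(values, CLASS)
          for name, values in (("COLOR", COLOR), ("SIZE", SIZE), ("COAT", COAT))}
best = min(scores, key=scores.get)
print(scores, "->", best)
```

SIZE has the lowest entropy because all four large examples are positive, so it would be chosen as the first attribute.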
[Figure: Decision tree with SIZE at the root. The large branch holds ++++; the small and medium branches each hold --+- and are split further on COAT (shaggy, smooth) and COLOR (brown, black)]
PRISM [Cendrowska, 1987]
Attribute-Value Pair (e.g. A=1, A=2, A=3, …) instead of Attribute
Minimize Number of Rules and Number of Attributes
Information Gain, e.g.:

\log_2 \frac{P(\text{Class 1} \mid A = 1)}{P(\text{Class 1})}
    COLOR = black    2/6
    COLOR = brown    4/6
    SIZE = small     1/4
    SIZE = medium    1/4
    SIZE = large     4/4
    COAT = shaggy    3/6
    COAT = smooth    3/6
SIZE = large is chosen
SIZE = large → Positive Class
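The selection of the first attribute-value pair can be sketched as follows. The code only covers this first step for the positive class, not the full PRISM rule-induction loop, and the names are illustrative.

```python
from math import log2

ATTRIBUTES = {
    "COLOR": "black brown brown black brown black brown brown brown black black black".split(),
    "SIZE": "large large medium small medium large small small large medium medium small".split(),
    "COAT": "shaggy smooth shaggy shaggy smooth smooth shaggy smooth shaggy shaggy smooth smooth".split(),
}
CLASS = list("++--+++-+---")

# For every attribute-value pair, count (positive examples, all examples) with that value.
counts = {}
for attribute, values in ATTRIBUTES.items():
    for value, label in zip(values, CLASS):
        pos, total = counts.get((attribute, value), (0, 0))
        counts[(attribute, value)] = (pos + (label == "+"), total + 1)

# PRISM picks the pair with the highest probability of the target class.
best = max(counts, key=lambda pair: counts[pair][0] / counts[pair][1])

# Information gain of the chosen pair: log2(P(+ | pair) / P(+)).
prior = CLASS.count("+") / len(CLASS)
gain = log2((counts[best][0] / counts[best][1]) / prior)
print(best, counts[best], gain)
```

Because every example with SIZE = large is positive, the rule "SIZE = large → Positive Class" needs no further conditions.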
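Exercise
• Using animal classification as an example, build a repertory grid and generate the corresponding inference rules.
• Analyze whether any embedded meanings are missing from the generated animal-classification inference rules.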