th2006_project_3_2

Post on 22-Jul-2015


Course: Artificial Intelligence

Class: TH2006/01,02

Project 3

MACHINE LEARNING ALGORITHMS

Content
Build data classifiers based on machine learning algorithms: decision trees and Naïve Bayes.

Objectives
Students build a decision tree and a Naïve Bayes classifier from a specific training data set, and become familiar with the open-source data mining tool Weka. The assignment has two parts. The first part covers the theory of decision trees and Naïve Bayes. The second part consists of practical exercises with the Weka data mining tool (http://www.cs.waikato.ac.nz/ml/weka/).

Lab instructors: Nguyn Ngc Tho, Nguyn Hi Minh, V Thanh Hng, Quch Kh Gia

Part I: Decision Trees and Naïve Bayes (6 points)

The goal of this part is to become familiar with two machine learning methods: decision trees and probabilistic learning (Naïve Bayes). Students work with the simple data set below to decide whether or not to go skiing. The decision is based on the attributes snow (snowfall condition), weather (weather condition), season (whether or not it is peak season), and physical condition.

snow      weather   season   physical condition   go skiing
sticky    foggy     low      rested               no
fresh     sunny     low      injured              no
fresh     sunny     low      rested               yes
fresh     sunny     high     rested               yes
fresh     sunny     mid      rested               yes
frosted   windy     high     tired                no
sticky    sunny     low      rested               yes
frosted   foggy     mid      rested               no
fresh     windy     low      rested               yes
fresh     windy     low      rested               yes
fresh     foggy     low      rested               yes
fresh     foggy     low      rested               yes
sticky    sunny     mid      rested               yes
frosted   foggy     low      injured              no

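a. Build a decision tree from the data in the table above by computing the average entropy for every candidate node. Present all calculation steps and the resulting decision tree in the report.

Hint: recall the formulas for entropy and average entropy:

$$\text{Entropy} = -p \log_2 p - (1 - p) \log_2 (1 - p)$$

where p is the proportion of positive samples in the training set, and

$$\text{Average Entropy} = \sum_{v \in \text{Value}(A)} p_v \, H(A_v)$$

Here A is the attribute being tested, p_v is the fraction of samples for which A takes the value v, and H(A_v) is the entropy of that subset of samples.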
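The two formulas in the hint can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `entropy` and `average_entropy` and the four toy rows are hypothetical and are not the assignment data.

```python
import math
from collections import defaultdict


def entropy(p):
    """Binary entropy in bits, where p is the proportion of positive samples."""
    if p in (0.0, 1.0):
        return 0.0  # a pure node has no disorder; this also avoids log2(0)
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def average_entropy(rows, attribute, target, positive="yes"):
    """Weighted entropy of the subsets obtained by splitting rows on attribute."""
    subsets = defaultdict(list)
    for row in rows:
        subsets[row[attribute]].append(row[target])
    total = len(rows)
    result = 0.0
    for labels in subsets.values():
        p_v = len(labels) / total                     # share of samples with A = v
        p_pos = labels.count(positive) / len(labels)  # share of positives within A_v
        result += p_v * entropy(p_pos)
    return result


# Hypothetical toy rows, not the assignment data.
toy_rows = [
    {"weather": "sunny", "go": "yes"},
    {"weather": "sunny", "go": "yes"},
    {"weather": "foggy", "go": "yes"},
    {"weather": "foggy", "go": "no"},
]
print(entropy(0.5))                                # 1.0
print(average_entropy(toy_rows, "weather", "go"))  # 0.5
```

In the toy rows the sunny subset is pure (entropy 0) and the foggy subset is evenly split (entropy 1), so the weighted average is 0.5. The attribute with the lowest average entropy would be the one chosen for the node.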

b. Apply the Naïve Bayes probabilistic learning method to the data table above. For each attribute, compute the following statistics and present them in a table:
- The number of samples containing each attribute value, for each class.
- The probability of each attribute value, for each class.
- The probability values after Laplace smoothing.

Hint: the conditional probability formula for the case of two classes is

$$P(C_1 \mid X) = \frac{P(X \mid C_1)\, P(C_1)}{P(X \mid C_1)\, P(C_1) + P(X \mid C_2)\, P(C_2)}$$

where C1 and C2 are the two classes and X is the set of observed facts. Example: consider the following simple data table.

snow      weather   go skiing
fresh     foggy     no
sticky    windy     no
sticky    sunny     yes
fresh     windy     yes
fresh     foggy     yes
frosted   sunny     no

Computing the statistics required in question b gives the following result tables:

Snow        Count         Probability     Laplace
            yes   no      yes    no       yes    no
fresh       2     1       2/3    1/3      3/6    2/6
frosted     0     1       0/3    1/3      1/6    2/6
sticky      1     1       1/3    1/3      2/6    2/6

Weather     Count         Probability     Laplace
            yes   no      yes    no       yes    no
foggy       1     1       1/3    1/3      2/6    2/6
sunny       1     1       1/3    1/3      2/6    2/6
windy       1     1       1/3    1/3      2/6    2/6

Skiing      Count         Probability     Laplace
            yes   no      yes    no       yes    no
            3     3       3/6    3/6      4/8    4/8
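The Laplace columns can be reproduced with a short Python sketch. It assumes the usual add-one rule, (count + 1) / (class size + number of distinct values), which matches the fractions in the tables; the helper names `laplace_table` and `laplace_priors` are hypothetical, and the six rows are the example rows as read from the example table.

```python
from collections import Counter
from fractions import Fraction

# The six example rows as (snow, weather, go skiing).
rows = [
    ("fresh", "foggy", "no"),
    ("sticky", "windy", "no"),
    ("sticky", "sunny", "yes"),
    ("fresh", "windy", "yes"),
    ("fresh", "foggy", "yes"),
    ("frosted", "sunny", "no"),
]


def laplace_table(rows, column):
    """Return {(value, label): smoothed P(value | label)} for one attribute column."""
    values = sorted({row[column] for row in rows})
    class_sizes = Counter(row[-1] for row in rows)
    joint = Counter((row[column], row[-1]) for row in rows)
    return {
        (value, label): Fraction(joint[(value, label)] + 1,
                                 class_sizes[label] + len(values))
        for value in values
        for label in class_sizes
    }


def laplace_priors(rows):
    """Return {label: smoothed P(label)}."""
    class_sizes = Counter(row[-1] for row in rows)
    return {
        label: Fraction(count + 1, len(rows) + len(class_sizes))
        for label, count in class_sizes.items()
    }


snow = laplace_table(rows, 0)
print(snow[("fresh", "yes")])      # 1/2, the reduced form of 3/6
print(snow[("frosted", "yes")])    # 1/6
print(laplace_priors(rows)["no"])  # 1/2, the reduced form of 4/8
```

`Fraction` keeps the arithmetic exact but prints values in lowest terms, so 3/6 appears as 1/2.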

For a new sample with snow = fresh and weather = sunny, what is the probability of not going skiing?

P(C1) = P(go skiing = no) = 4/8
P(C2) = P(go skiing = yes) = 4/8
P(X|C1) = P(snow = fresh and weather = sunny | go skiing = no) = 2/6 * 2/6 = 1/9
P(X|C2) = P(snow = fresh and weather = sunny | go skiing = yes) = 3/6 * 2/6 = 1/6

So the probability of not going skiing is

$$P(C_1 \mid X) = \frac{P(X \mid C_1)\, P(C_1)}{P(X \mid C_1)\, P(C_1) + P(X \mid C_2)\, P(C_2)} = \frac{1/9 \cdot 4/8}{1/9 \cdot 4/8 + 1/6 \cdot 4/8} = \frac{2}{5} = 40\%$$
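The arithmetic of this worked example can be checked with exact fractions. A minimal sketch; the function name `posterior_c1` is hypothetical, and the inputs are the Laplace-smoothed values quoted in the example.

```python
from fractions import Fraction


def posterior_c1(likelihood_c1, prior_c1, likelihood_c2, prior_c2):
    """P(C1 | X) for two classes, by Bayes' rule."""
    numerator = likelihood_c1 * prior_c1
    return numerator / (numerator + likelihood_c2 * prior_c2)


# Laplace-smoothed values from the worked example (C1 = no, C2 = yes).
p_x_given_no = Fraction(2, 6) * Fraction(2, 6)   # P(fresh | no) * P(sunny | no) = 1/9
p_x_given_yes = Fraction(3, 6) * Fraction(2, 6)  # P(fresh | yes) * P(sunny | yes) = 1/6
p_no = posterior_c1(p_x_given_no, Fraction(4, 8), p_x_given_yes, Fraction(4, 8))
print(p_no, float(p_no))  # 2/5 0.4
```

Because the two priors are equal here, they cancel, and the posterior reduces to (1/9) / (1/9 + 1/6).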

c. Decide whether a case with the attribute values in the table below leads to going skiing or not, using both the decision tree and the Naïve Bayes classifier. Present each reasoning step in detail (for example, which node of the decision tree is used, or how the probability is computed in Naïve Bayes).

snow      weather   season   physical condition   go skiing
sticky    windy     high     tired                ?

d. Do the same as in question c, but note that this sample has a missing value.

snow        weather   season   physical condition   go skiing
(missing)   windy     mid      injured              ?


Part II: Practice with Weka (4 points)

The tool used in this part is WEKA, a software package written in Java. It is intuitive and easy to use. Download the software (the GUI version is recommended) from www.cs.waikato.ac.nz/ml/weka/index.html and install it following the instructions. For this assignment, students only need to know how to use the WEKA Explorer (see the Explorer user guide in the installation directory and on the WEKA website).

Create an ARFF file from the skiing data table in Part I for use in WEKA.

a. Use the data set just created to experiment with the two algorithms, decision tree and Naïve Bayes, in WEKA.
- Decision tree experiment: choose the ID3 algorithm and redraw the resulting decision tree.
- Naïve Bayes experiment: choose the NaiveBayesSimple algorithm and record the probability values that WEKA computes.
- Are the classifiers built by WEKA the same as the results computed from theory? If not, why?
- For both experiments, choose Cross-validation as the test mode with the default number of folds (10) and report: the accuracy of the classifier; the number of samples classified correctly, classified incorrectly, and left unclassified; and an interpretation of the data distribution based on the confusion matrix.

b. Classify the two new samples from questions c and d of Part I using both methods, decision tree and Naïve Bayes.
- Are the results predicted by WEKA the same as the results computed from theory? If not, why?
- With the ID3 method, can you predict the new samples? Why? Propose a solution, if any.

Hint: to predict new samples, create a file test.arff with the same structure as the training set and put a ? (missing value) wherever a value is to be predicted. When running, choose More Options, then Output predictions, so that WEKA displays the prediction results.
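One possible way to produce test.arff is to generate it from the two samples of questions c and d. The sketch below is only an assumption about how the file could be laid out: the relation name `skiing_test`, the underscore attribute names `physical_condition` and `go_skiing`, and the helper `to_arff` are hypothetical. The nominal values are those that appear in the Part I table, and the training file would need the same header.

```python
# Attribute names and their nominal values. Underscores are used so the names
# need no quoting in ARFF; this naming is an assumption, not from the handout.
ATTRIBUTES = [
    ("snow", ["sticky", "fresh", "frosted"]),
    ("weather", ["foggy", "sunny", "windy"]),
    ("season", ["low", "mid", "high"]),
    ("physical_condition", ["rested", "injured", "tired"]),
    ("go_skiing", ["yes", "no"]),
]


def to_arff(relation, rows):
    """Render rows as ARFF text; None becomes '?', the ARFF missing-value marker."""
    lines = [f"@relation {relation}", ""]
    for name, values in ATTRIBUTES:
        lines.append(f"@attribute {name} {{{','.join(values)}}}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join("?" if value is None else value for value in row))
    return "\n".join(lines) + "\n"


# The two samples from questions c and d; the class (and snow in d) is unknown.
test_rows = [
    ("sticky", "windy", "high", "tired", None),
    (None, "windy", "mid", "injured", None),
]
arff_text = to_arff("skiing_test", test_rows)
print(arff_text)
```

Writing `arff_text` to a file named test.arff then gives a file that can be supplied as the test set in the Explorer.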


DEADLINE AND SUBMISSION REQUIREMENTS

Duration: 2 weeks (02/12/2008 - 15/12/2008)

Submission rules: submit via Moodle by the specified deadline. The submission is structured as follows:
- Document: presents the answers to the questions in the assignment.
