Transcript
  • Statistical machine translation

    By: Wiki Pedia

    Statistical machine translation (SMT) is an approach to machine translation in which translations are generated from statistical models whose parameters are derived from the analysis of bilingual sentence pairs. Statistical approaches contrast with rule-based approaches to machine translation as well as with example-based machine translation.

    The first ideas of statistical machine translation were introduced by Warren Weaver in 1949, including the idea of applying Claude Shannon's information theory. Statistical machine translation was reintroduced in 1991 by researchers working at IBM's Thomas J. Watson Research Center, and it has contributed significantly to the revival of interest in machine translation in recent years. Today it is the most widely studied machine translation method.

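    Basis

    The idea behind statistical machine translation comes from information theory. A document is translated according to the probability distribution p(e | f), where e is a string in the target language (for example, Vietnamese) that translates f, a string in the source language (for example, English).

    The problem of modeling the probability distribution p(e | f) has been approached in several ways. One intuitive approach is to apply Bayes' theorem, that is:

    $$p(e \mid f) \propto p(f \mid e)\, p(e)$$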

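    where the translation model p(f | e) is the probability that the source string f is the translation of the target string e, and the language model p(e) is the probability that the string e actually occurs in the target language. This decomposition splits the problem into two subproblems. The best translation is found by picking the candidate with the highest probability:

    $$\tilde{e} = \arg\max_{e \in e^*} p(e \mid f) = \arg\max_{e \in e^*} p(f \mid e)\, p(e)$$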
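
    The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate list, the two probability tables and the function name best_translation are hypothetical, and the numbers are invented rather than estimated from any corpus.

```python
import math

def best_translation(f, candidates, tm, lm):
    """Return the candidate e that maximizes p(f | e) * p(e), in log space."""
    def score(e):
        p_tm = tm.get((f, e), 0.0)   # translation model p(f | e)
        p_lm = lm.get(e, 0.0)        # language model p(e)
        if p_tm == 0.0 or p_lm == 0.0:
            return float("-inf")     # impossible under one of the two models
        return math.log(p_tm) + math.log(p_lm)
    return max(candidates, key=score)

# Toy, hand-made probabilities (assumptions for illustration only).
toy_tm = {
    ("the house", "ngôi nhà"): 0.5,
    ("the house", "nhà ngôi"): 0.5,
    ("the house", "cái nhà"): 0.2,
}
toy_lm = {"ngôi nhà": 0.04, "nhà ngôi": 0.0001, "cái nhà": 0.01}

print(best_translation("the house", list(toy_lm), toy_tm, toy_lm))
```

    Note how the translation model alone cannot separate the first two candidates; the language model breaks the tie in favor of the fluent word order.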

    To apply this method exhaustively, one has to search over all strings e* of the target language.

  • The search space is enormous, and performing the search efficiently is the job of a machine translation decoder, which uses a number of techniques to limit the search space while keeping translation quality acceptable. The same trade-off between quality and computation time is also found in speech recognition.

    Because a translation system cannot store every source string together with its translations, a document is usually translated one sentence at a time, but even storing all sentences is not feasible. The language model is usually approximated by an n-gram model, and a similar approach is applied to the translation model, with the added complication that sentence length and word order differ between languages.
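
    As a rough illustration of the n-gram approximation, the sketch below estimates a bigram language model with add-one smoothing. The function names, the sentence boundary markers and the choice of smoothing are assumptions made for this example; practical systems use larger n and more refined smoothing.

```python
from collections import Counter
import math

def train_bigram(sentences):
    """Count bigrams and their left contexts over whitespace-tokenized sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), {"</s>"}
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
        vocab.update(s.split())
    return contexts, bigrams, vocab

def log_prob(sentence, model):
    """log p(e) under the bigram model, with add-one (Laplace) smoothing."""
    contexts, bigrams, vocab = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    v = len(vocab)
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + v))
        for a, b in zip(tokens, tokens[1:])
    )

model = train_bigram(["the cat sat", "the dog sat"])
print(log_prob("the cat sat", model), log_prob("sat cat the", model))
```

    A sentence whose word order was seen in training receives a higher score than a scrambled version of the same words.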

    Early statistical translation models were word-based (IBM Models 1-5, the hidden Markov model of Stephan Vogel, and Model 6 of Franz-Joseph Och), but significant advances were made once phrase-based models were introduced. More recent work incorporates syntax or quasi-syntactic structures to improve translation quality.

    Word-based statistical machine translation

    In word-based statistical machine translation, the basic unit of translation is a word in a natural language. One example of a word-based statistical machine translation system is the free software Giza++ (GPL license), which is used to train the IBM translation models, the HMM model and Model 6.
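
    To give a feel for what training a word-based model involves, here is a minimal sketch of expectation-maximization for IBM Model 1, the simplest of the models mentioned above. It is not how Giza++ is implemented: the function name, the three-sentence toy corpus and the omission of the NULL word are simplifying assumptions.

```python
from collections import defaultdict

def train_ibm_model1(pairs, iterations=10):
    """pairs: list of (source_tokens, target_tokens).
    Returns t with t[(f, e)] approximating p(f | e). The NULL word is omitted."""
    src_vocab = {f for fs, _ in pairs for f in fs}
    t = defaultdict(lambda: 1.0 / len(src_vocab))  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e being aligned at all
        for fs, es in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)  # normalizer over target words
                for e in es:
                    c = t[(f, e)] / z           # E-step: posterior of the link
                    count[(f, e)] += c
                    total[e] += c
        # M-step: re-estimate the lexical translation probabilities
        t = defaultdict(float, {(f, e): count[(f, e)] / total[e] for (f, e) in count})
    return t

toy_pairs = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]
t = train_ibm_model1(toy_pairs)
print(round(t[("haus", "house")], 3), round(t[("das", "house")], 3))
```

    Even though "das" and "haus" co-occur with "house" equally often, the other sentence pairs let the model learn that "das" is better explained by "the", so probability mass shifts toward the pair ("haus", "house").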

    Word-based statistical machine translation is no longer widely used today; it has been replaced by phrase-based statistical machine translation. Most phrase-based systems still use Giza++ to align the corpus, and the resulting alignments are used to extract bilingual phrase pairs. Because of the strengths of Giza++, there are currently several efforts to apply online distributed computing to this software.

    Phrase-based statistical machine translation

    Phrase-based statistical machine translation aims to reduce the limitations of word-based statistical machine translation by translating phrases, where the lengths of the source phrase and the target phrase may differ. The phrases in this technique are usually not phrases in the linguistic sense; they are word sequences found by statistical methods in the sentence pairs. Using phrases in the linguistic sense (that is, syntactically motivated phrases, see syntactic categories) reduces translation quality with this method.
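
    One common way to obtain such statistical phrases is to extract every phrase pair that is consistent with a word alignment. The sketch below is a simplified version of that idea, assuming the alignment is already given as a set of index pairs; the function name, the max_len limit and the toy sentence pair are hypothetical, and the usual extension over unaligned words is left out.

```python
def extract_phrases(src, tgt, alignment, max_len=3):
    """alignment: set of (i, j) meaning src[i] is aligned to tgt[j].
    Returns the set of (source phrase, target phrase) pairs consistent with it."""
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to the source span [i1, i2].
            js = [j for (i, j) in alignment if i1 <= i <= i2]
            if not js:
                continue
            j1, j2 = min(js), max(js)
            if j2 - j1 + 1 > max_len:
                continue
            # Consistency: no word in the target span may link outside the source span.
            if any(j1 <= j <= j2 and not (i1 <= i <= i2) for (i, j) in alignment):
                continue
            pairs.add((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs

# Toy example with a hand-written alignment (an assumption for illustration).
src = ["the", "red", "house"]
tgt = ["ngôi", "nhà", "đỏ"]
alignment = {(0, 0), (1, 2), (2, 1)}
for pair in sorted(extract_phrases(src, tgt, alignment)):
    print(pair)
```

    The pair ("red house", "nhà đỏ") is extracted even though the two words swap order, while "the red" yields nothing because its target span would also cover "nhà", which is aligned to "house".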

  • Syntax-based statistical machine translation

    Syntax-based statistical machine translation is based on the idea of translating syntactic units (parse trees of sentences) rather than single words or phrases (as in phrase-based statistical machine translation). The idea is an old one, but its statistical version only took shape once strong stochastic parsers became available in the 1990s.

    Benefits

    The most frequently cited benefits of statistical machine translation over the traditional approach are:

    Better use of resources

    There is a great deal of natural language data available in machine-readable form. In general, SMT systems are not tailored to any specific language pair. Rule-based machine translation requires the manual construction of linguistic rules, which can be costly and which often do not generalize to other languages.

    More natural translations

    Challenges

    Sentence alignment

    Statistical machine translation relies on bilingual sentence pairs, yet a single sentence in one language may be translated into several sentences in the other language, and vice versa. Sentence alignment can be performed with the Gale-Church alignment algorithm.
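
    The sketch below illustrates the general idea of length-based sentence alignment with dynamic programming. It is a deliberate simplification and not the Gale-Church algorithm itself: the real algorithm scores each grouping with a probabilistic model of length ratios, whereas this example uses the absolute difference in character length plus fixed penalties. The function name and the penalty values are assumptions.

```python
def align_by_length(src, tgt):
    """Align two lists of sentences using character lengths only.
    Returns a list of (source indices, target indices) groups."""
    # Allowed groupings (source sentences, target sentences) with a fixed penalty each.
    beads = {(1, 1): 0, (1, 0): 50, (0, 1): 50, (2, 1): 10, (1, 2): 10}
    inf = float("inf")
    n, m = len(src), len(tgt)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in beads.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                len_s = sum(len(s) for s in src[i:ni])
                len_t = sum(len(s) for s in tgt[j:nj])
                c = cost[i][j] + abs(len_s - len_t) + penalty
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Follow the back-pointers from the end to recover the cheapest path.
    groups, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        groups.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    return groups[::-1]
```

    With three source sentences of 20, 10 and 12 characters and two target sentences of 21 and 23 characters, the cheapest path pairs the first sentences one to one and merges the last two source sentences into the second target sentence.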

    Idioms

    Depending on the corpus used, idioms may not be translated idiomatically, that is, according to their figurative meaning. For example, when the Canadian Hansard is used as the bilingual corpus, "hear" is almost always translated as "Bravo!", because in parliamentary proceedings "Hear, hear!" is rendered as "Bravo!".

    Differences in word order

    Word order differs between languages. Some languages can be classified by naming the typical order of subject (S), verb (V) and object (O) in a sentence, so that one can speak of, for instance, SVO or VSO languages.

  • There are further differences in word order as well, for example when auxiliary grammatical elements are present, or when the word order of a question differs from that of a statement.

    To deal with word order, several translations corresponding to different word orders can be generated. These candidates are then ranked by their probability with the help of the language model, and the translation with the highest probability is selected.
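
    A brute-force sketch of this generate-and-rank idea follows. It enumerates every permutation, which is only workable for very short inputs; real decoders restrict reordering instead. The function name, the toy bigram log-probabilities and the fixed score for unseen bigrams are assumptions made for illustration.

```python
from itertools import permutations

def rank_orderings(words, bigram_logprob, unseen=-10.0):
    """Rank all orderings of `words` by a bigram language model score, best first."""
    def score(seq):
        tokens = ("<s>",) + seq + ("</s>",)
        # Bigrams missing from the table get a fixed low log-probability.
        return sum(bigram_logprob.get(pair, unseen) for pair in zip(tokens, tokens[1:]))
    return sorted(permutations(words), key=score, reverse=True)

# Toy language model scores (invented values).
toy_bigrams = {
    ("<s>", "ngôi"): -0.5,
    ("ngôi", "nhà"): -0.3,
    ("nhà", "đỏ"): -0.7,
    ("đỏ", "</s>"): -0.4,
}
best = rank_orderings(["đỏ", "ngôi", "nhà"], toy_bigrams)[0]
print(" ".join(best))
```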

    Out-of-vocabulary words

    Statistical machine translation systems store phrases independently, with no relationship between them. Phrases that do not occur in the training data cannot be translated. This problem arises when training data is scarce or when the system is used in a new domain.
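
    The effect can be seen in a small sketch that looks up each token in a translation table and reports the ones it cannot find. Copying unknown tokens through unchanged is one common fallback and is an assumption of this example, as are the function name and the toy table.

```python
def translate_with_passthrough(tokens, phrase_table):
    """Translate token by token; unknown tokens are copied through and reported."""
    output, oov = [], []
    for tok in tokens:
        if tok in phrase_table:
            output.append(phrase_table[tok])
        else:
            output.append(tok)  # fallback: leave the unknown token untranslated
            oov.append(tok)
    return output, oov

toy_table = {"the": "ngôi", "house": "nhà"}  # invented entries
print(translate_with_passthrough(["the", "blockchain", "house"], toy_table))
```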
