Speech Processing Report - ĐHBK

Upload: minh-nguyen

Post on 20-Jul-2015


DESCRIPTION

2012

TRANSCRIPT

Major assignment, Group 9. Instructor: Trịnh Văn Loan. Course: Speech Processing.

MAJOR ASSIGNMENT FOR THE SPEECH PROCESSING COURSE

Topic: Energy, amplitude, and zero-crossing rate, plotted on the same time axis.

Requirements:
+ Read a speech signal from a .wav file.
+ Display the speech signal.
+ Display the energy, amplitude, and zero-crossing rate on the same time axis.

Contents
I. GENERAL THEORY
  1. Basic characteristics of speech
  2. Wave file structure
    2.1. RIFF file
    2.2. Wave file structure
  3. Overview of the short-time energy function, amplitude, and zero-crossing rate
    3.1. Short-time energy function and amplitude
    3.2. Zero-crossing rate
II. PROGRAM
  1. Functional analysis and design of the program
  2. The program
III. CONCLUSION
IV. REFERENCES

I. GENERAL THEORY

1. Basic characteristics of speech

The speech people use to communicate every day is, physically, a sound wave propagating through the air. Sound waves in air are longitudinal waves produced by the compression and expansion of the air. A speech signal varies continuously in time. The range of frequencies the human ear can hear is quite wide, from 20 to 20,000 Hz, and is determined by human physiology.
The speech signal is highly redundant, which matters because the environment is noisy. In practice speech is heard quite clearly in the band from 300 to 3400 Hz, which is also the band used in telephony. A speech signal is formed from a sequence of consecutive phonemes. These phonemes and the transitions between them are regarded as symbols that carry information. The arrangement of the phonemes is governed by the rules of the language, so any mathematical model that is applied must be closely tied to the study of those rules.

Speech processing is the field concerned with processing the information contained in speech signals for the purposes of transmission, storage, synthesis, and recognition. It is an active research area with many applications. The research requires knowledge from many increasingly diverse fields, from phonetics and linguistics to signal processing.

2. Wave file structure

Speech is an analog signal. To store it in a computer, which represents data as sequences of 0s and 1s, the analog signal must be sampled and quantized into a digital signal. The usual method for sampling and quantizing audio is PCM. This method samples the sound at rates from about 11.025 kHz to 44.1 kHz. Each sample is quantized with 8 bits, giving sample values from -128 to 127, or with 16 bits, giving sample values from -32768 to 32767. Compared with 8-bit quantization, 16-bit quantization stores the sound more faithfully, but it doubles the number of bytes stored.

2.1. RIFF file

The Wave file structure belongs to a class of files used by the Windows multimedia functions: RIFF files. RIFF stands for Resource Interchange File Format. A RIFF file consists of one or more chunks. Each chunk consists of a chunk type followed by the data for that type, and each chunk records its own size, which in effect points to the next chunk. An application reading a RIFF file can step through the chunks one at a time, read the data of the chunks it cares about, and skip the ones it does not.

A chunk in a RIFF file always begins with a header of the following form:

    typedef struct {
        FOURCC ckID;
        DWORD ckSize;
    } CK;

FOURCC is 4 bytes that identify the chunk type. In a Wave file the outermost chunk has the type "RIFF", and the first four bytes of its data hold the form type "WAVE". If a chunk type has fewer than 4 characters, it is padded on the right with spaces. ckSize is 4 bytes holding the size of the chunk's data area, which lies immediately after the header and is ckSize bytes long.
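The chunk header described above can be read with a few lines of Java. The following is a minimal sketch, not the report's actual code: the class name ChunkHeader and its methods are hypothetical, and it relies on two details of the RIFF format that the text does not state, namely that integers are stored little-endian and that a chunk with an odd ckSize is followed by one pad byte.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Illustrative helper (hypothetical name): one RIFF chunk header. */
final class ChunkHeader {
    final String ckID;   // four-character chunk type, e.g. "RIFF" or "fmt "
    final long ckSize;   // size in bytes of the data that follows the header

    ChunkHeader(String ckID, long ckSize) {
        this.ckID = ckID;
        this.ckSize = ckSize;
    }

    /** Reads the 8-byte header that starts at the given offset. */
    static ChunkHeader read(byte[] bytes, int offset) {
        String id = new String(bytes, offset, 4, StandardCharsets.US_ASCII);
        // Assumption: RIFF stores integers in little-endian byte order.
        int size = ByteBuffer.wrap(bytes, offset + 4, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
        // ckSize is an unsigned 32-bit DWORD, so widen it to long.
        return new ChunkHeader(id, size & 0xFFFFFFFFL);
    }

    /** Offset of the next chunk: header, data, and a pad byte if ckSize is odd. */
    long nextOffset(long thisOffset) {
        return thisOffset + 8 + ckSize + (ckSize & 1);
    }
}
```

A reader that only cares about some chunks could call read at the current offset, compare ckID, and jump to nextOffset to skip a chunk it does not need.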
A chunk may contain subchunks, and a subchunk is itself a chunk. A RIFF file always begins with a chunk of type "RIFF".

2.2. Wave file structure

A Wave file begins with a chunk of type "RIFF". Two subchunks inside the Wave chunk describe the sound stored in the file, each followed by its own data: the "fmt " subchunk and the "data" subchunk.

a. The "fmt " subchunk

The data of the "fmt " chunk is a WAVEFORMAT structure, defined as follows:

    typedef struct waveformat_tag {
        WORD wFormatTag;
        WORD nChannels;
        DWORD nSamplesPerSec;
        DWORD nAvgBytesPerSec;
        WORD nBlockAlign;
    } WAVEFORMAT;

wFormatTag usually has the value WAVE_FORMAT_PCM, defined in the file MMSYSTEM.H as:

    #define WAVE_FORMAT_PCM 1

This value tells software reading the Wave file that the audio data was digitized with PCM encoding.

nChannels takes one of two values: 1 for mono sound and 2 for stereo sound.

nSamplesPerSec gives the sampling rate. Typical values of this field are:

    11025 -- 11.025 kHz
    22050 -- 22.05 kHz
    44100 -- 44.1 kHz

nAvgBytesPerSec gives the average number of bytes required per second to play back the sampled sound data.

nBlockAlign gives the number of bytes used to hold one sound sample.

Note that WAVEFORMAT does not yet say how many bits are used to quantize one sample. In practice, a Wave file specifies the number of bits per sample through a field appended to the end of WAVEFORMAT. The resulting structure is defined as follows:

    typedef struct pcmwaveformat_tag {
        WAVEFORMAT wf;
        WORD wBitsPerSample;
    } PCMWAVEFORMAT;

wBitsPerSample gives the number of bits in one data sample. Note that samples must still be stored as whole bytes or words. So if a Wave file quantizes each sample with 12 bits, each stored sample carries 4 extra bits that are not used.

b. The "data" subchunk

The data of the "data" subchunk of a Wave file holds the digitized sound values. For 8-bit sound, the data consists of 1-byte values (from 0 to 255), one per sample. For 16-bit sound, each sample consists of 2 bytes (with values from -32768 to 32767). For 8-bit mono, the data of the "data" subchunk is simply a sequence of 1-byte values.
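The two sample formats just described can be turned into numeric sample values as in the sketch below. It is only an illustration under stated assumptions: the class PcmDecoder is hypothetical, 16-bit samples are assumed to be stored low byte first (as in RIFF files), and 8-bit samples are shifted by 128 so that silence sits at zero, which is a convenience for later energy computations rather than something the file format requires.

```java
/** Illustrative sketch (hypothetical name): decode raw "data" bytes into samples. */
final class PcmDecoder {
    /** 8-bit PCM: each byte is an unsigned value 0..255; subtract 128 to center on zero. */
    static int[] decode8(byte[] data) {
        int[] samples = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            samples[i] = (data[i] & 0xFF) - 128;
        }
        return samples;
    }

    /** 16-bit PCM: two bytes per sample, low byte first, signed -32768..32767. */
    static int[] decode16(byte[] data) {
        int[] samples = new int[data.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = data[2 * i] & 0xFF;   // low byte, treated as unsigned
            int hi = data[2 * i + 1];      // high byte keeps its sign
            samples[i] = (hi << 8) | lo;
        }
        return samples;
    }
}
```

Both methods treat the input as a single channel; for stereo data the decoded values alternate between the left and right channels.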
For 8-bit stereo, each sample consists of 2 bytes and the data is interleaved: the first byte (even byte) is the sample of the left channel, and the following byte (odd byte) is the sample of the right channel.

WAVE FILE STRUCTURE

    Size       Value
    4 bytes    "RIFF"
    4 bytes    Size of the RIFF file
    4 bytes    "WAVE"
    4 bytes    "fmt "
    4 bytes    Size of the "fmt " subchunk
    2 bytes    Data encoding type of the wave file (usually PCM)
    2 bytes    Number of channels: 1 mono; 2 stereo
    4 bytes    Samples per second
    4 bytes    Bytes per second
    2 bytes    Bytes per sample
    2 bytes    Bits per sample
    4 bytes    "data"
    4 bytes    Size of the data
               Sound wave data

3. Overview of the short-time energy function, amplitude, and zero-crossing rate

3.1. Short-time energy function and amplitude

Amplitude: the magnitude or strength of the signal, measured in dB (decibels) or V (volts). The larger the amplitude, the stronger the signal. For example, the speech signal for the word "Hello" may have an amplitude like the following:

[Figure: waveform of the word "Hello"]

A sequence of samples (8000 samples/sec) of a typical speech signal is shown in Figure 1. It is easy to see from the figure that the properties of the speech signal change with time. For example, the excitation changes between voiced and unvoiced speech, the peak amplitude of the signal varies, and the fundamental frequency varies within voiced regions. Because these changes are so easy to see in a waveform plot, time-domain processing techniques should be able to provide useful representations of signal features such as intensity, excitation mode, pitch, and possibly even vocal tract parameters such as the formant frequencies.

The basic assumption in most speech processing schemes is that the properties of the speech signal change relatively slowly with time. This leads to a whole family of "short-time" processing methods, in which short segments of the speech signal are isolated and processed as if they were short segments of a sustained sound with fixed properties. This is repeated as often as required. These short segments are usually called analysis frames, and they may overlap one another. The result of processing each frame may be a single number or a set of numbers. The processing therefore produces a new time-dependent sequence that can serve as a representation of the speech signal.
Figure 1: Samples of a typical speech waveform (sampling rate 8 kHz)

Most of the short-time processing techniques discussed in this section, as well as the short-time Fourier representation, can be expressed mathematically as:

$$Q_n = \sum_{m=-\infty}^{\infty} T[x(m)]\, w(n-m) \qquad (1.1)$$

The speech signal (possibly after linear filtering to isolate a desired frequency band) undergoes a transformation T[ ], which may be linear or nonlinear and may depend on an adjustable parameter or set of parameters. The resulting sequence is then multiplied by a window sequence positioned at the time corresponding to sample index n, and the product is summed over all nonzero values. The window sequence is usually of finite duration, although this is not always the case. The values Q_n are therefore a sequence of local weighted averages of the sequence T[x(m)].

The short-time energy of a signal is a simple illustration of this basic idea. The energy of a discrete-time signal is defined as:

$$E = \sum_{m=-\infty}^{\infty} x^2(m) \qquad (1.2)$$

Such a quantity has little meaning or usefulness for speech, because it carries very little information about the time-dependent properties of the speech signal. A simple definition of the short-time energy is:

$$E_n = \sum_{m=n-N+1}^{n} x^2(m) \qquad (1.3)$$

That is, the short-time energy at sample n is the sum of the squares of the N samples n-N+1 through n. In terms of the general expression (1.1), the operation T[ ] is the square and:

$$w(n) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \qquad (1.4)$$

Figure 2 depicts the computation of the short-time energy sequence. The window slides along the sequence of squared values (in general, T[x(m)]), selecting the interval that takes part in the energy computation.

Figure 2: Illustration of the short-time energy computation

We have seen that the amplitude of the speech signal varies considerably with time. In particular, the amplitude of unvoiced segments is much lower than the amplitude of voiced segments. The short-time energy of the speech signal provides a representation that reflects these amplitude variations. In general, the short-time energy can be defined as:

$$E_n = \sum_{m=-\infty}^{\infty} [x(m)\, w(n-m)]^2 \qquad (1.5)$$

This expression can be rewritten as:

$$E_n = \sum_{m=-\infty}^{\infty} x^2(m)\, h(n-m) \qquad (1.6)$$

where

$$h(n) = w^2(n) \qquad (1.7)$$

Expression (1.6) can be interpreted as shown in Figure 4a: the signal x^2(n) is filtered by a linear filter with the impulse response h(n) given in (1.7).

Figure 4: Block diagram representation of (a) the short-time energy and (b) the short-time average magnitude

The effect of the window on the time-dependent energy representation can be illustrated by comparing the properties of two representative windows, the rectangular window:

$$h(n) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \qquad (1.8)$$

and the Hamming window:

$$h(n) = \begin{cases} 0.54 - 0.46 \cos\left(\dfrac{2\pi n}{N-1}\right), & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \qquad (1.9)$$

The main significance of E_n is that it provides a basis for distinguishing voiced speech segments from unvoiced ones. The values of E_n for unvoiced segments are significantly smaller than for voiced segments. The energy function can also be used to locate approximately the time at which voiced speech becomes unvoiced, and for high-quality speech (high signal-to-noise ratio) the energy can be used to distinguish speech from silence.

One difficulty with the short-time energy function as defined by (1.6) is that it is very sensitive to large signal levels, because samples enter the computation squared, which emphasizes large sample-to-sample variations in x(n). A simple way to reduce this problem is to define an average magnitude function:

$$M_n = \sum_{m=-\infty}^{\infty} |x(m)|\, w(n-m) \qquad (1.10)$$

3.2. Zero-crossing rate

For discrete-time signals, a zero-crossing is said to occur when successive samples have different algebraic signs. The rate at which zero-crossings occur is a simple measure of the frequency content of a signal. This is especially true for narrowband signals. For example, a sinusoidal signal of frequency F0, sampled at rate Fs, has Fs/F0 samples per cycle of the sine wave. Each cycle has two zero-crossings, so the long-time average zero-crossing rate is:

$$Z = \frac{2 F_0}{F_s} \ \text{crossings/sample} \qquad (2.1)$$

The average zero-crossing rate therefore gives a reasonable way to estimate the frequency of a sine wave.

Speech signals are broadband signals, so the interpretation of the average zero-crossing rate is much less precise. Nevertheless, rough estimates of spectral properties can be obtained from a representation based on the short-time average zero-crossing rate.
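Equation (2.1) can be checked numerically on a synthetic sine wave. The sketch below is only an illustration: the class SineZeroCrossings, its method names, and the chosen frequencies are assumptions made for the example, and a sample that is exactly zero is treated as non-negative.

```java
/** Illustrative check of Z = 2*F0/Fs on a synthetic sine (all names hypothetical). */
final class SineZeroCrossings {
    /** Counts sign changes between consecutive samples; zero counts as non-negative. */
    static int countCrossings(double[] x) {
        int count = 0;
        for (int m = 1; m < x.length; m++) {
            boolean prevNeg = x[m - 1] < 0;
            boolean currNeg = x[m] < 0;
            if (prevNeg != currNeg) {
                count++;
            }
        }
        return count;
    }

    /** Generates n samples of a sine of frequency f0 (Hz) sampled at fs (Hz). */
    static double[] sine(double f0, double fs, int n, double phase) {
        double[] x = new double[n];
        for (int m = 0; m < n; m++) {
            x[m] = Math.sin(2 * Math.PI * f0 * m / fs + phase);
        }
        return x;
    }
}
```

For F0 = 100 Hz and Fs = 8000 Hz, equation (2.1) predicts 2 * 100 / 8000 = 0.025 crossings per sample, so dividing countCrossings by the number of samples for one second of this signal should give a value close to 0.025.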
Before discussing the zero-crossing rate representation for speech, let us define and discuss the required computations. An appropriate definition is:

$$Z_n = \sum_{m=-\infty}^{\infty} \left| \operatorname{sgn}[x(m)] - \operatorname{sgn}[x(m-1)] \right| w(n-m) \qquad (2.2)$$

where

$$\operatorname{sgn}[x(n)] = \begin{cases} 1, & x(n) \ge 0 \\ -1, & x(n) < 0 \end{cases} \qquad (2.3)$$

and

$$w(n) = \begin{cases} \dfrac{1}{2N}, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \qquad (2.4)$$

The operations in expression (2.2) are shown in block diagram form in Figure 5. This representation shows that the short-time average zero-crossing rate has the same general properties as the short-time energy and the short-time average magnitude. However, expression (2.2) and Figure 5 make the computation of Z_n look more complicated than it is. All that is required is to check the samples in pairs to determine where zero-crossings occur, and then to compute the average over N consecutive samples.

Figure 5: Block diagram of the short-time average zero-crossing rate

Now let us see how the short-time average zero-crossing rate applies to speech signals. The model of speech production suggests that the energy of voiced speech is concentrated below about 3 kHz, because the spectrum falls off due to the glottal wave, whereas for unvoiced speech most of the energy is found at higher frequencies. Since high frequencies imply a high zero-crossing rate and low frequencies imply a low zero-crossing rate, there is a strong correlation between the zero-crossing rate and the distribution of energy over frequency. A reasonable generalization is that if the zero-crossing rate is high, the speech signal is unvoiced, and if the zero-crossing rate is low, the speech signal is voiced. This is an imprecise statement, however, because we have not said what counts as high and what counts as low, and of course it cannot be made exact.

Figure 6 shows histograms of the average zero-crossing rate (averaged over 10 ms) for both voiced and unvoiced speech. A Gaussian curve fits each distribution reasonably well. The mean zero-crossing rate is 49 per 10 ms for unvoiced speech and 14 per 10 ms for voiced speech. The two distributions clearly overlap, so a clear-cut voiced/unvoiced decision cannot be based on the short-time average zero-crossing rate alone. Nevertheless, such a representation is quite useful in making this distinction.

Figure 6: Distribution of zero-crossings in voiced and unvoiced speech
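Equations (2.2) to (2.4) reduce to counting sign changes in a window of N samples and dividing by N, since each crossing contributes 2 to the sum and the window weight is 1/(2N). The following is a minimal sketch under stated assumptions: the class ShortTimeZcr and the hop parameter are hypothetical, and the term that would need x(-1) is simply skipped for the first window.

```java
/** Illustrative sketch of equations (2.2)-(2.4); names are hypothetical. */
final class ShortTimeZcr {
    /** Sign function of equation (2.3), with zero treated as non-negative. */
    static int sgn(double v) {
        return v >= 0 ? 1 : -1;
    }

    /** Z_n for the N-sample window ending at index n; terms needing x(-1) are skipped. */
    static double zcrAt(double[] x, int n, int N) {
        double sum = 0;
        int start = Math.max(1, n - N + 1);
        for (int m = start; m <= n; m++) {
            sum += Math.abs(sgn(x[m]) - sgn(x[m - 1]));
        }
        return sum / (2.0 * N);   // window weight 1/(2N) from equation (2.4)
    }

    /** Evaluates Z_n every hop samples, starting once a full window is available. */
    static double[] zcr(double[] x, int N, int hop) {
        if (x.length < N) {
            return new double[0];
        }
        int frames = (x.length - N) / hop + 1;
        double[] z = new double[frames];
        for (int k = 0; k < frames; k++) {
            z[k] = zcrAt(x, k * hop + N - 1, N);
        }
        return z;
    }
}
```

Multiplying a returned value by N gives the number of crossings in that window, which is the form in which the histogram means above (49 and 14 per 10 ms) are quoted.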
Some examples of average zero-crossing rate measurements are shown in Figure 7. In these examples the averaging window is 15 ms long (150 samples at a 10 kHz sampling rate) and the output is computed 100 times per second (the window moves in steps of 100 samples). Just as with the short-time energy and average magnitude, the short-time average zero-crossing rate can be sampled at a very low rate. Although the zero-crossing rate varies considerably, the voiced and unvoiced regions stand out clearly in Figure 7.

There are several practical considerations in implementing a representation based on the short-time average zero-crossing rate. Although the basic algorithm for detecting a zero-crossing only requires comparing the signs of a pair of successive samples, special care must be taken in the sampling process. The zero-crossing rate is strongly affected by DC offset in the analog-to-digital converter, by 60 Hz hum in the signal, and by any noise that may be present in the digitizing system. Great care must therefore be taken in the analog processing before sampling to keep these effects from degrading the accuracy of the result. For example, it is often preferable to use a bandpass filter rather than a lowpass filter as the anti-aliasing filter, in order to remove the DC and 60 Hz components of the signal. Further considerations in measuring zero-crossings are the sampling period T and the averaging interval N. The sampling period determines the time resolution of the zero-crossing representation; fine resolution requires a high sampling rate. However, only 1-bit quantization is needed to preserve the zero-crossing information.

Figure 7: Average zero-crossing rate for three different utterances

Because of these practical limitations, many similar representations have been proposed. All of these variants introduce some feature intended to make the estimate less sensitive to noise. Notable among them is the up-crossing representation studied by Baker. It is based on the time intervals between zero-crossings that occur with positive slope. Baker applied this representation to the phonetic classification of speech sounds.

Another application of the zero-crossing representation is as a simple intermediate step in obtaining a frequency-domain representation of speech. The approach is to bandpass filter the speech signal in several adjacent frequency bands. Short-time energy and zero-crossing representations are then formed at the filter outputs. Together these representations give a rough picture of the spectral properties of the signal. This approach was proposed by Reddy and studied by Vicens and Erman as the basis for a large-scale speech recognition system.
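To complement the zero-crossing measure, the short-time energy of equation (1.6) and the average magnitude of equation (1.10) from section 3.1 can be computed as in the sketch below, using the rectangular window (1.8) or the Hamming window (1.9). This is an illustration only: the class ShortTimeMeasures and its method names are hypothetical, samples before index 0 are treated as zero, and the same window array is used both as h(n) in (1.6) and as w(n) in (1.10).

```java
import java.util.Arrays;

/** Illustrative sketch of equations (1.6) and (1.8)-(1.10); names are hypothetical. */
final class ShortTimeMeasures {
    /** Rectangular window of equation (1.8): 1 for 0 <= n <= N-1. */
    static double[] rectangular(int N) {
        double[] h = new double[N];
        Arrays.fill(h, 1.0);
        return h;
    }

    /** Hamming window of equation (1.9); requires N >= 2. */
    static double[] hamming(int N) {
        double[] h = new double[N];
        for (int n = 0; n < N; n++) {
            h[n] = 0.54 - 0.46 * Math.cos(2 * Math.PI * n / (N - 1));
        }
        return h;
    }

    /** Equation (1.6): E_n = sum over m of x(m)^2 * h(n - m), window ending at n. */
    static double energyAt(double[] x, int n, double[] h) {
        double sum = 0;
        for (int k = 0; k < h.length && n - k >= 0; k++) {
            sum += x[n - k] * x[n - k] * h[k];
        }
        return sum;
    }

    /** Equation (1.10): M_n = sum over m of |x(m)| * w(n - m), window ending at n. */
    static double magnitudeAt(double[] x, int n, double[] w) {
        double sum = 0;
        for (int k = 0; k < w.length && n - k >= 0; k++) {
            sum += Math.abs(x[n - k]) * w[k];
        }
        return sum;
    }
}
```

With a 10 kHz signal, one could evaluate energyAt(x, n, hamming(150)) at n = 149, 249, 349 and so on, which mirrors the 15 ms window and 100-sample step quoted for the Figure 7 examples.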
II. PROGRAM

1. Functional analysis and design of the program

The program is designed around three main functions:
+ Read a wave file.
+ Display the wave file.
+ Display the energy, amplitude, and zero-crossing rate on the same time axis.

Top-level functional block diagram: the data passes through three blocks, namely read wave file, display wave file, and display E, amplitude, and zero-crossing rate.

2. The program

The program is written in Java and runs in the NetBeans 7 IDE.

Program design: the program consists of 3 packages:
- file: contains the wave files.
- icon: contains the interface icons.
- xltn: contains the main modules and classes.

The main classes and functions are:
- Class data: contains the methods and the data arrays.
- Class DrawGraphics: the drawing functions.
- About, Bonus, Extend: auxiliary classes for the interface design.

Interface:
- When the program is opened [screenshot]
- After a file is loaded [screenshot]

III. CONCLUSION

The aim of this assignment was to learn the basics of speech analysis while improving our programming skills. Because our knowledge and ability are limited, and despite our instructor's extensive guidance, the work inevitably still contains points we have not fully understood and other shortcomings. We hope to receive further guidance from our instructor, and we sincerely thank him.

IV. REFERENCES

[1] Trịnh Văn Loan, Bài giảng xử lý tiếng nói (Lecture notes on speech processing).
[2] F. J. Owens, Signal Processing of Speech.