Department of Scientific and Engineering Simulation, Computational Systems Engineering Field, Tokuda and Lee Laboratory...

Statistical Models of Machine Translation, Speech Recognition, and Speech Synthesis for Speech-to-Speech Translation

Speech-to-Speech Translation

TOEIC 600, 90% [Sugaya et al., 01]

[Brown et al., 93]

[Koehn et al., 2003]

IST-ITG (Imposing Source Tree on Inversion Transduction Grammar) [Yamamoto et al., 08]

          Baseline  IST-ITG  Proposed
BLEU-4    27.87     29.31    29.80

Source: From results of the consideration, it was pointed that radiation from the loop elements was weak.
Reference:
IST-ITG:
Proposed:

[Black et al., 96]

[Tokuda et al., 00]

Hidden Markov Model (HMM) [Lee, 90]

[Young, 94]

Maximum Likelihood (ML)

1/2 ML Bayes

2/2 [Attias, 99]

10205

ML-MDL  ML  MDL  Bayes-MDL  ML-Bayes  ML  Bayes-Bayes

1,128  1,128  9,485  9,485

5,429  5,429  14,610  14,610

ML Bayes

1

Iteration 0  Iteration 1  Iteration 0  Iteration 2  Iteration 1  Iteration 3  Iteration 2

[Ney, 99]

Amazon Mechanical Turk
Section 1: Naturalness
Section 2: WER, S2ST-Adequacy, S2ST-Fluency
Section 3: MT-Adequacy, MT-Fluency
150

Finnish-to-English  HiFST  865,732  208,129  100

5

N-best  MT output sentence                        Speech
1       We support what you have said.
2       We support what you said.
3       We are in favour of what you have said.
4       We support what you said about.
5       We are in favour of what you said.
        We can support what you said.

             MT-Adequacy  MT-Fluency
Naturalness  0.12         0.24

             MT-Adequacy  MT-Fluency
WER          0.17         0.25

N-gram  N-gram  N-1  3-gram

N-gram

N-gram  N-gram  P( | ) = 0.3  P( | ) = 0.2
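
The N-gram slides here show conditional word probabilities (0.3 and 0.2) but the surrounding explanation did not survive. As a minimal sketch, assuming a plain maximum-likelihood (count-ratio) estimate with no smoothing, the probability of a word given its N-1 preceding words could be computed as below. The function name `ngram_probability` and the toy corpus (borrowed from the N-best example sentences) are illustrative assumptions, not the presenter's actual setup or data.

```python
from collections import Counter


def ngram_probability(corpus, history, word):
    """Count-ratio estimate of P(word | history) from tokenized sentences.

    N is implied by the history length: len(history) == N - 1.
    No smoothing is applied (an assumption made to keep the sketch short).
    """
    n = len(history) + 1
    ngram_counts = Counter()
    history_counts = Counter()
    for sentence in corpus:
        for i in range(len(sentence) - n + 1):
            gram = tuple(sentence[i:i + n])
            ngram_counts[gram] += 1
            # Count the history only where it is followed by some word.
            history_counts[gram[:-1]] += 1
    h = tuple(history)
    if history_counts[h] == 0:
        return 0.0  # unseen history: no estimate available without smoothing
    return ngram_counts[h + (word,)] / history_counts[h]


# Hypothetical toy corpus, lower-cased and whitespace-tokenized.
corpus = [
    "we support what you said".split(),
    "we support what you have said".split(),
    "we are in favour of what you said".split(),
]

# 3-gram example: probability of "said" after the two-word history "what you".
p_said = ngram_probability(corpus, ["what", "you"], "said")
p_have = ngram_probability(corpus, ["what", "you"], "have")
```

In practice a language model toolkit with smoothing would replace the raw count ratio, since unseen N-grams otherwise receive zero probability.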

            1-gram  2-gram  3-gram  4-gram  5-gram
MT-Fluency  0.28    0.39    0.42    0.43    0.44

5-gram  5-gram: 0.87

N-gram  N-gram

N-gram  N-gram

             1-gram  2-gram  3-gram  4-gram  5-gram
Naturalness  0.05    0.15    0.19    0.20    0.18

4-gram  4-gram: 0.81

N-gram  N-gram

BLEU 0.49

1/2 32
Kei Hashimoto, Hirohumi Yamamoto, Hideo Okuma, Eiichiro Sumita, and Keiichi Tokuda, "A reordering model using a source-side parse-tree for statistical machine translation," IEICE, Vol.E92-D, No.12, pp.2386--2393, Dec. 2009.
Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, and Keiichi Tokuda, "Bayesian context clustering using cross validation for speech recognition," IEICE, Vol.E94-D, No.3, Mar. 2011.
Sayaka Shiota, Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, and Keiichi Tokuda, "Speech recognition based on statistical models including multiple model structures," Acoustical Science and Technology
Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Takashi Masuko, and Keiichi Tokuda, "Bayesian speech synthesis,"

2/2 117 41 127

1/8 32
Kei Hashimoto, Hirohumi Yamamoto, Hideo Okuma, Eiichiro Sumita, and Keiichi Tokuda, "A reordering model using a source-side parse-tree for statistical machine translation," IEICE, Vol.E92-D, No.12, pp.2386--2393, Dec. 2009.
Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, and Keiichi Tokuda, "Bayesian context clustering using cross validation for speech recognition," IEICE, Vol.E94-D, No.3, Mar. 2011.
Sayaka Shiota, Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, and Keiichi Tokuda, "Speech recognition based on statistical models including multiple model structures," Acoustical Science and Technology (conditionally accepted)
Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Takashi Masuko, and Keiichi Tokuda, "Bayesian speech synthesis,"

2/8 117
Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, and Keiichi Tokuda, "Hyperparameter estimation for speech recognition based on variational Bayesian approach," in Proc. ASA&ASJ Joint Meeting, 1pSC32, pp.3042, USA, Honolulu, 2006.12
Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, and Keiichi Tokuda, "Bayesian context clustering using cross valid prior distribution for HMM-based speech recognition," in Proc. Interspeech, pp.936--939, 2008.9
Sayaka Shiota, Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, and Keiichi Tokuda, "Acoustic modeling based on model structure annealing for speech recognition," in Proc. Interspeech, pp.932--935, 2008.9
Tatsuya Ito, Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, and Keiichi Tokuda, "Speaker recognition based on variational Bayesian method," in Proc. Interspeech, pp.1417--1420, 2008.9

3/8
Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Takashi Masuko, and Keiichi Tokuda, "A Bayesian approach to HMM-based speech synthesis," in Proc. ICASSP, pp.4029--4032, 2009.4
Kei Hashimoto, Hirohumi Yamamoto, Hideo Okuma, Eiichiro Sumita, and Keiichi Tokuda, "Reordering model using syntactic information of a source tree for statistical machine translation," in Proc. NAACL-HLT Workshop SSST-3, pp.69--77, 2009.6
Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda, "A Bayesian approach to hidden semi Markov model based speech synthesis," in Proc. Interspeech 2009, pp.1751--1754, 2009.9 (Student paper award finalist)
Sayaka Shiota, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda, "Deterministic annealing based training algorithm for Bayesian speech recognition," in Proc. Interspeech 2009, pp.680--683, 2009.9

4/8
Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda, "Bayesian speech synthesis framework integrating training and synthesis processes," in Proc. SSW7, pp.106--111, 2010.9
Keiichiro Oura, Kei Hashimoto, Sayaka Shiota, and Keiichi Tokuda, "Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2010," in Proc. Blizzard Challenge 2010, 2010.9
Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda, "An analysis of machine translation and speech synthesis for speech-to-speech translation," in Proc. ICASSP 2011 (accepted)

5/8 41
"" , , Vol.107, No.165, pp.67--72, 2007.7
Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, and Keiichi Tokuda, "Bayesian Context Clustering Using Cross Validation for HMM-Based Speech Synthesis," , , Vol.108, No.338, pp.73--78, 2008.12
Sayaka Shiota, Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, and Keiichi Tokuda, "Speech Recognition Based on Statistical Models Including Multiple Decision Trees," , , Vol.108, No.338, pp.221--226, 2008.12
Tatsuya Ito, Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, and Keiichi Tokuda, "Speaker Recognition Based on Gaussian Mixture Models Using Variational Bayesian Method," , , Vol.108, No.338, pp.185--190, 2008.12

6/8 127
"" , , pp.139--142, 2007.9
"" , , pp.142--146, 2007.9
"" , , pp.69--70, 2008.3
"" , , pp.125--126, 2008.3
"" , , pp.143--144, 2008.3

7/8
"HMM" , , pp.251--252, 2008.9
"" , , pp.303--304, 2009.3
"HSMM" , , pp.257--258, 2009.9
"Training Algorithm Based on Deterministic Annealing for Bayesian Speech Recognition" , , pp.3--6, 2009.9
"" , , pp.243--244, 2010.9

8/8
William Byrne, Simon King, , 20113
"" , 20113
