A Method of Speech Waveform Synthesis Based on WaveNet Considering the Speech Generation Process


Upload: akira-tamamori

Post on 19-Mar-2017


Feed-forward [Zen et al., 13], LSTM-RNN [Zen et al., 15], WaveNet [van den Oord et al., 16]

WaveNet


Speech production model [Fant, 60]

Vocoder: frame-by-frame processing (overlap/shift)

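WaveNet [van den Oord et al., 16]

Building blocks: causal dilated convolution, residual and skip connections, softmax output

Causal dilated convolution
Causal: the output at time t depends only on samples at time t and earlier
Dilation: the spacing between the input samples that the filter taps read

Hidden layer 1: dilation=1; hidden layer 2: dilation=2; hidden layer 3: dilation=4; output layer: dilation=8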
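A minimal sketch of how stacking the dilations 1, 2, 4, 8 widens the context while staying causal. The kernel size of 2, the unit weights, and the function name `causal_dilated_conv` are assumptions made for illustration; the slides do not state them.

```python
def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution with kernel size 2 (an assumption).

    y[t] = weights[0] * x[t - dilation] + weights[1] * x[t],
    where samples before the start of the signal count as zero.
    """
    w_past, w_now = weights
    y = []
    for t in range(len(x)):
        past = x[t - dilation] if t - dilation >= 0 else 0.0
        y.append(w_past * past + w_now * x[t])
    return y


# Feed a unit impulse through four layers with dilation = 1, 2, 4, 8.
signal = [0.0] * 16
signal[0] = 1.0
out = signal
for d in (1, 2, 4, 8):
    out = causal_dilated_conv(out, (1.0, 1.0), d)

# Count how many output samples the single input sample has reached.
reached = sum(1 for v in out if v != 0.0)
print(reached)  # 16
```

With four layers the impulse influences 16 consecutive output samples, and it never influences a sample earlier than itself.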

[Figure: WaveNet architecture. Input → Causal Conv. → stacked residual blocks (Dilated Conv. → gated activation, tanh × sigm → 1×1 Conv., with residual and skip connections) → ReLU → 1×1 Conv. → ReLU → 1×1 Conv. → Softmax → Output]
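The residual block named above (gated activation, causal dilated convolution, residual and skip connections) can be sketched for a single channel as follows. This is an illustration under assumptions: one channel instead of many, kernel size 2, and a scalar `w_out` standing in for the 1×1 convolution. All names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dilated(x, weights, dilation):
    """Causal convolution, kernel size 2 (assumed), zero history."""
    w_past, w_now = weights
    return [
        w_past * (x[t - dilation] if t >= dilation else 0.0) + w_now * x[t]
        for t in range(len(x))
    ]


def residual_block(x, w_filter, w_gate, w_out, dilation):
    """Single-channel residual block with a gated activation unit.

    z        = tanh(filter conv) * sigmoid(gate conv)
    skip     = w_out * z        (scalar stand-in for the 1x1 conv)
    residual = x + skip         (passed on to the next block)
    """
    f = dilated(x, w_filter, dilation)
    g = dilated(x, w_gate, dilation)
    z = [math.tanh(a) * sigmoid(b) for a, b in zip(f, g)]
    skip = [w_out * v for v in z]
    residual = [xi + si for xi, si in zip(x, skip)]
    return residual, skip


x = [0.5, -1.0, 2.0, 0.0]
residual, skip = residual_block(x, (0.3, 0.7), (0.2, -0.4), 1.0, dilation=1)
print(residual)
print(skip)
```

In the full network the skip outputs of all blocks are summed before the ReLU and 1×1 convolution stages, while the residual output feeds the next block.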

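Softmax output: modeling 16-bit audio directly would need 65,536 classes, so the waveform is quantized from 16 bit to 8 bit (256 classes)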
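One way to carry out the 16-bit to 8-bit reduction is μ-law companding, which the original WaveNet paper uses. Whether this deck applies exactly the same transform is an assumption; the sketch below, with hypothetical function names, expects samples already scaled to [-1, 1] (for 16-bit audio, divide by 32768).

```python
import math

MU = 255  # 8-bit quantization gives MU + 1 = 256 classes


def mulaw_encode(x, mu=MU):
    """Map a sample in [-1, 1] to a class index in 0..mu."""
    x = max(-1.0, min(1.0, x))
    y = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    return int((y + 1.0) / 2.0 * mu + 0.5)


def mulaw_decode(index, mu=MU):
    """Map a class index in 0..mu back to a sample in [-1, 1]."""
    y = 2.0 * index / mu - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


for sample in (-1.0, -0.1, 0.0, 0.01, 0.5, 1.0):
    index = mulaw_encode(sample)
    print(sample, index, round(mulaw_decode(index), 4))
```

The logarithmic mapping spends more of the 256 classes on small amplitudes, where most speech samples lie.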


[Figure: stack of residual blocks #1 to #4 with dilation = 1, 2, 4, 8]

Sample-by-sample (WaveNet) vs. frame-by-frame (vocoder)

Experimental conditions

Database: CMU-ARCTIC SLT
Training / test utterances: 1082 / 50
Sampling frequency: 16 kHz
Frame shift / frame length: 5 ms / 25 ms
Mel-cepstrum order: 0 to 24
Optimizer: Adam
Dilations: 1, 2, ..., 512, repeated 3 times (30 causal dilated convolution layers)
Channel sizes shown in the figure: 256 ch (gated activation), 2048 ch and 256 ch (skip-connection and output path)
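The layer count and the context length implied by these settings can be checked with a few lines. The kernel size of 2 is an assumption (the slides do not give it), and the helper names are hypothetical.

```python
def dilation_schedule(max_dilation=512, repeats=3):
    """Dilations 1, 2, 4, ..., max_dilation, repeated `repeats` times."""
    cycle = []
    d = 1
    while d <= max_dilation:
        cycle.append(d)
        d *= 2
    return cycle * repeats


def receptive_field(dilations, kernel_size=2):
    """Input samples that can influence one output sample."""
    return 1 + (kernel_size - 1) * sum(dilations)


dilations = dilation_schedule()
samples = receptive_field(dilations)
print(len(dilations))              # 30 layers
print(samples)                     # 3070 samples
print(1000.0 * samples / 16000.0)  # 191.875 ms at 16 kHz
```

Under this assumption the stack sees roughly 0.19 s of past waveform, several times the 25 ms frame length listed above.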

Objective evaluation measures: SNR, SDR
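The definitions on this slide did not survive extraction. As a point of reference, the usual waveform-domain signal-to-noise ratio in dB can be sketched as below; this is an assumed definition, not necessarily the one used in the deck, and SDR is not covered. The function name `snr_db` is hypothetical.

```python
import math


def snr_db(reference, synthesized):
    """10 * log10(signal energy / error energy), in dB."""
    if len(reference) != len(synthesized):
        raise ValueError("waveforms must have the same length")
    signal = sum(r * r for r in reference)
    noise = sum((r - s) ** 2 for r, s in zip(reference, synthesized))
    if noise == 0.0:
        return math.inf  # identical waveforms
    return 10.0 * math.log10(signal / noise)


natural = [1.0, -1.0, 1.0, -1.0]
generated = [0.9, -0.9, 0.9, -0.9]
print(round(snr_db(natural, generated), 2))  # 20.0
```

A measure of this kind only makes sense when the generated waveform is time-aligned with the natural one, sample by sample.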

Auxiliary features given to WaveNet: Nothing, Mcep, Mcep + F0 (FFT-based analysis)

[Figure: SNR and SDR (dB; 5%) for Nothing, Mcep, Mcep + F0, and Raw]

Methods compared (analysis + waveform generation)

Plain-MLSA: FFT + MLSA
STRAIGHT-MLSA: STRAIGHT (1) + MLSA (2)
Plain-WaveNet: FFT + WaveNet
STRAIGHT-WaveNet: STRAIGHT + WaveNet

(1) STRAIGHT analysis (2) MLSA filter

[Figures: SNR results for STRAIGHT-WaveNet, including the Raw condition; SDR results for STRAIGHT-MLSA and STRAIGHT-WaveNet]