
Eawag: Swiss Federal Institute of Aquatic Science and Technology

Recurrent Neural Network tailored for Weather Radar Nowcasting

June 23, 2016

Andreas Scheidegger

Weather Radar Nowcasting

[Figure: a sequence of observed radar images t−4 … t0 enters the nowcast model, which outputs the future images t+1 … t+5]

available inputs: every pixel of the previous images
desired outputs: every pixel of the next n images


Recurrent ANN

[Diagram, built up over four slides: the last observed image P_t and the hidden state H_t enter the ANN, which outputs the prediction P̂_{t+1} and the next hidden state H_{t+1}; feeding each output back into the ANN yields P̂_{t+2}, P̂_{t+3}, P̂_{t+4}, …]
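In code, this rollout is just a loop. A minimal sketch (the callable `ann` and its signature are assumptions for illustration, not the talk's implementation):

    def rollout(ann, last_image, hidden, n_steps):
        # Apply the ANN repeatedly: each step turns the current image and
        # hidden state into a predicted image and the next hidden state.
        predictions = []
        image = last_image                      # P_t, last observed image
        for _ in range(n_steps):
            image, hidden = ann(image, hidden)  # P̂_{t+k}, H_{t+k}
            predictions.append(image)
        return predictions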

Model structure

Traditional Artificial Neural Networks (ANN) are (nested) non-linear regressions:

v_j = \sigma\left(\sum_k w_{jk} u_k + b_j\right)

How can we do better? Tailor the model structure to the problem.
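For concreteness, the layer equation as a minimal NumPy sketch (the names u, W, b are chosen here for illustration):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def layer(u, W, b):
        # v_j = sigma( sum_k w_jk * u_k + b_j )
        return sigmoid(W @ u + b)

    # A "nested" non-linear regression is layers applied in sequence:
    # v = layer(layer(u, W1, b1), W2, b2)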


Model structure

[Diagram of one ANN step: the input image P_t passes through a convolution and fully connected layers, the hidden state H_t through a fully connected layer; the two are combined (+) and drive a spatial transformer, a deconvolution, and a Gaussian blur to produce the prediction P̂_{t+1}, while a fully connected layer produces the new hidden state H_{t+1}]
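One possible reading of the diagram as code; the exact wiring is not recoverable from the transcript, so every name below is an assumption:

    def ann_step(P_t, H_t, f):
        # `f` is a dict of callables standing in for the trained layers.
        z = f["fc_img"](f["conv"](P_t)) + f["fc_hidden"](H_t)     # combine (+)
        warped = f["spatial_transformer"](P_t, f["fc_theta"](z))  # move the rain field
        P_next = f["gaussian_blur"](warped + f["deconv"](z))      # local correction + blur
        H_next = f["fc_out"](z)                                   # new hidden state
        return P_next, H_next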

Spatial transformer

[Same architecture diagram, with the spatial-transformer block highlighted]

Jaderberg, M., Simonyan, K., Zisserman, A., and Kavukcuoglu, K. (2015) Spatial Transformer Networks. arXiv:1506.02025 [cs].

Spatial Transformer

[Figure: example input and output of a spatial transformer]

Jaderberg, M., Simonyan, K., Zisserman, A., and Kavukcuoglu, K. (2015) Spatial Transformer Networks. arXiv:1506.02025 [cs].
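To make the sampling idea concrete, a sketch of an affine warp (nearest-neighbour sampling for brevity; the paper uses differentiable bilinear sampling; `theta` is the 2×3 affine matrix that the network predicts):

    import numpy as np

    def affine_warp(image, theta):
        # For every output pixel, compute its source location under the
        # affine transform and copy the nearest source pixel.
        h, w = image.shape
        out = np.zeros_like(image)
        for i in range(h):
            for j in range(w):
                # normalised output coordinates in [-1, 1]
                y = 2.0 * i / (h - 1) - 1.0
                x = 2.0 * j / (w - 1) - 1.0
                # source coordinates under the affine transform
                xs = theta[0, 0] * x + theta[0, 1] * y + theta[0, 2]
                ys = theta[1, 0] * x + theta[1, 1] * y + theta[1, 2]
                # back to pixel indices
                si = int(round((ys + 1.0) * (h - 1) / 2.0))
                sj = int(round((xs + 1.0) * (w - 1) / 2.0))
                if 0 <= si < h and 0 <= sj < w:
                    out[i, j] = image[si, sj]
        return out

    # Identity transform: theta = np.array([[1., 0., 0.], [0., 1., 0.]])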


Local correction

[Same architecture diagram, with the local-correction (deconvolution) branch highlighted]

[Figures: examples of the learned local correction]

Gaussian blur

[Same architecture diagram, with the Gaussian-blur block highlighted]
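For the effect of this final smoothing step, a sketch using SciPy (illustrative only; inside the network the blur would be a fixed convolution so that gradients can flow through it):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    prediction = np.random.rand(128, 128)             # stand-in for P̂_{t+1}
    smoothed = gaussian_filter(prediction, sigma=1.5)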



Prediction

[Figures: observed vs. predicted radar images at forecast horizons of 2.5, 15, 30, 45, and 60 minutes]

Conclusions and Outlook

Deep learning may be hyped, but it's a useful tool nevertheless!

Combine domain knowledge with data-driven modeling:
v_j = \sigma\left(\sum_k w_{jk} u_k + b_j\right)

Outlook:
● Online parameter adaptation: \theta_{t+1} = \theta_t - \lambda \nabla L(\theta_t)
● Other objective functions L(\theta)?
● Include other inputs
● Predict prediction uncertainty

Interested in collaboration? [email protected]

Training and Implementation

Training:
● Loss function: L(\theta) = \int_0 \mathrm{RMS}(\tau; \theta)\, p(\tau)\, d\tau
● Stochastic Gradient Descent: \theta_{new} = \theta_{old} - \lambda \nabla L(\theta_{old})
● Scheduled learning (Bengio et al., 2015)
● Regularisation: drop-out (Srivastava et al., 2014)

Implementation:
● Based on the Python library chainer (Tokui et al., 2015)
● Runs on CUDA-compatible GPUs

Srivastava, et al. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15, 1929–1958.
Bengio, et al. (2015). Scheduled sampling for sequence prediction with recurrent neural networks. arXiv preprint arXiv:1506.03099.
Tokui, et al. (2015). Chainer: a Next-Generation Open Source Framework for Deep Learning.
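A sketch of the update rule, plus one way to read the loss as a Monte-Carlo average over forecast horizons τ drawn from p(τ) (an interpretation, not necessarily the talk's implementation; `rms_at_horizon` and `sample_horizon` are stand-ins):

    import numpy as np

    def sgd_step(theta, grad, lr=1e-3):
        # theta_new = theta_old - lambda * grad L(theta_old)
        return theta - lr * grad

    def loss(theta, rms_at_horizon, sample_horizon, n_samples=32):
        # Monte-Carlo estimate of L(theta) = int RMS(tau; theta) p(tau) dtau:
        # draw forecast horizons tau from p and average the RMS error.
        taus = [sample_horizon() for _ in range(n_samples)]
        return np.mean([rms_at_horizon(tau, theta) for tau in taus])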

Training

[Diagram: the observed images P_t, P_{t+1}, P_{t+2} are fed into the unrolled ANN; the resulting training predictions P̂_{t+3}, P̂_{t+4} are compared against the corresponding observed images]

Scheduled Training

[Diagram: as in plain training, but each intermediate input is either the observed image (P_{t+1}, P_{t+2}) or the model's own prediction (P̂_{t+1}, P̂_{t+2}), chosen at random; the training predictions P̂_{t+3}, P̂_{t+4} are compared against the observed images]

Bengio, S., et al. (2015). Scheduled sampling for sequence prediction with recurrent neural networks. arXiv preprint arXiv:1506.03099.
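A sketch of scheduled sampling for this rollout (illustrative names; with p_observed = 1.0 it reduces to the plain training scheme above):

    import random

    def scheduled_rollout(ann, observed, hidden, p_observed):
        # `observed` is the list of radar images [P_t, P_{t+1}, ...].
        predictions = []
        image = observed[0]
        for t in range(1, len(observed)):
            pred, hidden = ann(image, hidden)   # P̂, next hidden state
            predictions.append(pred)
            # next input: observed image or the model's own prediction
            image = observed[t] if random.random() < p_observed else pred
        return predictions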

Results

Prediction animated

[Animations: observed and predicted radar image sequences]

Traditional model structures

Estimate a velocity field and translate the last radar image (optical flow)
→ seems to be the most “natural” approach

Identify rain cells, track and extrapolate them
→ can model growth and decay of cells

● Models can be difficult to tune
● Local features (e.g. a mountain range) cannot be modeled

Convolution layers

[Figures: example convolution kernels and their effect on an image]

https://developer.apple.com/library/ios/documentation/Performance/Conceptual/vImage/ConvolutionOperations/ConvolutionOperations.html
https://en.wikipedia.org/wiki/Kernel_(image_processing)
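What such a layer computes, as a minimal sketch (strictly a cross-correlation, as in most deep-learning layers; real layers add channels, a bias, and a non-linearity):

    import numpy as np

    def conv2d(image, kernel):
        # Single-channel "valid" convolution: slide the kernel over the
        # image and take the weighted sum at every position.
        kh, kw = kernel.shape
        h = image.shape[0] - kh + 1
        w = image.shape[1] - kw + 1
        out = np.zeros((h, w))
        for i in range(h):
            for j in range(w):
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    # Example: the 3x3 edge-detection kernel shown in the linked references.
    edge = np.array([[-1., -1., -1.],
                     [-1.,  8., -1.],
                     [-1., -1., -1.]])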
