
Eawag: Swiss Federal Institute of Aquatic Science and Technology

Recurrent Neural Network tailored for Weather Radar Nowcasting

June 23, 2016

Andreas Scheidegger

Weather Radar Nowcasting

[Figure: a sequence of observed radar images t−4 … t0 enters the nowcast model, which outputs the future images t+1 … t+5]

available inputs: every pixel of the previous images
desired outputs: every pixel of the next n images


Recurrent ANN

[Diagram, built up over four slides: the last observed image P_t and the hidden state H_t enter the ANN, which outputs the prediction P̂_{t+1} and the next hidden state H_{t+1}; feeding each output back into the ANN yields P̂_{t+2}, P̂_{t+3}, P̂_{t+4}, …]
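In code, this rollout is just a loop. A minimal sketch (the callable `ann` and its signature are assumptions for illustration, not the talk's implementation):

    def rollout(ann, last_image, hidden, n_steps):
        # Apply the ANN repeatedly: each step turns the current image and
        # hidden state into a predicted image and the next hidden state.
        predictions = []
        image = last_image                      # P_t, last observed image
        for _ in range(n_steps):
            image, hidden = ann(image, hidden)  # P̂_{t+k}, H_{t+k}
            predictions.append(image)
        return predictions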

Model structure

Traditional Artificial Neural Networks (ANN) are (nested) non-linear regressions:

v_j = \sigma\left(\sum_k w_{jk} u_k + b_j\right)

How can we do better? Tailor the model structure to the problem.
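For concreteness, the layer equation as a minimal NumPy sketch (the names u, W, b are chosen here for illustration):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def layer(u, W, b):
        # v_j = sigma( sum_k w_jk * u_k + b_j )
        return sigmoid(W @ u + b)

    # A "nested" non-linear regression is layers applied in sequence:
    # v = layer(layer(u, W1, b1), W2, b2)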


Model structure

[Diagram of one ANN step: the input image P_t passes through a convolution and fully connected layers, the hidden state H_t through a fully connected layer; the two are combined (+) and drive a spatial transformer, a deconvolution, and a Gaussian blur to produce the prediction P̂_{t+1}, while a fully connected layer produces the new hidden state H_{t+1}]
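One possible reading of the diagram as code; the exact wiring is not recoverable from the transcript, so every name below is an assumption:

    def ann_step(P_t, H_t, f):
        # `f` is a dict of callables standing in for the trained layers.
        z = f["fc_img"](f["conv"](P_t)) + f["fc_hidden"](H_t)     # combine (+)
        warped = f["spatial_transformer"](P_t, f["fc_theta"](z))  # move the rain field
        P_next = f["gaussian_blur"](warped + f["deconv"](z))      # local correction + blur
        H_next = f["fc_out"](z)                                   # new hidden state
        return P_next, H_next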

Spatial transformer

[Same architecture diagram, with the spatial-transformer block highlighted]

Jaderberg, M., Simonyan, K., Zisserman, A., and Kavukcuoglu, K. (2015) Spatial Transformer Networks. arXiv:1506.02025 [cs].

Spatial Transformer

[Figure: example input and output of a spatial transformer]

Jaderberg, M., Simonyan, K., Zisserman, A., and Kavukcuoglu, K. (2015) Spatial Transformer Networks. arXiv:1506.02025 [cs].
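To make the sampling idea concrete, a sketch of an affine warp (nearest-neighbour sampling for brevity; the paper uses differentiable bilinear sampling; `theta` is the 2×3 affine matrix that the network predicts):

    import numpy as np

    def affine_warp(image, theta):
        # For every output pixel, compute its source location under the
        # affine transform and copy the nearest source pixel.
        h, w = image.shape
        out = np.zeros_like(image)
        for i in range(h):
            for j in range(w):
                # normalised output coordinates in [-1, 1]
                y = 2.0 * i / (h - 1) - 1.0
                x = 2.0 * j / (w - 1) - 1.0
                # source coordinates under the affine transform
                xs = theta[0, 0] * x + theta[0, 1] * y + theta[0, 2]
                ys = theta[1, 0] * x + theta[1, 1] * y + theta[1, 2]
                # back to pixel indices
                si = int(round((ys + 1.0) * (h - 1) / 2.0))
                sj = int(round((xs + 1.0) * (w - 1) / 2.0))
                if 0 <= si < h and 0 <= sj < w:
                    out[i, j] = image[si, sj]
        return out

    # Identity transform: theta = np.array([[1., 0., 0.], [0., 1., 0.]])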


Local correction

[Same architecture diagram, with the local-correction (deconvolution) branch highlighted]

[Figures: examples of the learned local correction]

Gaussian blur

[Same architecture diagram, with the Gaussian-blur block highlighted]
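For the effect of this final smoothing step, a sketch using SciPy (illustrative only; inside the network the blur would be a fixed convolution so that gradients can flow through it):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    prediction = np.random.rand(128, 128)             # stand-in for P̂_{t+1}
    smoothed = gaussian_filter(prediction, sigma=1.5)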



Prediction

[Figures: observed vs. predicted radar images at forecast horizons of 2.5, 15, 30, 45, and 60 minutes]

Conclusions and Outlook

Deep learning may be hyped, but it's a useful tool nevertheless!

Combine domain knowledge with data-driven modeling:
v_j = \sigma\left(\sum_k w_{jk} u_k + b_j\right)

Outlook:
● Online parameter adaptation: \theta_{t+1} = \theta_t - \lambda \nabla L(\theta_t)
● Other objective functions L(\theta)?
● Include other inputs
● Predict prediction uncertainty

Interested in collaboration? [email protected]

Training and Implementation

Training:
● Loss function: L(\theta) = \int_0 \mathrm{RMS}(\tau; \theta)\, p(\tau)\, d\tau
● Stochastic Gradient Descent: \theta_{new} = \theta_{old} - \lambda \nabla L(\theta_{old})
● Scheduled learning (Bengio et al., 2015)
● Regularisation: drop-out (Srivastava et al., 2014)

Implementation:
● Based on the Python library chainer (Tokui et al., 2015)
● Runs on CUDA-compatible GPUs

Srivastava, et al. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15, 1929–1958.
Bengio, et al. (2015). Scheduled sampling for sequence prediction with recurrent neural networks. arXiv preprint arXiv:1506.03099.
Tokui, et al. (2015). Chainer: a Next-Generation Open Source Framework for Deep Learning.
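A sketch of the update rule, plus one way to read the loss as a Monte-Carlo average over forecast horizons τ drawn from p(τ) (an interpretation, not necessarily the talk's implementation; `rms_at_horizon` and `sample_horizon` are stand-ins):

    import numpy as np

    def sgd_step(theta, grad, lr=1e-3):
        # theta_new = theta_old - lambda * grad L(theta_old)
        return theta - lr * grad

    def loss(theta, rms_at_horizon, sample_horizon, n_samples=32):
        # Monte-Carlo estimate of L(theta) = int RMS(tau; theta) p(tau) dtau:
        # draw forecast horizons tau from p and average the RMS error.
        taus = [sample_horizon() for _ in range(n_samples)]
        return np.mean([rms_at_horizon(tau, theta) for tau in taus])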

Training

[Diagram: the observed images P_t, P_{t+1}, P_{t+2} are fed into the unrolled ANN; the resulting training predictions P̂_{t+3}, P̂_{t+4} are compared against the corresponding observed images]

Scheduled Training

[Diagram: as in plain training, but each intermediate input is either the observed image (P_{t+1}, P_{t+2}) or the model's own prediction (P̂_{t+1}, P̂_{t+2}), chosen at random; the training predictions P̂_{t+3}, P̂_{t+4} are compared against the observed images]

Bengio, S., et al. (2015). Scheduled sampling for sequence prediction with recurrent neural networks. arXiv preprint arXiv:1506.03099.
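A sketch of scheduled sampling for this rollout (illustrative names; with p_observed = 1.0 it reduces to the plain training scheme above):

    import random

    def scheduled_rollout(ann, observed, hidden, p_observed):
        # `observed` is the list of radar images [P_t, P_{t+1}, ...].
        predictions = []
        image = observed[0]
        for t in range(1, len(observed)):
            pred, hidden = ann(image, hidden)   # P̂, next hidden state
            predictions.append(pred)
            # next input: observed image or the model's own prediction
            image = observed[t] if random.random() < p_observed else pred
        return predictions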

Results

Prediction animated

[Animations: observed and predicted radar image sequences]

Traditional model structures

Estimate a velocity field and translate the last radar image (optical flow)
→ seems to be the most “natural” approach

Identify rain cells, track and extrapolate them
→ can model growth and decay of cells

● Models can be difficult to tune
● Local features (e.g. a mountain range) cannot be modeled

Convolution layers

[Figures: example convolution kernels and their effect on an image]

https://developer.apple.com/library/ios/documentation/Performance/Conceptual/vImage/ConvolutionOperations/ConvolutionOperations.html
https://en.wikipedia.org/wiki/Kernel_(image_processing)
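What such a layer computes, as a minimal sketch (strictly a cross-correlation, as in most deep-learning layers; real layers add channels, a bias, and a non-linearity):

    import numpy as np

    def conv2d(image, kernel):
        # Single-channel "valid" convolution: slide the kernel over the
        # image and take the weighted sum at every position.
        kh, kw = kernel.shape
        h = image.shape[0] - kh + 1
        w = image.shape[1] - kw + 1
        out = np.zeros((h, w))
        for i in range(h):
            for j in range(w):
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    # Example: the 3x3 edge-detection kernel shown in the linked references.
    edge = np.array([[-1., -1., -1.],
                     [-1.,  8., -1.],
                     [-1., -1., -1.]])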
