Deep Learning for
Time Series Data
UNIST
School of ECE
Jaesik Choi
http://sail.unist.ac.kr/tutorials
AlphaGo (https://deepmind.com/alpha-go)
AlphaGo defeated Go champion Lee Sedol 4-1 in their match (March 2016)
Go (AlphaGo)
What is the future of AI?
Artificial Intelligence
Automation of Knowledge Work
SOURCE: https://public.tableau.com/profile/mckinsey.analytics#!/vizhome/AutomationBySector/WhereMachinesCanReplaceHumans
Robot Task (Manipulation) Learning (Sergey Levine, 2015)
Financial Time Series Analysis
The Relational Automatic Statistician (UNIST)
An AI system that analyzes changes in, and relationships among, multiple time series
Diagram labels: stock data; multiple stock data; Bayesian multiple kernel learning; training; query; automatic report generation / change prediction; analysis of relationships among stocks
UNIST system: 40% lower prediction error than the MIT/Cambridge analysis system (June 2016)
What is Deep Learning?
Artificial Intelligence
Perceptron
Nonlinear Transform
Linearly Separable Classes in a Multilayer Perceptron
After a nonlinear transformation, red and blue become linearly separable*
* Y. LeCun, Y. Bengio, G. Hinton (2015). Deep Learning. Nature 521, 436-444.
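To make the idea concrete, here is a minimal NumPy sketch on illustrative synthetic data (not from the slides): two classes that are not linearly separable in 2-D become separable by a single linear threshold after lifting the points with a nonlinear feature, here the squared radius.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes that are NOT linearly separable in 2-D:
# "blue" points near the origin, "red" points in a surrounding ring.
blue = rng.normal(scale=0.3, size=(200, 2))
angles = rng.uniform(0.0, 2.0 * np.pi, 200)
radii = rng.uniform(1.5, 2.5, 200)
red = np.c_[radii * np.cos(angles), radii * np.sin(angles)]

X = np.vstack([blue, red])
y = np.r_[np.zeros(200), np.ones(200)]

# Nonlinear transform: append the squared radius as a third feature.
# In the lifted space the two classes become (almost perfectly) linearly separable.
Z = np.c_[X, (X ** 2).sum(axis=1)]

# A single linear threshold on the new coordinate now separates red from blue.
pred = (Z[:, 2] > 1.0).astype(float)
print("accuracy after the nonlinear transform:", (pred == y).mean())
```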
LeNet-5 (1989) vs GoogLeNet (2014)
LeNet-5: recognizing digits using a neural network with 5 layers
Recognizing human faces
What are Recurrent Neural Networks?
Recurrent Neural Network (RNN)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Recurrent Network: Circuit Graph vs. Unfolded Computation Graph [Goodfellow et al., 2016] http://www.deeplearningbook.org
x: input
h: hidden unit
θ: parameters, shared (tied) across all time steps t
time delay
Recurrent Neural Networks [Goodfellow et al., 2016] http://www.deeplearningbook.org
x: input
h: hidden unit
o: output
L: loss function
y: true label
Parameter Learning: Back-propagation through time (BPTT)
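A minimal NumPy sketch of the forward pass of this network, using the slide's naming (x input, h hidden unit, o output, L loss, y true label) and the usual tanh/softmax choices from Goodfellow et al.; the sizes and the exact parameterization (U, W, V, b, c) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 3, 5, 4, 6     # illustrative sizes

# Parameters are shared (tied) across all time steps t.
U = rng.normal(scale=0.1, size=(n_hidden, n_in))      # input -> hidden
W = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # hidden -> hidden
V = rng.normal(scale=0.1, size=(n_out, n_hidden))     # hidden -> output
b, c = np.zeros(n_hidden), np.zeros(n_out)

x = rng.normal(size=(T, n_in))        # input sequence x(1) ... x(T)
y = rng.integers(0, n_out, size=T)    # true labels, one per time step

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

h = np.zeros((T + 1, n_hidden))       # h[0] is the initial hidden state
L = 0.0
for t in range(T):
    h[t + 1] = np.tanh(U @ x[t] + W @ h[t] + b)   # hidden unit h(t)
    o = V @ h[t + 1] + c                          # output o(t)
    yhat = softmax(o)
    L -= np.log(yhat[y[t]])                       # cross-entropy loss against y(t)
print("total loss L:", L)
```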
Recurrent Hidden Units [Goodfellow et al., 2016] http://www.deeplearningbook.org
Remark: the recurrent neural network is universal in the sense that any function
computable by a Turing machine can be computed by such a recurrent network of a
finite size. (RNNs are Turing complete)
The output can be read from the RNN after a number of time steps that is
asymptotically linear in the number of time steps used by the Turing machine and in the
length of the input (Siegelmann and Sontag, 1991; Hyotyniemi, 1996)
A Turing machine = A finite state machine + An external tape
RNN with a Single Output [Goodfellow et al., 2016] http://www.deeplearningbook.org
Recurrence through only the Output [Goodfellow et al., 2016] http://www.deeplearningbook.org
This RNN is not Turing complete
unless the output o(t) is expressive enough to carry all the information in h(t).
Recurrence through only the Output: Teacher Forcing [Goodfellow et al., 2016] http://www.deeplearningbook.org
However, this model can be trained in parallel, without using the BPTT algorithm.
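A minimal sketch of teacher forcing for the output-recurrence model: during training, step t is conditioned on the true previous target y(t-1) rather than on the model's own output, so all time steps can be computed in parallel and no back-propagation through time is needed. The scalar-regression setup and the parameter names (U, R, v) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_in, n_hidden = 8, 3, 5

U = rng.normal(scale=0.1, size=(n_hidden, n_in))   # input -> hidden
R = rng.normal(scale=0.1, size=n_hidden)           # previous output -> hidden
v = rng.normal(scale=0.1, size=n_hidden)           # hidden -> (scalar) output

x = rng.normal(size=(T, n_in))   # input sequence
y = rng.normal(size=T)           # true targets

# Teacher forcing: condition step t on the TRUE previous target y[t-1],
# not on the model's own previous prediction.  With y given, the T steps
# no longer depend on each other, so they can all be computed in parallel
# and trained without BPTT.
y_prev = np.r_[0.0, y[:-1]]
h = np.tanh(x @ U.T + np.outer(y_prev, R))   # (T, n_hidden), all steps at once
o = h @ v                                    # predictions for every step
loss = 0.5 * np.sum((o - y) ** 2)
print("teacher-forcing loss:", loss)

# At test time the true y is unknown, so the model's own prediction o(t)
# must be fed back sequentially in place of y(t).
```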
Vector to Sequence [Goodfellow et al., 2016] http://www.deeplearningbook.org
E.g., image captioning: a single input vector (an image) is mapped to an output sequence (a caption)
Backpropagation in RNN [Goodfellow et al., 2016] http://www.deeplearningbook.org
Gradient of the loss L at o(t) for the i-th output (softmax output with cross-entropy loss):
(\nabla_{o^{(t)}} L)_i = \partial L / \partial o_i^{(t)} = \hat{y}_i^{(t)} - \mathbf{1}_{i = y^{(t)}}
Gradient of the loss L at h(\tau), where \tau is the last time step:
\nabla_{h^{(\tau)}} L = V^\top \nabla_{o^{(\tau)}} L
Gradient of the loss L at h(t) for t < \tau:
\nabla_{h^{(t)}} L = W^\top \mathrm{diag}\big(1 - (h^{(t+1)})^2\big)\, \nabla_{h^{(t+1)}} L + V^\top \nabla_{o^{(t)}} L
(the diag(...) factor is the derivative of tanh; the first term is the gradient from the future, the second term the gradient from the current output)
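A NumPy sketch of these backward equations, continuing the forward-pass sketch above; the parameter names and sizes are illustrative, and only the gradient of the shared weight matrix W is accumulated for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 3, 5, 4, 6
U = rng.normal(scale=0.1, size=(n_hidden, n_in))
W = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
V = rng.normal(scale=0.1, size=(n_out, n_hidden))
b, c = np.zeros(n_hidden), np.zeros(n_out)
x = rng.normal(size=(T, n_in))
y = rng.integers(0, n_out, size=T)

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

# Forward pass, storing h and the softmax outputs for the backward pass.
h = np.zeros((T + 1, n_hidden))
yhat = np.zeros((T, n_out))
for t in range(T):
    h[t + 1] = np.tanh(U @ x[t] + W @ h[t] + b)
    yhat[t] = softmax(V @ h[t + 1] + c)

# Back-propagation through time (BPTT).
dW = np.zeros_like(W)
dh_next = np.zeros(n_hidden)              # gradient arriving from the future
for t in reversed(range(T)):
    do = yhat[t].copy()
    do[y[t]] -= 1.0                       # (grad at o(t))_i = yhat_i - 1{i = y(t)}
    dh = V.T @ do                         # gradient from the current output
    if t < T - 1:                         # gradient from the future through h(t+1)
        dh += W.T @ ((1 - h[t + 2] ** 2) * dh_next)
    dW += np.outer((1 - h[t + 1] ** 2) * dh, h[t])   # shared-parameter gradient
    dh_next = dh
print("||dL/dW|| =", np.linalg.norm(dW))
```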
Bidirectional RNNs [Goodfellow et al., 2016] http://www.deeplearningbook.org
Backward
Forward
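A minimal NumPy sketch of a bidirectional RNN: one recurrence reads the sequence forward, a second reads it backward, and the two hidden states are concatenated at each step so that every position sees both past and future context. Parameter shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_in, n_hidden = 6, 3, 4

def run_rnn(x_seq, U, W, b):
    """Plain tanh RNN; returns the hidden state at every time step."""
    h = np.zeros(len(b))
    states = []
    for x_t in x_seq:
        h = np.tanh(U @ x_t + W @ h + b)
        states.append(h)
    return np.array(states)

shapes = [(n_hidden, n_in), (n_hidden, n_hidden), (n_hidden,)]
params_fwd = [rng.normal(scale=0.1, size=s) for s in shapes]   # forward-direction RNN
params_bwd = [rng.normal(scale=0.1, size=s) for s in shapes]   # backward-direction RNN

x = rng.normal(size=(T, n_in))
h_fwd = run_rnn(x, *params_fwd)                  # reads x(1) ... x(T)
h_bwd = run_rnn(x[::-1], *params_bwd)[::-1]      # reads x(T) ... x(1), then re-aligned
h_bi = np.concatenate([h_fwd, h_bwd], axis=1)    # each step sees past AND future context
print(h_bi.shape)                                # (T, 2 * n_hidden)
```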
Recurrent Neural Network (RNN): A Case Study
* Courtesy of Prof. Seungchul Lee (UNIST, http://isystems.unist.ac.kr/)
Recurrent Neural Network (RNN): Long/Short-Term Dependencies
RNNs can learn short-term dependencies.
It is very hard for a plain RNN to learn long-term dependencies.
Long Short-Term Memory (LSTM)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Cell state (long-term memory)
Output / hidden state (short-term memory)
Forget gate
Input gate
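A minimal NumPy sketch of one LSTM step following the standard formulation in the colah post cited above: the forget gate decides what to erase from the long-term cell state, the input gate decides what new content to write, and the output gate decides what part of the cell to expose as the short-term output. The sizes and parameter packing are illustrative.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step: c is the long-term cell state, h the short-term output."""
    Wf, Wi, Wo, Wc, bf, bi, bo, bc = params
    z = np.concatenate([h_prev, x_t])    # gates see the previous output and the current input
    f = sigmoid(Wf @ z + bf)             # forget gate: what to erase from the cell
    i = sigmoid(Wi @ z + bi)             # input gate: what new information to write
    c_tilde = np.tanh(Wc @ z + bc)       # candidate cell content
    c = f * c_prev + i * c_tilde         # additive cell update keeps a long-term path
    o = sigmoid(Wo @ z + bo)             # output gate: what part of the cell to expose
    h = o * np.tanh(c)                   # short-term output / hidden state
    return h, c

# Illustrative usage on a random sequence.
rng = np.random.default_rng(0)
n_in, n_hidden, T = 3, 5, 10
params = ([rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in)) for _ in range(4)]
          + [np.zeros(n_hidden) for _ in range(4)])
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(T, n_in)):
    h, c = lstm_step(x_t, h, c, params)
print("final short-term state h:", h)
```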
Long Short-Term Memory (LSTM): A Case Study
What are the Current/Future Issues?
Deep RNNs [Goodfellow et al., 2016] http://www.deeplearningbook.org
(a) Deep network for output
(b) Deep network for transition
(c) Deep network for transition with skip
Deep RNNs over Transition [Pascanu et al., 2014]
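A minimal NumPy sketch in the spirit of variants (b) and (c): the hidden-to-hidden transition is itself a small MLP, and a skip connection from h(t-1) to h(t) keeps a short path for gradients. The two-layer transition and the sizes are illustrative assumptions, not the exact architecture of Pascanu et al.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_mid, T = 3, 5, 8, 6

U  = rng.normal(scale=0.1, size=(n_mid, n_in))          # input -> first transition layer
W1 = rng.normal(scale=0.1, size=(n_mid, n_hidden))      # first layer of the deep transition
W2 = rng.normal(scale=0.1, size=(n_hidden, n_mid))      # second layer of the deep transition
S  = rng.normal(scale=0.1, size=(n_hidden, n_hidden))   # skip connection h(t-1) -> h(t)

x = rng.normal(size=(T, n_in))
h = np.zeros(n_hidden)
for t in range(T):
    mid = np.tanh(W1 @ h + U @ x[t])    # the hidden-to-hidden transition is itself an MLP ...
    h = np.tanh(W2 @ mid + S @ h)       # ... with a skip path that keeps one-step gradients short
print("final hidden state:", h)
```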
Challenges of Long-Term Dependencies [Goodfellow et al., 2016] http://www.deeplearningbook.org
RNN state update in a single time step (ignoring inputs and the nonlinearity): h^{(t)} = W^\top h^{(t-1)}
RNN after t time steps: h^{(t)} = (W^t)^\top h^{(0)}
When W admits an eigendecomposition of the form W = Q \Lambda Q^\top with orthogonal Q,
the RNN state after t time steps becomes h^{(t)} = Q^\top \Lambda^t Q\, h^{(0)}
Eigenvalues with magnitude below 1 decay to zero and those above 1 explode, so any component of h(0) not aligned with the largest eigenvector will be discarded eventually.
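A small numeric illustration of this effect, using an illustrative 2-D W with eigenvalues 1.05 and 0.9: the component of h(0) along the eigenvalue below 1 vanishes while the component along the eigenvalue above 1 grows, which is exactly how gradients vanish or explode over many steps.

```python
import numpy as np

# Illustrative 2-D recurrence h(t) = W^T h(t-1) with W = Q Lambda Q^T.
theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthogonal eigenvectors
Lam = np.diag([1.05, 0.9])                        # one eigenvalue > 1, one < 1
W = Q @ Lam @ Q.T

h0 = np.array([1.0, 1.0])
for t in (10, 50, 100):
    h_t = np.linalg.matrix_power(W.T, t) @ h0     # h(t) = (W^t)^T h(0)
    # Project onto the eigenvectors: the component along lambda = 0.9 decays
    # to zero while the component along lambda = 1.05 grows; gradients
    # propagated through many steps behave in exactly the same way.
    print(t, Q.T @ h_t)
```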
Residual Network (ResNet, He et al., 2015)
Residual learning: each block learns a residual F(x) and outputs F(x) + x
Comparison of ResNet results
3.6% (top-5) error in the ImageNet Challenge, 2015
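A minimal fully connected sketch of residual learning, y = F(x) + x; the actual ResNet blocks are convolutional with batch normalization, so this only illustrates the identity shortcut.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16

def residual_block(x, W1, W2):
    """y = F(x) + x: the block only needs to learn the residual F."""
    out = np.maximum(0.0, W1 @ x)   # ReLU layer
    out = W2 @ out
    return out + x                  # identity shortcut connection

W1 = rng.normal(scale=0.05, size=(dim, dim))
W2 = rng.normal(scale=0.05, size=(dim, dim))
x = rng.normal(size=dim)
print(residual_block(x, W1, W2)[:4])
```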
Densely Connected Convolutional Networks (DenseNet, Huang et al., 2016)
Better performance than ResNet:
CIFAR-10: 3.74% error (ResNet: 4.62%)
CIFAR-100: 19.25% error (ResNet: 22.71%)
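A minimal sketch of the dense connectivity pattern: each layer receives the concatenation of all earlier feature maps, so features are reused and gradients reach early layers directly. The fully connected layers, growth rate, and sizes are illustrative stand-ins for DenseNet's convolutional blocks.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, growth, n_layers = 8, 4, 3

features = [rng.normal(size=n_in)]            # the dense block's input
for layer in range(n_layers):
    concat = np.concatenate(features)          # every layer sees ALL previous outputs
    W = rng.normal(scale=0.1, size=(growth, concat.size))
    features.append(np.maximum(0.0, W @ concat))   # new feature map of width `growth`
print("output size:", np.concatenate(features).size)  # n_in + n_layers * growth
```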
Recurrent Convolutional Neural Networks (RCNN, Liang and Hu, 2015)
* Figure drawn by Subin Yi
Recurrent Convolutional Layer (RCL)
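A minimal 1-D NumPy sketch of a recurrent convolutional layer: the same recurrent kernel is applied repeatedly to the layer's own output and combined with a fixed feed-forward input, which enlarges the effective receptive field without adding parameters. The kernel sizes and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
length, n_iters = 64, 3

w_ff = rng.normal(scale=0.1, size=3)     # feed-forward 1-D kernel
w_rec = rng.normal(scale=0.1, size=3)    # recurrent 1-D kernel, shared across iterations

x = rng.normal(size=length)              # input from the layer below
ff = np.convolve(x, w_ff, mode="same")   # feed-forward response, computed once

state = np.maximum(0.0, ff)              # iteration 0: feed-forward input only
for _ in range(n_iters):
    # Each iteration applies the SAME recurrent kernel to the previous state,
    # which enlarges the effective receptive field without adding parameters.
    state = np.maximum(0.0, ff + np.convolve(state, w_rec, mode="same"))
print(state.shape)
```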
RCNN on EEG Analysis
One chunk of data: 3584 time samples × 32 channels
Event classes:
Hand Start
First Digit Touch
Lift Off
Replace
Both Released
* Joint work with Azamatbek Akhmedov
RCNN on EEG Analysis
Architecture (layer output sizes):
Convolutional layer: (1, 3584)
Max pooling → RCL: (1, 896)
Max pooling → RCL: (1, 224)
Max pooling → RCL: (1, 56)
Max pooling → RCL: (1, 14)
Max pooling → (1, 7)
Fully connected → 6 outputs
Accuracy: 97.687%
RCNN on EEG Analysis
256 filters of size 1×9
RCNN on EEG Analysis
Examples: Hand Start, First Digit Touch, Replace (a run of figure-only slides for each event)
Complex Hybrid System - Manufacturing
Deep Learning
Temperature prediction
400 time series
Thank you!
[email protected]