Deep Learning for
Time Series Data
UNIST
School of ECE
Jaesik Choi
http://sail.unist.ac.kr/tutorials
AlphaGo (https://deepmind.com/alpha-go)
AlphaGo defeated Go champion Lee Sedol 4-1 in their match (March 2016)
Go (AlphaGo)
What is the future of AI?
Artificial Intelligence
Automation of Knowledge Work
SOURCE: https://public.tableau.com/profile/mckinsey.analytics#!/vizhome/AutomationBySector/WhereMachinesCanReplaceHumans
Robot Task (Manipulation) Learning (Sergey Levine, 2015)
Financial Time Series Analysis
The Relational Automatic Statistician (UNIST)
An AI system that analyzes changes in, and relationships among, multiple time series
Diagram labels: stock data; multiple stock data; Bayesian multiple kernel learning; training; query; automatic report generation / change prediction; analysis of relationships among stocks
UNIST system: 40% lower prediction error than the MIT/Cambridge analysis system (June 2016)
What is Deep Learning?
Artificial Intelligence
Perceptron
Nonlinear Transform
Linearly Separable Classes in a Multilayer Perceptron
After a nonlinear transformation, red and blue become linearly separable*
* Y. LeCun, Y. Bengio, G. Hinton (2015). Deep Learning. Nature 521, 436-444.
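To make the idea concrete, here is a minimal NumPy sketch on illustrative synthetic data (not from the slides): two classes that are not linearly separable in 2-D become separable by a single linear threshold after lifting the points with a nonlinear feature, here the squared radius.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes that are NOT linearly separable in 2-D:
# "blue" points near the origin, "red" points in a surrounding ring.
blue = rng.normal(scale=0.3, size=(200, 2))
angles = rng.uniform(0.0, 2.0 * np.pi, 200)
radii = rng.uniform(1.5, 2.5, 200)
red = np.c_[radii * np.cos(angles), radii * np.sin(angles)]

X = np.vstack([blue, red])
y = np.r_[np.zeros(200), np.ones(200)]

# Nonlinear transform: append the squared radius as a third feature.
# In the lifted space the two classes become (almost perfectly) linearly separable.
Z = np.c_[X, (X ** 2).sum(axis=1)]

# A single linear threshold on the new coordinate now separates red from blue.
pred = (Z[:, 2] > 1.0).astype(float)
print("accuracy after the nonlinear transform:", (pred == y).mean())
```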
LeNet-5 (1989) vs GoogLeNet (2014)
LeNet-5: recognizing digits using a neural network with 5 layers
Recognizing human faces
What are Recurrent Neural Networks?
Recurrent Neural Network (RNN)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Recurrent Network: Circuit Graph vs. Unfolded Computation Graph [Goodfellow et al., 2016] http://www.deeplearningbook.org
x: input
h: hidden unit
θ: parameters, shared (tied) across all time steps t
time delay
Recurrent Neural Networks [Goodfellow et al., 2016] http://www.deeplearningbook.org
x: input
h: hidden unit
o: output
L: loss function
y: true label
Parameter Learning: Back-propagation through time (BPTT)
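A minimal NumPy sketch of the forward pass of this network, using the slide's naming (x input, h hidden unit, o output, L loss, y true label) and the usual tanh/softmax choices from Goodfellow et al.; the sizes and the exact parameterization (U, W, V, b, c) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 3, 5, 4, 6     # illustrative sizes

# Parameters are shared (tied) across all time steps t.
U = rng.normal(scale=0.1, size=(n_hidden, n_in))      # input -> hidden
W = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # hidden -> hidden
V = rng.normal(scale=0.1, size=(n_out, n_hidden))     # hidden -> output
b, c = np.zeros(n_hidden), np.zeros(n_out)

x = rng.normal(size=(T, n_in))        # input sequence x(1) ... x(T)
y = rng.integers(0, n_out, size=T)    # true labels, one per time step

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

h = np.zeros((T + 1, n_hidden))       # h[0] is the initial hidden state
L = 0.0
for t in range(T):
    h[t + 1] = np.tanh(U @ x[t] + W @ h[t] + b)   # hidden unit h(t)
    o = V @ h[t + 1] + c                          # output o(t)
    yhat = softmax(o)
    L -= np.log(yhat[y[t]])                       # cross-entropy loss against y(t)
print("total loss L:", L)
```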
Recurrent Hidden Units [Goodfellow et al., 2016] http://www.deeplearningbook.org
Remark: the recurrent neural network is universal in the sense that any function
computable by a Turing machine can be computed by such a recurrent network of a
finite size. (RNNs are Turing complete)
The output can be read from the RNN after a number of time steps that is
asymptotically linear in the number of time steps used by the Turing machine and in the
length of the input (Siegelmann and Sontag, 1991; Hyotyniemi, 1996)
A Turing machine = A finite state machine + An external tape
RNN with a Single Output [Goodfellow et al., 2016] http://www.deeplearningbook.org
Recurrence through only the Output [Goodfellow et al., 2016] http://www.deeplearningbook.org
This RNN is not Turing complete
unless the output o(t) is expressive enough to carry all the information in h(t).
Recurrence through only the Output: Teacher Forcing [Goodfellow et al., 2016] http://www.deeplearningbook.org
However, this model can be trained in parallel, without using the BPTT algorithm.
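A minimal sketch of teacher forcing for the output-recurrence model: during training, step t is conditioned on the true previous target y(t-1) rather than on the model's own output, so all time steps can be computed in parallel and no back-propagation through time is needed. The scalar-regression setup and the parameter names (U, R, v) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_in, n_hidden = 8, 3, 5

U = rng.normal(scale=0.1, size=(n_hidden, n_in))   # input -> hidden
R = rng.normal(scale=0.1, size=n_hidden)           # previous output -> hidden
v = rng.normal(scale=0.1, size=n_hidden)           # hidden -> (scalar) output

x = rng.normal(size=(T, n_in))   # input sequence
y = rng.normal(size=T)           # true targets

# Teacher forcing: condition step t on the TRUE previous target y[t-1],
# not on the model's own previous prediction.  With y given, the T steps
# no longer depend on each other, so they can all be computed in parallel
# and trained without BPTT.
y_prev = np.r_[0.0, y[:-1]]
h = np.tanh(x @ U.T + np.outer(y_prev, R))   # (T, n_hidden), all steps at once
o = h @ v                                    # predictions for every step
loss = 0.5 * np.sum((o - y) ** 2)
print("teacher-forcing loss:", loss)

# At test time the true y is unknown, so the model's own prediction o(t)
# must be fed back sequentially in place of y(t).
```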
Vector to Sequence [Goodfellow et al., 2016] http://www.deeplearningbook.org
E.g., image captioning: a single input vector (an image) is mapped to an output sequence (a caption)
Backpropagation in RNN [Goodfellow et al., 2016] http://www.deeplearningbook.org
Gradient of the loss L at o(t) for the i-th output (softmax output with cross-entropy loss):
(\nabla_{o^{(t)}} L)_i = \partial L / \partial o_i^{(t)} = \hat{y}_i^{(t)} - \mathbf{1}_{i = y^{(t)}}
Gradient of the loss L at h(\tau), where \tau is the last time step:
\nabla_{h^{(\tau)}} L = V^\top \nabla_{o^{(\tau)}} L
Gradient of the loss L at h(t) for t < \tau:
\nabla_{h^{(t)}} L = W^\top \mathrm{diag}\big(1 - (h^{(t+1)})^2\big)\, \nabla_{h^{(t+1)}} L + V^\top \nabla_{o^{(t)}} L
(the diag(...) factor is the derivative of tanh; the first term is the gradient from the future, the second term the gradient from the current output)
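A NumPy sketch of these backward equations, continuing the forward-pass sketch above; the parameter names and sizes are illustrative, and only the gradient of the shared weight matrix W is accumulated for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 3, 5, 4, 6
U = rng.normal(scale=0.1, size=(n_hidden, n_in))
W = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
V = rng.normal(scale=0.1, size=(n_out, n_hidden))
b, c = np.zeros(n_hidden), np.zeros(n_out)
x = rng.normal(size=(T, n_in))
y = rng.integers(0, n_out, size=T)

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

# Forward pass, storing h and the softmax outputs for the backward pass.
h = np.zeros((T + 1, n_hidden))
yhat = np.zeros((T, n_out))
for t in range(T):
    h[t + 1] = np.tanh(U @ x[t] + W @ h[t] + b)
    yhat[t] = softmax(V @ h[t + 1] + c)

# Back-propagation through time (BPTT).
dW = np.zeros_like(W)
dh_next = np.zeros(n_hidden)              # gradient arriving from the future
for t in reversed(range(T)):
    do = yhat[t].copy()
    do[y[t]] -= 1.0                       # (grad at o(t))_i = yhat_i - 1{i = y(t)}
    dh = V.T @ do                         # gradient from the current output
    if t < T - 1:                         # gradient from the future through h(t+1)
        dh += W.T @ ((1 - h[t + 2] ** 2) * dh_next)
    dW += np.outer((1 - h[t + 1] ** 2) * dh, h[t])   # shared-parameter gradient
    dh_next = dh
print("||dL/dW|| =", np.linalg.norm(dW))
```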
Bidirectional RNNs [Goodfellow et al., 2016] http://www.deeplearningbook.org
Backward
Forward
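A minimal NumPy sketch of a bidirectional RNN: one recurrence reads the sequence forward, a second reads it backward, and the two hidden states are concatenated at each step so that every position sees both past and future context. Parameter shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_in, n_hidden = 6, 3, 4

def run_rnn(x_seq, U, W, b):
    """Plain tanh RNN; returns the hidden state at every time step."""
    h = np.zeros(len(b))
    states = []
    for x_t in x_seq:
        h = np.tanh(U @ x_t + W @ h + b)
        states.append(h)
    return np.array(states)

shapes = [(n_hidden, n_in), (n_hidden, n_hidden), (n_hidden,)]
params_fwd = [rng.normal(scale=0.1, size=s) for s in shapes]   # forward-direction RNN
params_bwd = [rng.normal(scale=0.1, size=s) for s in shapes]   # backward-direction RNN

x = rng.normal(size=(T, n_in))
h_fwd = run_rnn(x, *params_fwd)                  # reads x(1) ... x(T)
h_bwd = run_rnn(x[::-1], *params_bwd)[::-1]      # reads x(T) ... x(1), then re-aligned
h_bi = np.concatenate([h_fwd, h_bwd], axis=1)    # each step sees past AND future context
print(h_bi.shape)                                # (T, 2 * n_hidden)
```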
Recurrent Neural Network (RNN): A Case Study
* Courtesy of Prof. Seungchul Lee (UNIST, http://isystems.unist.ac.kr/)
Recurrent Neural Network (RNN): Long/Short-Term Dependencies
RNNs can learn short-term dependencies.
It is very hard for a plain RNN to learn long-term dependencies.
Long Short-Term Memory (LSTM)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Cell state (long-term memory)
Output / hidden state (short-term memory)
Forget gate
Input gate
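A minimal NumPy sketch of one LSTM step following the standard formulation in the colah post cited above: the forget gate decides what to erase from the long-term cell state, the input gate decides what new content to write, and the output gate decides what part of the cell to expose as the short-term output. The sizes and parameter packing are illustrative.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step: c is the long-term cell state, h the short-term output."""
    Wf, Wi, Wo, Wc, bf, bi, bo, bc = params
    z = np.concatenate([h_prev, x_t])    # gates see the previous output and the current input
    f = sigmoid(Wf @ z + bf)             # forget gate: what to erase from the cell
    i = sigmoid(Wi @ z + bi)             # input gate: what new information to write
    c_tilde = np.tanh(Wc @ z + bc)       # candidate cell content
    c = f * c_prev + i * c_tilde         # additive cell update keeps a long-term path
    o = sigmoid(Wo @ z + bo)             # output gate: what part of the cell to expose
    h = o * np.tanh(c)                   # short-term output / hidden state
    return h, c

# Illustrative usage on a random sequence.
rng = np.random.default_rng(0)
n_in, n_hidden, T = 3, 5, 10
params = ([rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in)) for _ in range(4)]
          + [np.zeros(n_hidden) for _ in range(4)])
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(T, n_in)):
    h, c = lstm_step(x_t, h, c, params)
print("final short-term state h:", h)
```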
Long Short-Term Memory (LSTM): A Case Study
What are the Current/Future Issues?
Deep RNNs [Goodfellow et al., 2016] http://www.deeplearningbook.org
(a) Deep network for output
(b) Deep network for transition
(c) Deep network for transition with skip
Deep RNNs over Transition [Pascanu et al., 2014]
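A minimal NumPy sketch in the spirit of variants (b) and (c): the hidden-to-hidden transition is itself a small MLP, and a skip connection from h(t-1) to h(t) keeps a short path for gradients. The two-layer transition and the sizes are illustrative assumptions, not the exact architecture of Pascanu et al.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_mid, T = 3, 5, 8, 6

U  = rng.normal(scale=0.1, size=(n_mid, n_in))          # input -> first transition layer
W1 = rng.normal(scale=0.1, size=(n_mid, n_hidden))      # first layer of the deep transition
W2 = rng.normal(scale=0.1, size=(n_hidden, n_mid))      # second layer of the deep transition
S  = rng.normal(scale=0.1, size=(n_hidden, n_hidden))   # skip connection h(t-1) -> h(t)

x = rng.normal(size=(T, n_in))
h = np.zeros(n_hidden)
for t in range(T):
    mid = np.tanh(W1 @ h + U @ x[t])    # the hidden-to-hidden transition is itself an MLP ...
    h = np.tanh(W2 @ mid + S @ h)       # ... with a skip path that keeps one-step gradients short
print("final hidden state:", h)
```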
Challenges of Long-Term Dependencies [Goodfellow et al., 2016] http://www.deeplearningbook.org
RNN state update in a single time step (ignoring inputs and the nonlinearity): h^{(t)} = W^\top h^{(t-1)}
RNN after t time steps: h^{(t)} = (W^t)^\top h^{(0)}
When W admits an eigendecomposition of the form W = Q \Lambda Q^\top with orthogonal Q,
the RNN state after t time steps becomes h^{(t)} = Q^\top \Lambda^t Q\, h^{(0)}
Eigenvalues with magnitude below 1 decay to zero and those above 1 explode, so any component of h(0) not aligned with the largest eigenvector will be discarded eventually.
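A small numeric illustration of this effect, using an illustrative 2-D W with eigenvalues 1.05 and 0.9: the component of h(0) along the eigenvalue below 1 vanishes while the component along the eigenvalue above 1 grows, which is exactly how gradients vanish or explode over many steps.

```python
import numpy as np

# Illustrative 2-D recurrence h(t) = W^T h(t-1) with W = Q Lambda Q^T.
theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthogonal eigenvectors
Lam = np.diag([1.05, 0.9])                        # one eigenvalue > 1, one < 1
W = Q @ Lam @ Q.T

h0 = np.array([1.0, 1.0])
for t in (10, 50, 100):
    h_t = np.linalg.matrix_power(W.T, t) @ h0     # h(t) = (W^t)^T h(0)
    # Project onto the eigenvectors: the component along lambda = 0.9 decays
    # to zero while the component along lambda = 1.05 grows; gradients
    # propagated through many steps behave in exactly the same way.
    print(t, Q.T @ h_t)
```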
Residual Network (ResNet, He et al., 2015)
Residual learning: each block learns a residual F(x) and outputs F(x) + x
Comparison of ResNet results
3.6% (top-5) error in the ImageNet Challenge, 2015
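A minimal fully connected sketch of residual learning, y = F(x) + x; the actual ResNet blocks are convolutional with batch normalization, so this only illustrates the identity shortcut.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16

def residual_block(x, W1, W2):
    """y = F(x) + x: the block only needs to learn the residual F."""
    out = np.maximum(0.0, W1 @ x)   # ReLU layer
    out = W2 @ out
    return out + x                  # identity shortcut connection

W1 = rng.normal(scale=0.05, size=(dim, dim))
W2 = rng.normal(scale=0.05, size=(dim, dim))
x = rng.normal(size=dim)
print(residual_block(x, W1, W2)[:4])
```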
Densely Connected Convolutional Networks (DenseNet, Huang et al., 2016)
Better performance than ResNet:
CIFAR-10: 3.74% error (ResNet: 4.62%)
CIFAR-100: 19.25% error (ResNet: 22.71%)
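A minimal sketch of the dense connectivity pattern: each layer receives the concatenation of all earlier feature maps, so features are reused and gradients reach early layers directly. The fully connected layers, growth rate, and sizes are illustrative stand-ins for DenseNet's convolutional blocks.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, growth, n_layers = 8, 4, 3

features = [rng.normal(size=n_in)]            # the dense block's input
for layer in range(n_layers):
    concat = np.concatenate(features)          # every layer sees ALL previous outputs
    W = rng.normal(scale=0.1, size=(growth, concat.size))
    features.append(np.maximum(0.0, W @ concat))   # new feature map of width `growth`
print("output size:", np.concatenate(features).size)  # n_in + n_layers * growth
```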
Recurrent Convolutional Neural Networks (RCNN, Liang and Hu, 2015)
* Figure drawn by Subin Yi
Recurrent Convolutional Layer (RCL)
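A minimal 1-D NumPy sketch of a recurrent convolutional layer: the same recurrent kernel is applied repeatedly to the layer's own output and combined with a fixed feed-forward input, which enlarges the effective receptive field without adding parameters. The kernel sizes and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
length, n_iters = 64, 3

w_ff = rng.normal(scale=0.1, size=3)     # feed-forward 1-D kernel
w_rec = rng.normal(scale=0.1, size=3)    # recurrent 1-D kernel, shared across iterations

x = rng.normal(size=length)              # input from the layer below
ff = np.convolve(x, w_ff, mode="same")   # feed-forward response, computed once

state = np.maximum(0.0, ff)              # iteration 0: feed-forward input only
for _ in range(n_iters):
    # Each iteration applies the SAME recurrent kernel to the previous state,
    # which enlarges the effective receptive field without adding parameters.
    state = np.maximum(0.0, ff + np.convolve(state, w_rec, mode="same"))
print(state.shape)
```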
RCNN on EEG Analysis
One chunk of data: 3584 time samples × 32 channels
Event classes:
Hand Start
First Digit Touch
Lift Off
Replace
Both Released
* Joint work with Azamatbek Akhmedov
RCNN on EEG Analysis
Architecture (layer output sizes):
Convolutional layer: (1, 3584)
Max pooling → RCL: (1, 896)
Max pooling → RCL: (1, 224)
Max pooling → RCL: (1, 56)
Max pooling → RCL: (1, 14)
Max pooling → (1, 7)
Fully connected → 6 outputs
Accuracy: 97.687%
RCNN on EEG Analysis
256 filters of size 1×9
RCNN on EEG Analysis
Examples: Hand Start, First Digit Touch, Replace (a run of figure-only slides for each event)
Complex Hybrid System - Manufacturing
Deep Learning
Temperature prediction
400 time series
Thank you!
[email protected]