IIBMP2016: Representation Learning with Deep Generative Models
-
Preferred Networks
2016/9/29 IIBMP2016
-
l
l 2012 201420151500*
l
l
2014: 22-layer GoogLeNet [Google 2014]
*http://memkite.com/deep-learning-bibliography/
-
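• A unit computes its output h from inputs x_i and weights w_i
  inputs: x1, x2, x3 and a constant bias input +1
  weights: w1, w2, w3, w4
  h = a(x1 w1 + x2 w2 + x3 w3 + w4)
• a is the activation function, e.g. ReLU: a(x) = max(0, x)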
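A minimal sketch of the unit on this slide, assuming NumPy. The input and weight values are made up for illustration; `bias` plays the role of w4 on the constant +1 input.

```python
import numpy as np

def relu(x):
    """ReLU activation: max(0, x), applied elementwise."""
    return np.maximum(0.0, x)

def neuron(x, w, bias):
    """Weighted sum of the inputs plus a bias, passed through ReLU."""
    return relu(np.dot(x, w) + bias)

# Hypothetical example values (not from the slides).
x = np.array([1.0, 2.0, 3.0])      # x1, x2, x3
w = np.array([0.5, -1.0, 0.25])    # w1, w2, w3
h = neuron(x, w, bias=1.0)         # 0.5 - 2.0 + 0.75 + 1.0 = 0.25
h_clipped = neuron(x, w, bias=-1.0)  # pre-activation is negative, so ReLU gives 0
```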
-
[Figure: layered network mapping inputs x1, x2, x3 (with bias units +1 at each layer) to output y]
-
https://colah.github.io/posts/2014-03-NN-Manifolds-Topology/
-
2012 to 2014
AlexNet, Krizhevsky+, 2012, ImageNet winner (8 layers)
GoogLeNet, Szegedy+, 2014
-
2015
Recurrent NN: input word x_t, recurrent state h, output y_t for t = 1..4; BPTT length = 3
Stochastic Residual Net, Huang+, 2016
FractalNet, Larsson+, 2016
RoR, Zhang+, 2016
Dense CNN, Huang+, 2016
-
(1/4)
• l(y, y*) = (y - y*)^2
l I{wi}l
[Figure: network with inputs x1, x2, x3 and bias units +1; the loss l compares the output y with the target y*]
-
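(2/4)
• Gears A, B, C, D have 10, 8, 16 and 12 teeth. When A makes 1 turn, how far does D turn?
• When C makes 1 turn, D makes 16/12 turns: dD/dC = 16/12 (turning C by a small amount d turns D by 16/12 d)
• When B makes 1 turn: dC/dB = 8/16, so dD/dB = (16/12)(8/16) = 8/12
• dD/dA = (dD/dC)*(dC/dB)*(dB/dA) = (16/12)(8/16)(10/8) = 10/12. Answer: 10/12 turns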
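The gear example can be reproduced with exact fractions. This is only an illustrative sketch: the tooth counts are those on the slide, the helper name `local_ratio` is made up, and the rotation direction is ignored as on the slide.

```python
from fractions import Fraction

# Tooth counts from the slide.
teeth = {"A": 10, "B": 8, "C": 16, "D": 12}

def local_ratio(driver, driven):
    """Turns of the driven gear per turn of the driver (direction ignored)."""
    return Fraction(teeth[driver], teeth[driven])

dB_dA = local_ratio("A", "B")   # 10/8
dC_dB = local_ratio("B", "C")   # 8/16
dD_dC = local_ratio("C", "D")   # 16/12

# Chain rule: multiply the local ratios along the path A -> B -> C -> D.
dD_dB = dD_dC * dC_dB
dD_dA = dD_dC * dC_dB * dB_dA
```

The intermediate gears cancel, so only the tooth counts of A and D matter for the final ratio.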
-
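(3/4)
[Figure: network x1, x2, x3, +1 → r → s → y → loss l(y, y*), with weight w on the connection from r to s]
• Gradient of l with respect to y: ∂l/∂y
• Gradient of l with respect to s: ∂l/∂s = (∂l/∂y)(∂y/∂s)
• Gradient of l with respect to w: ∂l/∂w = (∂l/∂y)(∂y/∂s)(∂s/∂w), where ∂s/∂w = r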
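The chain-rule product on this slide can be checked numerically. The sketch below assumes a one-weight model s = w r, y = tanh(s), l = (y - y*)^2; the tanh activation and the numeric values are assumptions made for illustration, not taken from the slides.

```python
import numpy as np

def forward(w, r, y_star):
    """Forward pass: s = w * r, y = tanh(s) (assumed activation), squared-error loss."""
    s = w * r
    y = np.tanh(s)
    loss = (y - y_star) ** 2
    return s, y, loss

def grad_w(w, r, y_star):
    """Backpropagation: dl/dw = (dl/dy) * (dy/ds) * (ds/dw)."""
    _, y, _ = forward(w, r, y_star)
    dl_dy = 2.0 * (y - y_star)
    dy_ds = 1.0 - y ** 2        # derivative of tanh
    ds_dw = r                   # because s = w * r
    return dl_dy * dy_ds * ds_dw

def numeric_grad_w(w, r, y_star, eps=1e-6):
    """Central finite difference, used only to check grad_w."""
    _, _, loss_plus = forward(w + eps, r, y_star)
    _, _, loss_minus = forward(w - eps, r, y_star)
    return (loss_plus - loss_minus) / (2.0 * eps)

analytic = grad_w(0.3, 1.5, 0.2)
numeric = numeric_grad_w(0.3, 1.5, 0.2)
```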
-
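(4/4)
• For the objective L(θ), the gradient is v = ∂L(θ)/∂θ; moving θ along -v decreases L(θ)
• Update: θ_{t+1} := θ_t - α v_t with step size α > 0. Variants: Adam, RMSProp
[Figure: descent direction -v in the (θ1, θ2) plane]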
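A minimal sketch of the update rule, assuming a toy quadratic objective L(theta) = sum((theta - TARGET)^2). The objective, `TARGET`, the step size and the step count are all illustrative choices; Adam and RMSProp rescale the step per coordinate and are not shown.

```python
import numpy as np

# Hypothetical minimizer of the toy objective (not from the slides).
TARGET = np.array([1.0, -2.0])

def grad_L(theta):
    """Gradient v of the toy objective L(theta) = sum((theta - TARGET)^2)."""
    return 2.0 * (theta - TARGET)

def gradient_descent(theta0, step=0.1, n_steps=100):
    """Repeat theta <- theta - step * v, with a fixed positive step size."""
    theta = np.array(theta0, dtype=float)
    for _ in range(n_steps):
        v = grad_L(theta)
        theta = theta - step * v
    return theta

theta_hat = gradient_descent([5.0, 5.0])
```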
-
l
l
l 1000
l
-
Lin [Lin+ 16]l
l 1.
242.
3.
4.
-
(1/2)
l xzl
z
x
z, , (, [10, 2, -4], white
x
z
x
P(z|x)
-
(2/2)
l 1 c
l
l CG
z1
x
c
h
z2
h
-
l
z :
c
x
l xz
l
z
c
h
h
x
-
l
l x x xz
-
l NP NumberParameter SNPs
l
l
-
19/50
-
1. PCA
• PCA as a generative model:
  z ~ N(0, I)
  m(z) = Wz + μ (the mean of x given z)
  x | z ~ N(Wz + μ, σ² I)
  p(x) = ∫ p(x|z) p(z) dz
[Figure: z → x]
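The generative view of PCA can be simulated directly. In this sketch, assuming NumPy, the values of `W`, `mu` and `sigma` are arbitrary illustrations. Integrating out z gives a Gaussian with covariance W Wᵀ + σ² I, which the sample covariance should approach.

```python
import numpy as np

def sample_ppca(W, mu, sigma, n, rng):
    """Draw n samples: z ~ N(0, I), then x | z ~ N(W z + mu, sigma^2 I)."""
    d, k = W.shape
    z = rng.standard_normal((n, k))
    noise = sigma * rng.standard_normal((n, d))
    return z @ W.T + mu + noise

rng = np.random.default_rng(0)
# Hypothetical parameters: 2 latent dimensions, 3 observed dimensions.
W = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])
mu = np.array([1.0, 2.0, 3.0])
sigma = 0.1

x = sample_ppca(W, mu, sigma, n=200_000, rng=rng)
empirical_cov = np.cov(x, rowvar=False)
model_cov = W @ W.T + sigma**2 * np.eye(3)   # covariance of the marginal p(x)
```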
-
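2. ICA
• ICA as a generative model:
  z ~ Lap(λ)
  x | z ~ N(Wz + μ, σ² I)
  p(x) = ∫ p(x|z) p(z) dz
• When the independent components are sparse, k-means on x recovers the ICA filters [Vinnikov+ 14]
[Figure: z → x]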
-
• p(z) = ∏_i p(z_i) (Fisher)
• p(x) = ∫ p(x|z) p(z) dz
• z: Disentanglement
-
VAE2
784
1
-
l xz end-to-end
SVM
End-to-End
-
l
x
11
x
Linzx
-
26/50
l
-
27/50
l
-
l VAE GAN
l
-
l P(x)x MCMC
l P(x) P(x) P(x)L(x)
l x
-
x
P(x)
VAE
GAN
Q(x)/P(x)
Pixel CNN, WaveNet
-
VAE [Kingma+ 13]
[Figure: z → (μ, σ) = Dec(z; θ) → x ~ N(μ, σ)]
Generative process:
(1) z ~ N(0, I)
(2) (μ, σ) = Dec(z; θ)
(3) x ~ N(μ, σ I)
p(x) = ∫ p(x|z) p(z) dz
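The three sampling steps can be written as a short ancestral sampler. This is a sketch only: `toy_decoder` is a hypothetical one-layer stand-in for Dec with random, untrained weights, and sigma is treated as a per-dimension standard deviation.

```python
import numpy as np

def toy_decoder(z, params):
    """Hypothetical stand-in for Dec: one tanh layer returning (mu, sigma)."""
    h = np.tanh(z @ params["W_h"])
    mu = h @ params["W_mu"]
    sigma = np.exp(h @ params["W_logsigma"])   # exp keeps sigma positive
    return mu, sigma

def sample_x(params, n, rng):
    """Ancestral sampling: draw z from the prior, decode, then add Gaussian noise."""
    latent_dim = params["W_h"].shape[0]
    z = rng.standard_normal((n, latent_dim))          # (1) z ~ N(0, I)
    mu, sigma = toy_decoder(z, params)                # (2) (mu, sigma) = Dec(z)
    x = mu + sigma * rng.standard_normal(mu.shape)    # (3) x ~ N(mu, diag(sigma^2))
    return x, z

rng = np.random.default_rng(0)
# Random weights for illustration; a trained VAE would learn these.
params = {
    "W_h": rng.standard_normal((2, 8)),
    "W_mu": rng.standard_normal((8, 4)),
    "W_logsigma": 0.1 * rng.standard_normal((8, 4)),
}
x, z = sample_x(params, n=5, rng=rng)
```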
-
VAE
• Given p(x|z) and p(z), the marginal likelihood is p(x) = ∫ p(x|z) p(z) dz
-
VAE
ELBO: Evidence lower bound
q(z|x) approximates the posterior p(z|x)
The gap between log p(x) and the ELBO is KL(q(z|x) || p(z|x)); equality holds when q(z|x) = p(z|x)
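For reference, the standard decomposition behind the ELBO is restated here from [Kingma+ 13], since the slide's equations did not survive in the transcript. The first line is an identity; the inequality follows because the KL term is non-negative.

```latex
\begin{aligned}
\log p(x)
 &= \mathbb{E}_{q(z|x)}\!\left[\log \frac{p(x, z)}{q(z|x)}\right]
  + \mathrm{KL}\big(q(z|x)\,\|\,p(z|x)\big) \\
 &\ge \mathbb{E}_{q(z|x)}\big[\log p(x|z)\big]
  - \mathrm{KL}\big(q(z|x)\,\|\,p(z)\big)
\end{aligned}
```

The right-hand side of the second line is the ELBO: a reconstruction term minus a KL term that pulls q(z|x) toward the prior p(z).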
-
VAE (1/3)
Decoder: z → (μ, σ) = Dec(z; θ), x ~ N(μ, σ I)
Encoder: x → (μ, σ) = Enc(x; φ), z ~ N(μ, σ I)
q(z|x): the encoder's distribution over z given x
-
VAE (2/3)
[Figure: x → z → x']
Encode x into z, decode z into x', and train so that x' is close to x
KL(q(z|x) || p(z))
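When q(z|x) is a diagonal Gaussian N(mu, diag(sigma^2)) and p(z) = N(0, I), the KL term has a closed form. The sketch below assumes that diagonal parameterization, which is the usual choice but is not spelled out in the transcript; the function name is made up.

```python
import numpy as np

def kl_diag_gaussian_to_standard_normal(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - 2.0 * np.log(sigma))

kl_same = kl_diag_gaussian_to_standard_normal([0.0, 0.0], [1.0, 1.0])     # q equals the prior
kl_shifted = kl_diag_gaussian_to_standard_normal([1.0, 0.0], [1.0, 1.0])  # mean moved by 1
```

The term is zero only when q(z|x) equals the prior and grows as the encoder's mean or variance moves away from it.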
-
VAE (3/3)
Reconstruction error: ((x - μ)/σ)²
[Figure: x → z → x']
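The squared, scaled error on this slide is the data-dependent part of a Gaussian negative log-likelihood. A minimal sketch, assuming a diagonal Gaussian with mean mu and standard deviation sigma (the function name is illustrative):

```python
import numpy as np

def gaussian_nll(x, mu, sigma):
    """Negative log-density of x under N(mu, diag(sigma^2)), summed over dimensions."""
    x, mu, sigma = (np.asarray(a, dtype=float) for a in (x, mu, sigma))
    squared_error = ((x - mu) / sigma) ** 2
    return 0.5 * np.sum(squared_error + np.log(2.0 * np.pi * sigma**2))

nll_exact = gaussian_nll([0.0, 0.0], [0.0, 0.0], [1.0, 1.0])   # perfect reconstruction
nll_off = gaussian_nll([1.0, 0.0], [0.0, 0.0], [1.0, 1.0])     # one coordinate off by 1
```

With sigma fixed, minimizing this quantity is the same as minimizing the squared reconstruction error.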
-
VAE (3/3)
x'
z
x
-
VAE
z
-
VAE http://vdumoulin.github.io/morphing_faces/
l 29 x : z : 29
-
VAE [Kingma+ 14]
[Figure: x → (z, y) → x'; y is the digit label, 0 to 9]
-
VAE
l
l [Burda+ 15] [Maaloe+ 16]
l
-
(1/2)
• [Kingma+ 14]
• MNIST, digits 0 to 9
• 100 to 3000 labeled examples (100 labels: 10 per class)
• Error rate: 8.10% → 3.33% (M1+M2)
-
(2/2)
• ADGM [Maaloe+ 16]: 0.96% error with 100 labels
• For reference, SVM (RBF) with 50000 labels: 1.4%
-
VAE
l
l
-
GAN: Generative Adversarial Net [Goodfellow+ 14]
l l Generator
Discriminator
l Discriminator Generator
Generator
Discriminator
?
1/2
-
GAN
[Figure: z → x = G(z)]
(1) z ~ U(0, I)
(2) x = G(z)
p(z): Gaussian or uniform U
-
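GAN
• D(x) is trained to output 1 for real data and 0 for generated data
• D and G are trained against each other
  At equilibrium: ∫ p(z) G(z) dz = P(x), D(x) = 1/2
[Figure: z → x' = G(z); real x and generated x' → y = D(x) ∈ {1 (real), 0 (generated)}]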
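The equilibrium D(x) = 1/2 can be checked on a small discrete example. For fixed data and generator distributions, the optimal discriminator is p_data / (p_data + p_gen), a result from [Goodfellow+ 14]. The probability tables below are made up for illustration.

```python
import numpy as np

def optimal_discriminator(p_data, p_gen):
    """Best response of D to fixed distributions: p_data / (p_data + p_gen)."""
    return p_data / (p_data + p_gen)

def gan_value(p_data, p_gen, d):
    """GAN objective E_data[log D(x)] + E_gen[log(1 - D(x))] on a discrete space."""
    return np.sum(p_data * np.log(d)) + np.sum(p_gen * np.log(1.0 - d))

# Hypothetical distributions over four outcomes.
p_data = np.array([0.1, 0.2, 0.3, 0.4])
p_bad = np.array([0.4, 0.3, 0.2, 0.1])      # generator that does not match the data

d_bad = optimal_discriminator(p_data, p_bad)
d_matched = optimal_discriminator(p_data, p_data)   # generator equal to the data

value_bad = gan_value(p_data, p_bad, d_bad)
value_matched = gan_value(p_data, p_data, d_matched)
```

When the generator matches the data, D cannot do better than guessing and the value falls to its minimum, -log 4.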
-
GAN http://www.inference.vc/an-alternative-update-rule-for-generative-adversarial-networks/
D/
-
GAN https://github.com/mattya/chainer-DCGAN
-
2
51
-
1
52
-
53
-
GAN
l
l
-
GAN [Salimans+ 16]
l GAN 10000x/100%
20121
-
• p(x) = ∏_i p(x_i | x_1, x_2, ..., x_{i-1})
Pixel RNN/CNN [Oord+ 16a] [Oord+ 16b], WaveNet [Oord+ 16c]
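The factorization can be illustrated on short binary sequences. In this sketch, `p_next_is_one` is a hypothetical hand-written conditional; Pixel RNN/CNN and WaveNet use a neural network in its place. Because each conditional is normalized, the joint sums to 1 with no extra normalizing constant.

```python
import itertools
import numpy as np

def p_next_is_one(prefix):
    """Hypothetical conditional p(x_i = 1 | x_1..x_{i-1}) based on the ones seen so far."""
    return 1.0 / (1.0 + np.exp(-(sum(prefix) - 0.5 * len(prefix))))

def log_prob(x):
    """log p(x) = sum_i log p(x_i | x_1, ..., x_{i-1})."""
    total = 0.0
    for i, xi in enumerate(x):
        p1 = p_next_is_one(x[:i])
        total += np.log(p1 if xi == 1 else 1.0 - p1)
    return total

def sample(length, rng):
    """Generate one element at a time, each conditioned on the elements so far."""
    x = []
    for _ in range(length):
        x.append(int(rng.random() < p_next_is_one(x)))
    return x

# Summing p(x) over all 2^4 sequences of length 4 should give 1.
total_prob = sum(np.exp(log_prob(list(bits)))
                 for bits in itertools.product([0, 1], repeat=4))
draw = sample(6, np.random.default_rng(0))
```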
-
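[Kim+ 16]
• Define an energy E(x) on x: p_E(x) = exp(-E(x)) / N, where N is the normalizing constant
• Sampling from p(x) would normally need MCMC; instead, as in GAN, a generator G is trained so that its samples follow p_E(x)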
l p(x)
-
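[Li+ 15]
• If p(x) = q(x), then E_p(x)[φ(x)] = E_q(x)[φ(x)] for any feature map φ(x)
• Given samples x_i ~ p(x) and x'_j ~ q(x),
• minimize ((1/n) Σ_i φ(x_i) - (1/m) Σ_j φ(x'_j))², where φ(x) is handled through a kernel
• Unlike GAN, this needs no min-max optimization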
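The squared difference of feature means can be computed without forming the features explicitly, using a kernel k(a, b) = φ(a)·φ(b). A minimal sketch, assuming an RBF kernel with an arbitrary bandwidth; the sample data are synthetic and the function names are illustrative.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    """Pairwise RBF kernel values between the rows of a and the rows of b."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))

def mmd_squared(x, x_gen, bandwidth=1.0):
    """Biased estimate of ||mean phi(x) - mean phi(x_gen)||^2 via the kernel trick."""
    return (rbf_kernel(x, x, bandwidth).mean()
            - 2.0 * rbf_kernel(x, x_gen, bandwidth).mean()
            + rbf_kernel(x_gen, x_gen, bandwidth).mean())

rng = np.random.default_rng(0)
data = rng.standard_normal((200, 2))          # stands in for samples from p(x)
close = rng.standard_normal((200, 2))         # generator matching p(x)
far = rng.standard_normal((200, 2)) + 3.0     # generator with a shifted mean

mmd_close = mmd_squared(data, close)
mmd_far = mmd_squared(data, far)
```

A generator trained on this objective only minimizes; there is no discriminator to update in alternation.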
-
[Dahl+ 14]
-
Result:
l
l D
Community Learning
AUC values: 0.9387, 0.9413, 0.9274, 0.8913, 0.9214
-
microRNA binding: DeepTarget [Lee+ 16]
RNA, miRNA
RNN
-
l
l
l
-
• [Lin+ 16] Why does deep and cheap learning work so well?, H. W. Lin, M. Tegmark
• [Vinnikov+ 14] K-means Recovers ICA Filters when Independent Components are Sparse, ICML 2014, A. Vinnikov, S. Shalev-Shwartz
• [Kingma+ 13] Auto-Encoding Variational Bayes, D. P. Kingma, M. Welling
• [Kingma+ 14] Semi-supervised Learning with Deep Generative Models, D. P. Kingma, D. J. Rezende, S. Mohamed, M. Welling
• [Burda+ 15] Importance Weighted Autoencoders, Y. Burda, R. Grosse, R. Salakhutdinov
• [Maaloe+ 16] Auxiliary Deep Generative Models, L. Maaløe, C. K. Sønderby, S. K. Sønderby, O. Winther
• [Goodfellow+ 14] Generative Adversarial Networks, I. J. Goodfellow et al.
-
• [Salimans+ 16] Improved Techniques for Training GANs, T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, X. Chen
• [Oord+ 16a] Pixel Recurrent Neural Networks, A. van den Oord et al.
• [Oord+ 16b] Conditional Image Generation with PixelCNN Decoders, A. van den Oord et al.
• [Oord+ 16c] WaveNet: A Generative Model for Raw Audio, A. van den Oord et al.
• [Kim+ 16] Deep Directed Generative Models with Energy-Based Probability Estimation, T. Kim, Y. Bengio
• [Li+ 15] Generative Moment Matching Networks, Y. Li, K. Swersky, R. Zemel
• [Dahl+ 14] Multi-task Neural Networks for QSAR Predictions, G. E. Dahl, N. Jaitly, R. Salakhutdinov
• [Lee+ 16] DeepTarget: End-to-end Learning Framework for microRNA Target Prediction using Deep Recurrent Neural Networks, B. Lee, J. Baek, S. Park, S. Yoon
-
l Ql A
-
l l
1) 2)
-
l by
l
l
l p(x|z)p(z|x) p(z), p(x|z), p(x, z), p(x), p(z|x)