Deep Learning (深層学習, by 岡本孝之): deep learning chap.4_2

Deep Learning (深層学習), by 岡本 孝之. NAIST Computational Linguistic Lab, D1 Masayoshi Kondo. Chapter 4, second half

Upload: masayoshi-kondo

Post on 21-Jan-2018


Category:

Data & Analytics



TRANSCRIPT

  1. Deep Learning (深層学習), by 岡本 孝之. NAIST Computational Linguistic Lab, D1 Masayoshi Kondo. Chapter 4, second half.
  2. 00: Deep Learning: 165
  3. XX:
  4. 00: (CNN) (RNN)
  5. 00: (Deep Learning)
  6. 00: (3) overfitting
  7. 00:
  8. Chapter 4: 4.1, 4.2, 4.3, 4.4, 4.5
  9. Chapter 4: 4.1, 4.2, 4.3, 4.4, 4.5
  10. (13) Q: Ans:
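  11. Gradient descent for a DNN. The gradient of the error $E$ with respect to the weights $\mathbf{w} = [w_1, \ldots, w_M]^{\top}$ is $\nabla E \equiv \frac{\partial E}{\partial \mathbf{w}} = \left[ \frac{\partial E}{\partial w_1} \cdots \frac{\partial E}{\partial w_M} \right]^{\top}$, and the weights are updated as $\mathbf{w}^{(t+1)} = \mathbf{w}^{(t)} - \epsilon \nabla E$ with learning rate $\epsilon$. (For RNNs: BPTT.) Summary: training a DNN needs the derivative of $E$ with respect to the weights $W$.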
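The update rule on slide 11 can be sketched in a few lines. This is a minimal illustration, not the slide author's code: the toy error function `grad_E`, the step size `eps=0.1`, and the iteration count are all assumptions chosen so the result is easy to check by hand.

```python
def grad_E(w):
    """Gradient of a toy error E(w) = sum_i (w_i - i)^2 (hypothetical choice)."""
    return [2.0 * (wi - i) for i, wi in enumerate(w)]


def gradient_descent(w, eps=0.1, steps=200):
    """Repeat the update w(t+1) = w(t) - eps * grad E(w(t))."""
    for _ in range(steps):
        g = grad_E(w)
        w = [wi - eps * gi for wi, gi in zip(w, g)]
    return w


# The minimum of the toy error is at w = [0, 1, 2].
w_final = gradient_descent([5.0, -3.0, 0.5])
```

For this quadratic error each step shrinks the distance to the minimum by a constant factor, so the choice of `eps` only affects how fast `w_final` approaches `[0, 1, 2]`.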
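  12. Error backpropagation (back propagation) for one sample $\mathbf{x}_n$, with $\delta_j^{(l)} \equiv \frac{\partial E_n}{\partial u_j^{(l)}}$:
      1. Set $\mathbf{z}^{(1)} = \mathbf{x}_n$ and compute $\mathbf{u}^{(l)}$ and $\mathbf{z}^{(l)}$ for each layer $l$ in turn.
      2. Compute the output-layer deltas $\delta_j^{(L)}$.
      3. For each hidden layer $l$ $(= L-1, L-2, L-3, \ldots, 4, 3, 2)$, compute $\delta_j^{(l)} = \sum_k \delta_k^{(l+1)} w_{kj}^{(l+1)} f'(u_j^{(l)})$, working backwards from $\delta_j^{(L)}$.
      4. For each layer $l$ $(= 2, 3, 4, \ldots, L-2, L-1, L)$, compute the derivative with respect to $w_{ji}^{(l)}$: $\frac{\partial E_n}{\partial w_{ji}^{(l)}} = \delta_j^{(l)} z_i^{(l-1)}$.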
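  13. The total error is the sum of the per-sample errors, $E = \sum_n E_n$, so its gradient is the sum of the per-sample gradients: $\frac{\partial E}{\partial w_{ji}^{(l)}} = \sum_n \frac{\partial E_n}{\partial w_{ji}^{(l)}}$.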
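The four steps on slide 12 and the sum on slide 13 can be sketched with NumPy as follows. The details are assumptions made for brevity rather than anything stated on the slides: logistic hidden units, a linear output layer with squared error, no biases, and the hypothetical helper names `backprop_sample` and `minibatch_gradient`.

```python
import numpy as np


def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))


def backprop_sample(x, d, Ws):
    """Return dEn/dW^(l) for one sample, with En = 0.5 * ||y - d||^2.

    Ws[k] holds W^(l) for l = k + 2 (the input layer, l = 1, has no weights).
    """
    # Step 1: forward pass, z^(1) = x, then u^(l) and z^(l) layer by layer.
    zs = [x]
    for k, W in enumerate(Ws):
        u = W @ zs[-1]
        is_output = k == len(Ws) - 1
        zs.append(u if is_output else sigmoid(u))
    # Step 2: output-layer delta (linear output, squared error).
    delta = zs[-1] - d
    grads = [None] * len(Ws)
    for k in reversed(range(len(Ws))):
        # Step 4: dEn/dw_ji^(l) = delta_j^(l) * z_i^(l-1).
        grads[k] = np.outer(delta, zs[k])
        if k > 0:
            # Step 3: delta_j^(l) = sum_k delta_k^(l+1) w_kj^(l+1) f'(u_j^(l)).
            f_prime = zs[k] * (1.0 - zs[k])  # logistic derivative
            delta = (Ws[k].T @ delta) * f_prime
    return grads


def minibatch_gradient(X, D, Ws):
    """Sum the per-sample gradients: dE/dW = sum_n dEn/dW."""
    total = [np.zeros_like(W) for W in Ws]
    for x, d in zip(X, D):
        for acc, g in zip(total, backprop_sample(x, d, Ws)):
            acc += g
    return total


rng = np.random.default_rng(0)
Ws = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
X = rng.normal(size=(5, 3))   # 5 samples, 3 inputs each
D = rng.normal(size=(5, 2))   # 5 targets, 2 outputs each
grads = minibatch_gradient(X, D, Ws)
```

Steps 3 and 4 are interleaved in one backward loop here; computing all deltas first and all gradients afterwards, as the slide lists them, gives the same result.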
  14. Chapter 4: 4.1, 4.2, 4.3, 4.4, 4.5
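  15. 01: 4.4.1 Output-layer deltas.
      [Regression] $E_n = \frac{1}{2} \sum_j (y_j - d_j)^2$, $y_j = z_j^{(L)} = u_j^{(L)}$, $\delta_j^{(L)} = y_j - d_j$.
      [Binary classification] $E_n = -\left[ d \log y + (1 - d) \log(1 - y) \right]$, $y = \frac{1}{1 + \exp(-u)}$, $\delta^{(L)} = y - d$.
  16. 02: 4.4.1 Output-layer deltas.
      [Multi-class classification] $E_n = -\sum_k d_k \log y_k$, $y_k = \frac{\exp(u_k^{(L)})}{\sum_i \exp(u_i^{(L)})}$, $\delta_j^{(L)} = y_j - d_j$.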
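The result $\delta_j^{(L)} = y_j - d_j$ for a softmax output with cross-entropy error can be checked numerically. The sketch below compares it with a central difference of $E_n$ with respect to each $u_j^{(L)}$; the input values and the helper names are illustrative assumptions.

```python
import math


def softmax(u):
    m = max(u)  # subtract the maximum for numerical stability
    exps = [math.exp(uk - m) for uk in u]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(u, d):
    """En = -sum_k d_k log y_k with y = softmax(u)."""
    y = softmax(u)
    return -sum(dk * math.log(yk) for dk, yk in zip(d, y))


def output_delta(u, d):
    """Analytic delta_j^(L) = y_j - d_j."""
    return [yj - dj for yj, dj in zip(softmax(u), d)]


def numeric_delta(u, d, h=1e-6):
    """Central-difference estimate of dEn/du_j, one unit at a time."""
    out = []
    for j in range(len(u)):
        up = list(u)
        up[j] += h
        um = list(u)
        um[j] -= h
        out.append((cross_entropy(up, d) - cross_entropy(um, d)) / (2 * h))
    return out


u = [0.5, -1.2, 2.0]   # hypothetical output-layer inputs
d = [0.0, 0.0, 1.0]    # one-hot target
analytic = output_delta(u, d)
numeric = numeric_delta(u, d)
```

The two lists agree to within the accuracy of the difference quotient, and the same kind of check applies to the regression and binary classification cases.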
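  17. 03: 4.4.2 Matrix notation for $N$ samples: $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_N]$, $\mathbf{D} = [\mathbf{d}_1, \ldots, \mathbf{d}_N]$, $\mathbf{Y} = [\mathbf{y}_1, \ldots, \mathbf{y}_N]$, $\mathbf{U}^{(l)} = [\mathbf{u}_1^{(l)}, \ldots, \mathbf{u}_N^{(l)}]$, $\mathbf{Z}^{(l)} = [\mathbf{z}_1^{(l)}, \ldots, \mathbf{z}_N^{(l)}]$.
      $\mathbf{x}_n$: input of sample $n$; $\mathbf{u}_n^{(l)}$: input to layer $l$ for sample $n$; $\mathbf{z}_n^{(l)}$: output of layer $l$ for sample $n$; $\mathbf{W}$: weight matrix whose $(j, i)$ element is $w_{ji}$; $\mathbf{b}$: bias vector.
  18. 04: 4.4.2 Forward pass: $\mathbf{U}^{(l)} = \mathbf{W}^{(l)} \mathbf{Z}^{(l-1)} + \mathbf{b}^{(l)} \mathbf{1}_N^{\top}$ and $\mathbf{Z}^{(l)} = f^{(l)}(\mathbf{U}^{(l)})$, applied layer by layer up to $l = L$. The input layer holds the data, $\mathbf{Z}^{(1)} = \mathbf{X}$, matching $\mathbf{z}^{(1)} = \mathbf{x}_n$ on slide 12. Here $\mathbf{1}_N$ is the column vector of $N$ ones.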
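  19. 05: 4.4.2 Backward pass. Step 1: compute the deltas $\boldsymbol{\Delta}^{(l)}$ of each layer, $\boldsymbol{\Delta}^{(l)} = f^{(l)\prime}(\mathbf{U}^{(l)}) \odot \left( \mathbf{W}^{(l+1)\top} \boldsymbol{\Delta}^{(l+1)} \right)$, where $\odot$ is the element-wise product and $\boldsymbol{\Delta}^{(l)}$ is the matrix of the deltas $\delta_j^{(l)}$ of layer $l$ for the samples $n = 1, 2, \ldots, N$. For the output layer, $\boldsymbol{\Delta}^{(L)} = \mathbf{Y} - \mathbf{D}$.
      Activation functions $f(u)$ and their derivatives $f'(u)$:
      - $f(u) = \frac{1}{1 + e^{-u}}$: $f'(u) = f(u)(1 - f(u))$
      - $f(u) = \tanh(u)$: $f'(u) = 1 - \tanh^2(u)$
      - $f(u) = \max(u, 0)$: $f'(u) = 1$ for $u \ge 0$, and $0$ for $u < 0$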
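  20. 06: 4.4.2 Step 2: compute the gradients of each layer, $\partial \mathbf{W}^{(l)} = \frac{1}{N} \boldsymbol{\Delta}^{(l)} \mathbf{Z}^{(l-1)\top}$ and $\partial \mathbf{b}^{(l)} = \frac{1}{N} \boldsymbol{\Delta}^{(l)} \mathbf{1}_N$, with $\mathbf{1}_N$ the column vector of $N$ ones.
      $\partial \mathbf{W}^{(l)}$: matrix whose $(j, i)$ element is the derivative with respect to $w_{ji}^{(l)}$; $\partial \mathbf{b}^{(l)}$: vector whose $j$-th element is the derivative with respect to $b_j^{(l)}$.
      Update amounts for each layer $l$: $\Delta \mathbf{W}^{(l)} = -\epsilon \, \partial \mathbf{W}^{(l)}$, $\Delta \mathbf{b}^{(l)} = -\epsilon \, \partial \mathbf{b}^{(l)}$, with learning rate $\epsilon$.
      [With momentum $\mu$ and weight decay $\lambda$] $\Delta \mathbf{W}^{(l)} = \mu \, \Delta \mathbf{W}^{(l)\prime} - \epsilon \left( \partial \mathbf{W}^{(l)} + \lambda \mathbf{W}^{(l)} \right)$, $\Delta \mathbf{b}^{(l)} = \mu \, \Delta \mathbf{b}^{(l)\prime} - \epsilon \, \partial \mathbf{b}^{(l)}$, where $\Delta \mathbf{W}^{(l)\prime}$ and $\Delta \mathbf{b}^{(l)\prime}$ are the update amounts from the previous step.
  21. 07: 4.4.2 Step 3: update the parameters, $\mathbf{W}^{(l)} \leftarrow \mathbf{W}^{(l)} + \Delta \mathbf{W}^{(l)}$, $\mathbf{b}^{(l)} \leftarrow \mathbf{b}^{(l)} + \Delta \mathbf{b}^{(l)}$.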
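Slides 18 to 21 together describe one complete parameter update in matrix form. The sketch below strings them together with NumPy, under assumptions that are not on the slides: logistic hidden layers, a linear output with squared error (so the output-layer delta is $\mathbf{Y} - \mathbf{D}$), a hypothetical function name `train_step`, and arbitrary values for `eps`, `mu` and `lam`.

```python
import numpy as np


def logistic(U):
    return 1.0 / (1.0 + np.exp(-U))


def train_step(X, D, Ws, bs, vWs, vbs, eps=0.05, mu=0.9, lam=1e-4):
    """One minibatch update; X is (inputs, N) and D is (outputs, N).

    Ws[k], bs[k] hold W^(l), b^(l) for l = k + 2; biases are column vectors.
    vWs, vbs hold the previous update amounts. All lists are updated in place.
    Returns the mean squared error measured before the update.
    """
    N = X.shape[1]
    ones = np.ones((N, 1))
    # Forward pass: U = W Z + b 1^T, Z = f(U), starting from Z^(1) = X.
    Zs = [X]
    for k, (W, b) in enumerate(zip(Ws, bs)):
        U = W @ Zs[-1] + b @ ones.T
        Zs.append(U if k == len(Ws) - 1 else logistic(U))
    # Step 1 starts from the output-layer deltas, Delta^(L) = Y - D.
    Delta = Zs[-1] - D
    for k in reversed(range(len(Ws))):
        # Step 2: gradients averaged over the N samples.
        dW = Delta @ Zs[k].T / N
        db = Delta @ ones / N
        # Step 1 (continued): propagate the deltas with the pre-update weights.
        if k > 0:
            Delta_prev = (Zs[k] * (1.0 - Zs[k])) * (Ws[k].T @ Delta)
        # Update amounts with momentum (mu) and weight decay (lam).
        vWs[k] = mu * vWs[k] - eps * (dW + lam * Ws[k])
        vbs[k] = mu * vbs[k] - eps * db
        # Step 3: apply the update.
        Ws[k] = Ws[k] + vWs[k]
        bs[k] = bs[k] + vbs[k]
        if k > 0:
            Delta = Delta_prev
    return 0.5 * np.sum((Zs[-1] - D) ** 2) / N


rng = np.random.default_rng(0)
X = rng.normal(size=(3, 20))                  # 20 samples, 3 inputs
D = np.sin(X.sum(axis=0, keepdims=True))      # toy regression target
Ws = [0.5 * rng.normal(size=(5, 3)), 0.5 * rng.normal(size=(1, 5))]
bs = [np.zeros((5, 1)), np.zeros((1, 1))]
vWs = [np.zeros_like(W) for W in Ws]
vbs = [np.zeros_like(b) for b in bs]
losses = [train_step(X, D, Ws, bs, vWs, vbs) for _ in range(200)]
```

Setting `mu=0` and `lam=0` reduces the update amounts to the plain form $-\epsilon \, \partial \mathbf{W}^{(l)}$ and $-\epsilon \, \partial \mathbf{b}^{(l)}$.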
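  22. 08: 4.4.2 Gradient of $E$ by difference approximation: $\frac{\partial E}{\partial w_{ji}^{(l)}} \approx \frac{E(\ldots, w_{ji}^{(l)} + \epsilon, \ldots) - E(\ldots, w_{ji}^{(l)}, \ldots)}{\epsilon}$, where $E(\ldots, w_{ji}^{(l)} + \epsilon, \ldots)$ is $E(\mathbf{w})$ evaluated with only $w_{ji}^{(l)}$ increased by a small step $\epsilon$ (a separate quantity from the learning rate).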
  23. 09: 4.4.2 Gradient of $E$ by difference approximation: $w_{ji}$
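A difference approximation like the one on slide 22 is commonly used to check an analytic gradient. The sketch below does this for a deliberately tiny model, a single logistic unit with cross-entropy error on one sample; the model, the sample values and the step `eps=1e-6` are assumptions for illustration.

```python
import math

X_SAMPLE = [1.0, 2.0]   # hypothetical input
D_SAMPLE = 1.0          # hypothetical target


def E(w):
    """Cross-entropy error of one logistic unit on the sample above."""
    u = sum(wi * xi for wi, xi in zip(w, X_SAMPLE))
    y = 1.0 / (1.0 + math.exp(-u))
    return -(D_SAMPLE * math.log(y) + (1.0 - D_SAMPLE) * math.log(1.0 - y))


def analytic_grad(w):
    """dE/dw_i = (y - d) * x_i, from the output-layer delta y - d."""
    u = sum(wi * xi for wi, xi in zip(w, X_SAMPLE))
    y = 1.0 / (1.0 + math.exp(-u))
    return [(y - D_SAMPLE) * xi for xi in X_SAMPLE]


def difference_grad(error_fn, w, eps=1e-6):
    """Forward difference: (E(..., w_i + eps, ...) - E(..., w_i, ...)) / eps."""
    base = error_fn(w)
    grad = []
    for i in range(len(w)):
        w_shift = list(w)
        w_shift[i] += eps   # perturb only one weight at a time
        grad.append((error_fn(w_shift) - base) / eps)
    return grad


w = [0.3, -0.2]
g_analytic = analytic_grad(w)
g_difference = difference_grad(E, w)
```

The loop evaluates `E` once more for every weight, so the cost grows with the number of weights; that makes this approach practical as a correctness check on small networks rather than as a training method.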
  24. Chapter 4: 4.1, 4.2, 4.3, 4.4, 4.5
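  25. 10: Vanishing gradient problem. The deltas are propagated backwards through the layers by $\boldsymbol{\Delta}^{(l)} = f^{(l)\prime}(\mathbf{U}^{(l)}) \odot \left( \mathbf{W}^{(l+1)\top} \boldsymbol{\Delta}^{(l+1)} \right)$. (pretraining)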
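The recursion on slide 25 shows why deltas can shrink as they travel backwards: each layer multiplies them by $f'$ and by a weight matrix. The sketch below illustrates one such case with NumPy. It assumes logistic units, whose derivative never exceeds 1/4, and weight matrices rescaled to spectral norm 1; the random stand-in for the output-layer delta and the function name `delta_norms` are also hypothetical.

```python
import numpy as np


def delta_norms(n_layers=10, width=20, seed=0):
    """Norms of the deltas, listed from the output layer back to the first."""
    rng = np.random.default_rng(seed)
    Ws = []
    for _ in range(n_layers):
        W = rng.normal(size=(width, width))
        Ws.append(W / np.linalg.norm(W, 2))   # rescale to spectral norm 1
    # Forward pass with logistic units, keeping each layer's output.
    z = rng.normal(size=width)
    zs = []
    for W in Ws:
        z = 1.0 / (1.0 + np.exp(-(W @ z)))
        zs.append(z)
    # Backward pass: delta^(l) = f'(u^(l)) * (W^(l+1)^T delta^(l+1)).
    delta = rng.normal(size=width)            # stand-in for the output delta
    norms = [np.linalg.norm(delta)]
    for k in reversed(range(1, n_layers)):
        f_prime = zs[k - 1] * (1.0 - zs[k - 1])   # logistic derivative, <= 1/4
        delta = f_prime * (Ws[k].T @ delta)
        norms.append(np.linalg.norm(delta))
    return norms


norms = delta_norms()
```

Under these assumptions every backward step reduces the norm by at least a factor of four, so after ten layers the delta is several orders of magnitude smaller than at the output. With larger weights the same recursion can instead make the deltas grow.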