(DL Hacks reading group) Variational Dropout and the Local Reparameterization Trick

Variational Dropout and the Local Reparameterization Trick, by Diederik P. Kingma, Tim Salimans and Max Welling. Presenter: 雅

Upload: masahiro-suzuki

Post on 19-Jan-2017


TRANSCRIPT

  • Variational Dropout and the Local Reparameterization Trick, Diederik P. Kingma, Tim Salimans and Max Welling

  • Submitted on 8 Jun 2015 (arXiv) 7/17

    Dropout = local reparameterization trick

    (SGVB) EM

  • EM

    (SGVB)

    Variational Dropout and the Local Reparameterization Trick

  • EM EM

    q(z)

    q(z) q(z)

  • EM

    (SGVB)

    Variational Dropout and the Local Reparameterization Trick

  • z p(z)p(x|z)

    z

  • p(x): $p(x) = \int p(z)\, p(x|z)\, dz$

    p(z|x): $p(z|x) = p(x|z)\, p(z) / p(x)$ EM

  • q(z|x) ≈ p(z|x)
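The reason for introducing an approximation q(z|x) can be made explicit with the standard decomposition of the log marginal likelihood. This is a worked identity rather than something taken from the slide; parameter subscripts are omitted.

```latex
\begin{aligned}
\log p(x)
  &= \int q(z|x) \log \frac{p(x,z)}{q(z|x)}\,dz
     + D_{KL}\bigl(q(z|x)\,\|\,p(z|x)\bigr) \\
  &\ge \int q(z|x) \log \frac{p(x,z)}{q(z|x)}\,dz
\end{aligned}
```

Since the KL term is non-negative, the integral is a lower bound on log p(x), and the bound is tight exactly when q(z|x) equals the true posterior p(z|x).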

  • reparameterization trick

    reparameterization trick

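    $$
    \begin{aligned}
    \mathcal{L}(\theta,\phi;x)
      &= \int q(z|x) \log \frac{p(x,z)}{q(z|x)}\,dz \\
      &= \int q(z|x) \log \frac{p(z)\,p(x|z)}{q(z|x)}\,dz \\
      &= \int q(z|x) \log \frac{p(z)}{q(z|x)}\,dz + \int q(z|x) \log p(x|z)\,dz \\
      &= -\int q(z|x) \log \frac{q(z|x)}{p(z)}\,dz + \int q(z|x) \log p(x|z)\,dz \\
      &= -D_{KL}\bigl(q(z|x)\,\|\,p(z)\bigr) + E_{q(z|x)}[\log p(x|z)] \qquad (6)
    \end{aligned}
    $$

    Stochastic Gradient Variational Bayes (SGVB). Consider a Monte Carlo estimate of $E_{q(z|x)}[f(z)]$:

    1. Draw samples $\{z^{(l)}\}_{l=1}^{L}$ from q(z|x).
    2. Average f over the samples $z^{(l)}$:

    $$
    E_{q(z|x)}[f(z)] \simeq \frac{1}{L} \sum_{l=1}^{L} f(z^{(l)}) \qquad (7)
    $$

    Instead of sampling z from q(z|x) directly, write z as a deterministic function $g(\epsilon, x)$ of a noise variable ($z = g(\epsilon, x)$), where $\epsilon$ is drawn from $p(\epsilon)$. Then (7) becomes:

    $$
    \begin{aligned}
    E_{q(z|x,\phi)}[f(z)]
      &= \int q(z|x,\phi)\, f(z)\,dz \\
      &= \int p(\epsilon)\, f(z)\,d\epsilon \qquad (\because\ q(z|x,\phi)\,dz = p(\epsilon)\,d\epsilon) \\
      &= \int p(\epsilon)\, f(g(\epsilon,x))\,d\epsilon \\
      &= E_{p(\epsilon)}[f(g(\epsilon,x))]
       \simeq \frac{1}{L} \sum_{l=1}^{L} f(g(\epsilon^{(l)},x)),
       \qquad \epsilon^{(l)} \sim p(\epsilon) \qquad (8)
    \end{aligned}
    $$

    Applying this to (5) gives the estimator $\mathcal{L}^{A}(\theta,\phi;x)$:

    $$
    \mathcal{L}^{A}(\theta,\phi;x) = \frac{1}{L} \sum_{l=1}^{L}
      \Bigl[ \log p(x, z^{(l)}|\theta) - \log q(z^{(l)}|x,\phi) \Bigr],
    \qquad z^{(l)} = g(\epsilon^{(l)},x),\ \ \epsilon^{(l)} \sim p(\epsilon) \qquad (9)
    $$

    This is the SGVB estimator. An SGVB estimator can also be obtained from (6): the KL term in (6) can often be computed analytically, so only the expectation term of (6) needs to be estimated by sampling.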
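As a small numerical check of (8), the sketch below estimates an expectation under a Gaussian q by sampling noise and pushing it through a deterministic transform. The Gaussian form of q, the choice g(ε, x) = μ + σε, the test function f(z) = z², and the helper name `reparam_expectation` are all illustrative assumptions, not part of the slides.

```python
import numpy as np


def reparam_expectation(f, mu, sigma, num_samples, rng):
    """Estimate E_{z ~ N(mu, sigma^2)}[f(z)] with z = g(eps) = mu + sigma * eps."""
    eps = rng.standard_normal(num_samples)  # eps^(l) ~ p(eps) = N(0, 1)
    z = mu + sigma * eps                    # deterministic transform g
    return f(z).mean()                      # (1/L) * sum_l f(g(eps^(l)))


rng = np.random.default_rng(0)
estimate = reparam_expectation(lambda z: z ** 2, mu=1.0, sigma=0.5,
                               num_samples=100_000, rng=rng)
exact = 1.0 ** 2 + 0.5 ** 2  # E[z^2] = mu^2 + sigma^2 for a Gaussian
print(estimate, exact)
```

Because the randomness enters only through ε, the estimate is a differentiable function of μ and σ, which is what makes gradient-based optimization of the bound possible.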


  • (SGVB)

    (SGVB)

    1reconstruction error2
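The two terms of (6) can be illustrated on a toy model. The sketch below assumes a prior z ~ N(0, 1), a likelihood x | z ~ N(z, 1) and a Gaussian q(z|x) = N(μ, σ²); these choices and all function names are hypothetical. The KL term is computed in closed form and only the expectation term (the negative reconstruction error) is sampled.

```python
import numpy as np


def gaussian_kl_to_standard_normal(mu, sigma):
    """D_KL(N(mu, sigma^2) || N(0, 1)), available in closed form."""
    return 0.5 * (mu ** 2 + sigma ** 2 - 1.0) - np.log(sigma)


def log_likelihood(x, z):
    """log p(x|z) for the assumed toy likelihood x | z ~ N(z, 1)."""
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (x - z) ** 2


def sgvb_lower_bound(x, mu, sigma, num_samples, rng):
    """Estimate (6): analytic -KL plus a sampled expectation of log p(x|z)."""
    eps = rng.standard_normal(num_samples)
    z = mu + sigma * eps                       # reparameterized samples from q(z|x)
    reconstruction = log_likelihood(x, z).mean()
    return -gaussian_kl_to_standard_normal(mu, sigma) + reconstruction


rng = np.random.default_rng(0)
x = 1.2
# For this toy model the marginal is x ~ N(0, 2) and the exact posterior is N(x/2, 1/2).
log_px = -0.5 * np.log(2.0 * np.pi * 2.0) - x ** 2 / 4.0
bound_exact_q = sgvb_lower_bound(x, mu=x / 2.0, sigma=np.sqrt(0.5), num_samples=100_000, rng=rng)
bound_poor_q = sgvb_lower_bound(x, mu=0.0, sigma=1.0, num_samples=100_000, rng=rng)
print(log_px, bound_exact_q, bound_poor_q)
```

With q equal to the exact posterior the bound matches log p(x) up to sampling noise; with a mismatched q it falls below it.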

  • (SGVB) N

    M SGVB

    (M=100) L = 1

    1

    SGD

  • (SGVB)

    1. M 2. 3. 4. 5.
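The five steps on this slide did not survive extraction. The sketch below follows the usual minibatch SGVB recipe (draw a minibatch of M points, draw noise, compute the gradient of the bound estimate, update, repeat), which is an assumption about what the steps were. The toy model (x_i ~ N(w, 1) with prior w ~ N(0, 1)), the posterior q(w) = N(m, s²), the learning rate and the step count are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumption): N points x_i ~ N(w_true, 1), with prior w ~ N(0, 1).
N, M, L = 1000, 100, 1   # dataset size, minibatch size, noise samples per step
w_true = 2.0
x = w_true + rng.standard_normal(N)

# Variational posterior q(w) = N(m, s^2), with s = exp(rho) so that s stays positive.
m, rho = 0.0, 0.0
learning_rate = 1e-4

for step in range(2000):
    batch = x[rng.choice(N, size=M, replace=False)]  # 1. draw a minibatch of M points
    eps = rng.standard_normal()                      # 2. draw noise (L = 1 sample)
    s = np.exp(rho)
    w = m + s * eps                                  #    reparameterized sample of w
    # 3. gradient of the minibatch estimate of the lower bound
    dlik_dw = (N / M) * np.sum(batch - w)            #    d/dw of (N/M) * sum_i log p(x_i|w)
    grad_m = dlik_dw - m                             #    KL gradient w.r.t. m is m
    grad_rho = dlik_dw * eps * s - (s ** 2 - 1.0)    #    KL gradient w.r.t. rho is s^2 - 1
    # 4. gradient ascent update
    m += learning_rate * grad_m
    rho += learning_rate * grad_rho
    # 5. repeat (a fixed number of steps here instead of a convergence check)

posterior_mean = x.sum() / (N + 1)  # exact posterior mean for this conjugate toy model
print(m, posterior_mean, np.exp(rho))
```

The variational mean m should settle near the exact posterior mean; the standard deviation s shrinks more slowly at this learning rate, so it is not expected to have fully converged after 2000 steps.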

  • SGVB

    zx reparameterization trick

  • SGVB

    Hinton

    MCMC1

    Deep Learning

  • EM

    (SGVB)

    Variational Dropout and the Local Reparameterization Trick

  • KL

  • SGVB SGVB

  • SGD SGD

    M

  • local reparameterization trick

    f()

    0

    local reparameterization trick

  • reparameterization trick

    0

    1000 × 1000 × M

    local reparameterization trick

    [Figure: matrix product B = AW, with A of size M × 1000 and W of size 1000 × 1000]

  • local reparameterization trick

    [Figure: matrix product B = AW, with B of size M × 1000, A of size M × 1000 and W of size 1000 × 1000]

    B

    local reparameterization trick 0 M × 1000

    local
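The difference in noise dimensions (1000 × 1000 × M versus M × 1000) can be sketched as follows, assuming a fully factorized Gaussian posterior over W with means `mu` and variances `sigma2`. The numerical values, the minibatch size M = 4, and the function names are hypothetical. The first function samples a separate weight matrix per example; the second samples the activations B directly from their Gaussian marginal.

```python
import numpy as np


def sample_b_via_weights(A, mu, sigma2, rng):
    """Draw a separate W for every example: M * K * J noise variables."""
    num_examples = A.shape[0]
    eps = rng.standard_normal((num_examples,) + mu.shape)
    W = mu + np.sqrt(sigma2) * eps               # shape (M, K, J)
    return np.einsum("mk,mkj->mj", A, W)         # one matrix product per example


def sample_b_locally(A, mu, sigma2, rng):
    """Sample B from its Gaussian marginal: only M * J noise variables."""
    mean = A @ mu                                # E[b_mj] = sum_k a_mk * mu_kj
    var = (A ** 2) @ sigma2                      # Var[b_mj] = sum_k a_mk^2 * sigma2_kj
    zeta = rng.standard_normal(mean.shape)
    return mean + np.sqrt(var) * zeta


rng = np.random.default_rng(0)
M, K, J = 4, 1000, 1000                          # minibatch size, input units, output units
A = rng.standard_normal((M, K))                  # input activations
mu = 0.01 * rng.standard_normal((K, J))          # posterior means (hypothetical values)
sigma2 = np.full((K, J), 1e-4)                   # posterior variances (hypothetical values)

B_weights = sample_b_via_weights(A, mu, sigma2, rng)
B_local = sample_b_locally(A, mu, sigma2, rng)
print(B_weights.shape, B_local.shape)
print("noise variables:", M * K * J, "vs", M * J)
```

Both functions draw B from the same distribution, but the second needs a factor of K fewer random numbers and no per-example weight matrices.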

  • 01

    p

  • independent weight noise

    N(1, α) b

    Wang and Manning (2013) B

    B = AW W local reparameterization trick
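A minimal sketch of this equivalence, assuming multiplicative noise ξ ~ N(1, α) on the inputs and a weight matrix named `theta` (the values and function names are illustrative). The first function applies Gaussian dropout noise explicitly; the second samples each activation b_mj from the implied Gaussian marginal, in the spirit of Wang and Manning (2013).

```python
import numpy as np


def gaussian_dropout_noisy_inputs(A, theta, alpha, rng):
    """B = (A * xi) @ theta with xi ~ N(1, alpha) drawn for every input unit."""
    xi = 1.0 + np.sqrt(alpha) * rng.standard_normal(A.shape)
    return (A * xi) @ theta


def gaussian_dropout_local(A, theta, alpha, rng):
    """Sample each b_mj directly from its Gaussian marginal."""
    mean = A @ theta                              # sum_i a_mi * theta_ij
    var = alpha * (A ** 2) @ (theta ** 2)         # alpha * sum_i a_mi^2 * theta_ij^2
    return mean + np.sqrt(var) * rng.standard_normal(mean.shape)


rng = np.random.default_rng(0)
alpha = 0.5
theta = np.array([[0.3, -0.1], [0.2, 0.4], [-0.5, 0.1]])
A = np.repeat(np.array([[1.0, -2.0, 0.5]]), 20000, axis=0)  # same input row repeated

B_noisy = gaussian_dropout_noisy_inputs(A, theta, alpha, rng)
B_direct = gaussian_dropout_local(A, theta, alpha, rng)
print(B_noisy.mean(axis=0), B_direct.mean(axis=0))
print(B_noisy.var(axis=0), B_direct.var(axis=0))
```

Only the per-activation marginals are matched here. The direct sampler draws the b_mj independently, which corresponds to independent noise on each weight, w_ij ~ N(θ_ij, α θ_ij²), whereas input noise is shared across the outputs of one example.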

  • correlated weight noise

    B

    local reparameterization trick

    W

  • dropout posterior Dropout

    KL

    scale invariant log-uniform prior
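For reference, a log-uniform prior on a weight can be written as below. The index notation w_ij and the constant c are assumptions chosen for illustration.

```latex
p(\log |w_{ij}|) \propto c
\quad\Longleftrightarrow\quad
p(|w_{ij}|) \propto \frac{1}{|w_{ij}|}
```

Rescaling a weight, w → k·w, only shifts log|w| by log k, so a density that is flat in log|w| is unchanged; this is the sense in which the prior is scale invariant.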

  • 1

  • standard binary dropout

    Gaussian dropout type A (A)

    Gaussian dropout type B (B)

    variational dropout type A

    variational dropout type B

    MNIST

    fully connected, 3 hidden layers, rectified linear units (ReLUs)

    dropout rate: input layer p=0.2, hidden layers p=0.5

    early stopping
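The baseline configuration can be sketched as a forward pass with standard binary dropout. The hidden width (256), the weight initialization, the omission of biases, and the inverted-dropout rescaling by 1/(1-p) are assumptions made for illustration; the slide only gives the layer count, the activation and the dropout rates.

```python
import numpy as np


def dropout(h, p, rng):
    """Standard binary dropout: zero each unit with probability p, rescale the rest."""
    mask = rng.random(h.shape) >= p
    return h * mask / (1.0 - p)


def forward(x, weights, rng, p_input=0.2, p_hidden=0.5):
    """Fully connected ReLU network with dropout on the input and on each hidden layer."""
    h = dropout(x, p_input, rng)
    for W in weights[:-1]:
        h = np.maximum(h @ W, 0.0)        # rectified linear units
        h = dropout(h, p_hidden, rng)
    return h @ weights[-1]                # output logits, no dropout


rng = np.random.default_rng(0)
hidden = 256                              # hypothetical hidden width
sizes = [784, hidden, hidden, hidden, 10] # MNIST input, 3 hidden layers, 10 classes
weights = [0.05 * rng.standard_normal((n_in, n_out))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]

x = rng.random((8, 784))                  # stand-in for a minibatch of 8 images
logits = forward(x, weights, rng)
print(logits.shape)
```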

  • variational dropout type B

    dropout

    dropout

  • SGVB

    Time per epoch for SGVB with and without local reparameterization: SGVB without it 1635 sec, SGVB with it 7.4 sec

    local reparameterization: roughly 200 times faster

  • A2KL

  • local reparameterization trick: global → local

    local reparameterization trick variational dropout