![Page 1: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/1.jpg)
Connection between GAN (Generative Adversarial Networks), IRL (Inverse Reinforcement Learning), and Energy-Based Models
!"#$%&'()*
!"
!+ ,-. )',(
![Page 2: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/2.jpg)
Contents
GAN (Generative Adversarial Networks)
IRL (Inverse Reinforcement Learning)
![Page 3: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/3.jpg)
GAN (Generative Adversarial Networks)
[Figure: GAN diagram with Generator and Discriminator blocks]
![Page 4: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/4.jpg)
IRL (Inverse Reinforcement Learning)
![Page 5: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/5.jpg)
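A trajectory $\tau$ has reward $-c_\theta(\tau)$, and its probability is

$$p_\theta(\tau) = \frac{1}{Z(\theta)} \exp(-c_\theta(\tau))$$

$$c_\theta(\tau) = \sum_t c_\theta(x_t, u_t)$$

$$\tau = \begin{pmatrix} x_1, x_2, \cdots, x_T \\ u_1, u_2, \cdots, u_T \end{pmatrix}$$

Notation: $-c_\theta(\tau)$ is the reward, $c_\theta(\tau)$ is the cost, $x_t$ is the state $x$ at time $t$, $u_t$ is the action $u$ at time $t$, and $\tau$ is the trajectory.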
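The definitions above can be made concrete on a tiny finite set of trajectories. The following is a minimal sketch, not the method from the slides: the quadratic `step_cost` and the three toy trajectories are assumptions chosen only for illustration, and $Z$ is computed by brute-force summation, which is only possible because the set is finite.

```python
import math

def trajectory_cost(states, actions, step_cost):
    """c_theta(tau) = sum over t of c_theta(x_t, u_t)."""
    return sum(step_cost(x, u) for x, u in zip(states, actions))

def boltzmann(costs):
    """p_theta(tau) = exp(-c_theta(tau)) / Z over a finite trajectory set."""
    weights = [math.exp(-c) for c in costs]
    z = sum(weights)
    return [w / z for w in weights], z

# Hypothetical per-step cost, assumed here for illustration only.
def step_cost(x, u):
    return 0.5 * x * x + 0.1 * u * u

# Three toy trajectories, each a pair (states x_1..x_T, actions u_1..u_T).
trajectories = [
    ([0.0, 0.1, 0.0], [0.1, -0.1, 0.0]),
    ([1.0, 0.5, 0.2], [-0.5, -0.3, 0.0]),
    ([2.0, 2.0, 2.0], [0.0, 0.0, 0.0]),
]
costs = [trajectory_cost(xs, us, step_cost) for xs, us in trajectories]
probs, Z = boltzmann(costs)
```

Trajectories with lower cost receive exponentially more probability mass, which is what the energy-based form of $p_\theta$ expresses.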
![Page 6: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/6.jpg)
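For a trajectory $\tau$ with reward $-c_\theta(\tau)$ and cost $c_\theta(\tau)$, lower cost means exponentially higher probability:

$$p_\theta(\tau) = \frac{1}{Z(\theta)} \exp(-c_\theta(\tau))$$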
![Page 7: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/7.jpg)
Max Entropy: $-\int p_\theta(\tau) \log p_\theta(\tau)\, d\tau$
Gaussian Process
Guided Cost Learning
GAN and Guided Cost Learning
![Page 8: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/8.jpg)
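Guided Cost Learning for IRL
The cost loss $L_{cost}(\theta)$ is minimized to fit $p_\theta$ to the demonstrations $\tau \sim p$.
Loss of cost:

$$L_{cost}(\theta) = E_{\tau \sim p}[-\log p_\theta(\tau)] \tag{1}$$

$$= E_{\tau \sim p}[c_\theta(\tau)] + \log Z(\theta) \tag{2}$$

$$= E_{\tau \sim p}[c_\theta(\tau)] + \log\left(E_{\tau \sim q}\left[\frac{\exp(-c_\theta(\tau))}{q(\tau)}\right]\right) \tag{3}$$

$Z(\theta)$ is estimated by importance sampling from a sampling distribution $q$, and $q$ is trained by minimizing $L_{sampler}(q)$.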
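Step (3) replaces the partition function by an importance-sampling estimate. The sketch below illustrates this on a four-trajectory toy problem; the costs, the sampler `q`, the demonstration indices, and the sample count are all assumptions made for illustration.

```python
import math
import random

costs = [0.5, 1.0, 2.0, 3.0]      # c_theta(tau) for four toy trajectories
q = [0.4, 0.3, 0.2, 0.1]          # sampler probabilities q(tau)

# Exact partition function, available only because the toy space is tiny.
Z_exact = sum(math.exp(-c) for c in costs)

def estimate_Z(num_samples, rng):
    """Importance-sampling estimate: mean of exp(-c(tau)) / q(tau), tau ~ q."""
    idx = rng.choices(range(len(costs)), weights=q, k=num_samples)
    return sum(math.exp(-costs[i]) / q[i] for i in idx) / num_samples

def cost_loss(demo_indices, Z):
    """L_cost = mean demonstration cost + log Z, as in equation (2)."""
    mean_cost = sum(costs[i] for i in demo_indices) / len(demo_indices)
    return mean_cost + math.log(Z)

rng = random.Random(0)
Z_hat = estimate_Z(20000, rng)

demos = [0, 0, 1, 0, 2]           # hypothetical demonstration indices
loss_exact = cost_loss(demos, Z_exact)
loss_estimated = cost_loss(demos, Z_hat)
```

The closer $q$ is to $\frac{1}{Z}\exp(-c_\theta)$, the lower the variance of the estimate, which is why $q$ is trained alongside the cost.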
![Page 9: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/9.jpg)
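Guided Cost Learning for IRL
Given the cost $c_\theta$, the partition function $Z = \int \exp(-c_\theta(\tau))\, d\tau$ generally cannot be computed directly, so the sampler $q(\tau)$ is trained to match $\frac{1}{Z}\exp(-c_\theta(\tau))$. $L_{sampler}(q)$ is the loss used to update $q(\tau)$.
Loss of sampler:

$$L_{sampler}(q) = KL\left(q(\tau) \,\Big\|\, \frac{1}{Z}\exp(-c_\theta(\tau))\right) \tag{4}$$

$$= \int q(\tau) \log \frac{q(\tau)}{\frac{1}{Z}\exp(-c_\theta(\tau))}\, d\tau \tag{5}$$

$$= E_{\tau \sim q}[c_\theta(\tau)] + E_{\tau \sim q}[\log q(\tau)] + \log Z \tag{6}$$

The term $\log Z$ does not depend on $q$, so it can be dropped when optimizing $q$.
Guided Cost Learning alternates between the two updates: $L_{cost}$ updates $p_\theta$, and $L_{sampler}(q)$ updates $q$.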
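The step from the KL divergence to the expanded form can be checked numerically on a finite trajectory set. This is a small sanity check under assumed toy values for the costs and for `q`, not part of the original slides.

```python
import math

costs = [0.5, 1.0, 2.0, 3.0]      # assumed toy costs c_theta(tau)
q = [0.4, 0.3, 0.2, 0.1]          # assumed sampler distribution q(tau)

Z = sum(math.exp(-c) for c in costs)
p_theta = [math.exp(-c) / Z for c in costs]

# Left-hand side: KL(q || p_theta) computed from its definition.
kl = sum(qi * math.log(qi / pi) for qi, pi in zip(q, p_theta))

# Right-hand side: E_q[c] + E_q[log q] + log Z.
expected_cost = sum(qi * c for qi, c in zip(q, costs))
neg_entropy = sum(qi * math.log(qi) for qi in q)
rhs = expected_cost + neg_entropy + math.log(Z)
```

Both quantities agree up to floating-point error, and only the first two terms of `rhs` change when `q` changes.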
![Page 10: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/10.jpg)
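Guided Cost Learning for IRL
The samples from $q(\tau)$ are the importance-sampling samples. To reduce variance, demonstration samples from $p(\tau)$ are mixed in, so that samples come from the mixture

$$\mu = \frac{1}{2} p(\tau) + \frac{1}{2} q(\tau)$$

Since $p(\tau)$ is unknown, a density estimate $\tilde{p}(\tau)$ is used in its place.
Loss of cost:

$$L_{cost}(\theta) = E_{\tau \sim p}[c_\theta(\tau)] + \log\left(E_{\tau \sim \mu}\left[\frac{\exp(-c_\theta(\tau))}{\frac{1}{2}\tilde{p}(\tau) + \frac{1}{2} q(\tau)}\right]\right) \tag{7}$$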
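The mixture estimator in (7) can be sketched as follows. For simplicity the sketch assumes the density estimate `p_tilde` equals the demonstration distribution exactly and that demonstrations can be drawn from it. Both are idealizations, and all numbers are toy values.

```python
import math
import random

p_tilde = [0.5, 0.3, 0.15, 0.05]  # assumed density estimate of the demonstrations
q = [0.25, 0.25, 0.25, 0.25]      # assumed sampler distribution
costs = [0.2, 0.7, 1.5, 2.5]      # assumed toy costs c_theta(tau)

def sample_mixture(rng):
    """Draw tau ~ mu = 0.5 * p + 0.5 * q by first picking the source."""
    source = p_tilde if rng.random() < 0.5 else q
    return rng.choices(range(len(costs)), weights=source, k=1)[0]

def estimate_Z(num_samples, rng):
    """Mean of exp(-c(tau)) / (0.5 * p_tilde(tau) + 0.5 * q(tau)), tau ~ mu."""
    total = 0.0
    for _ in range(num_samples):
        i = sample_mixture(rng)
        total += math.exp(-costs[i]) / (0.5 * p_tilde[i] + 0.5 * q[i])
    return total / num_samples

rng = random.Random(0)
Z_hat = estimate_Z(20000, rng)
Z_exact = sum(math.exp(-c) for c in costs)
```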
![Page 11: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/11.jpg)
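GAN Discriminator
With data distribution $p(\tau)$ and generator distribution $q(\tau)$, the optimal discriminator is $D^*$.
The optimal discriminator:

$$D^*(\tau) = \frac{\frac{1}{2} p(\tau)}{\frac{1}{2} p(\tau) + \frac{1}{2} q(\tau)} = \frac{p(\tau)}{p(\tau) + q(\tau)} \tag{8}$$

Model $p(\tau)$ with the energy-based form

$$p(\tau) = \frac{1}{Z}\exp(-c_\theta(\tau))$$

The discriminator parameterized by $\theta$:

$$D_\theta(\tau) = \frac{\frac{1}{2Z}\exp(-c_\theta(\tau))}{\frac{1}{2Z}\exp(-c_\theta(\tau)) + \frac{1}{2} q(\tau)} \tag{9}$$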
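The discriminator form above is a logistic function of $-c_\theta(\tau) - \log Z - \log q(\tau)$, which is a numerically convenient way to evaluate it. The following sketch compares the two forms; the function names and the test values are hypothetical.

```python
import math

def discriminator(cost, log_Z, log_q):
    """D_theta(tau) as a sigmoid of the logit -c(tau) - log Z - log q(tau)."""
    logit = -cost - log_Z - log_q
    return 1.0 / (1.0 + math.exp(-logit))

def discriminator_direct(cost, Z, q):
    """Same quantity from the ratio p / (p + q), with p = exp(-c) / Z."""
    p = math.exp(-cost) / Z
    return p / (p + q)

# Hypothetical values for a single trajectory.
cost, Z, q_tau = 1.2, 2.0, 0.1
d_sigmoid = discriminator(cost, math.log(Z), math.log(q_tau))
d_ratio = discriminator_direct(cost, Z, q_tau)

# When the sampler density equals the model density, D is exactly 1/2.
p_tau = math.exp(-cost) / Z
d_matched = discriminator(cost, math.log(Z), math.log(p_tau))
```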
![Page 12: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/12.jpg)
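GAN Discriminator Loss
Loss of discriminator:

$$L_{discriminator}(D_\theta) = E_{\tau \sim p}[-\log D_\theta(\tau)] + E_{\tau \sim q}[-\log(1 - D_\theta(\tau))] \tag{10}$$

$$= E_{\tau \sim p}\left[-\log \frac{\frac{1}{2Z}\exp(-c_\theta(\tau))}{\frac{1}{2Z}\exp(-c_\theta(\tau)) + \frac{1}{2} q(\tau)}\right] + E_{\tau \sim q}\left[-\log \frac{\frac{1}{2} q(\tau)}{\frac{1}{2Z}\exp(-c_\theta(\tau)) + \frac{1}{2} q(\tau)}\right] \tag{11}$$

Discriminator cross-entropy loss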
![Page 13: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/13.jpg)
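Define the mixture density

$$\tilde{\mu}(\tau) = \frac{1}{2Z}\exp(-c_\theta(\tau)) + \frac{1}{2} q(\tau)$$

Discriminator loss, dropping the additive constant $2\log 2$ that comes from the factors $\frac{1}{2}$ in the numerators of $D_\theta$ and $1 - D_\theta$:

$$L_{discriminator}(D_\theta) = E_{\tau \sim p}[-\log D_\theta(\tau)] + E_{\tau \sim q}[-\log(1 - D_\theta(\tau))] \tag{10}$$

$$= E_{\tau \sim p}\left[-\log \frac{\frac{1}{Z}\exp(-c_\theta(\tau))}{\tilde{\mu}(\tau)}\right] + E_{\tau \sim q}\left[-\log \frac{q(\tau)}{\tilde{\mu}(\tau)}\right] \tag{12}$$

$$= \log Z + E_{\tau \sim p}[c_\theta(\tau)] + E_{\tau \sim p}[\log \tilde{\mu}(\tau)] - E_{\tau \sim q}[\log q(\tau)] + E_{\tau \sim q}[\log \tilde{\mu}(\tau)] \tag{13}$$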
![Page 14: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/14.jpg)
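$$L_{discriminator}(D_\theta) = \log Z + E_{\tau \sim p}[c_\theta(\tau)] + E_{\tau \sim p}[\log \tilde{\mu}(\tau)] - E_{\tau \sim q}[\log q(\tau)] + E_{\tau \sim q}[\log \tilde{\mu}(\tau)] \tag{13}$$

To find the $Z$ that minimizes $L_{discriminator}(D_\theta)$, differentiate with respect to $Z$. With $\mu = \frac{1}{2}p + \frac{1}{2}q$, the two $\log\tilde{\mu}$ terms combine into $2E_{\tau \sim \mu}[\log \tilde{\mu}(\tau)]$.
Derivative of the discriminator loss with respect to $Z$:

$$\partial_Z L_{discriminator}(D_\theta) = \frac{1}{Z} - E_{\tau \sim \mu}\left[\frac{\frac{1}{Z^2}\exp(-c_\theta(\tau))}{\tilde{\mu}(\tau)}\right] \tag{14}$$

$$\partial_Z L_{discriminator}(D_\theta) = 0 \tag{15}$$

$$Z = E_{\tau \sim \mu}\left[\frac{\exp(-c_\theta(\tau))}{\tilde{\mu}(\tau)}\right] \tag{16}$$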
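The stationarity condition (16) can be checked numerically on a finite trajectory set. The sketch below minimizes the exact discriminator loss over $\log Z$ with a ternary search, which is valid because the logistic loss is convex in $\log Z$, and then compares the minimizer with the right-hand side of (16). The distributions, costs, and search bounds are assumptions for illustration.

```python
import math

p = [0.5, 0.3, 0.15, 0.05]        # assumed demonstration distribution
q = [0.25, 0.25, 0.25, 0.25]      # assumed sampler distribution
costs = [0.2, 0.7, 1.5, 2.5]      # assumed toy costs c_theta(tau)
mu = [0.5 * pi + 0.5 * qi for pi, qi in zip(p, q)]

def mu_tilde(Z):
    """mu_tilde(tau) = exp(-c(tau)) / (2Z) + q(tau) / 2."""
    return [math.exp(-c) / (2.0 * Z) + 0.5 * qi for c, qi in zip(costs, q)]

def disc_loss(log_Z):
    """Exact cross-entropy loss E_p[-log D] + E_q[-log(1 - D)]."""
    Z = math.exp(log_Z)
    loss = 0.0
    for pi, qi, c, m in zip(p, q, costs, mu_tilde(Z)):
        d = math.exp(-c) / (2.0 * Z) / m
        loss += -pi * math.log(d) - qi * math.log(1.0 - d)
    return loss

# Ternary search for the minimizing log Z.
lo, hi = -10.0, 10.0
for _ in range(200):
    m1 = lo + (hi - lo) / 3.0
    m2 = hi - (hi - lo) / 3.0
    if disc_loss(m1) < disc_loss(m2):
        hi = m2
    else:
        lo = m1
Z_star = math.exp(0.5 * (lo + hi))

# Right-hand side of (16), evaluated at the minimizer.
Z_rhs = sum(m * math.exp(-c) / mt
            for m, c, mt in zip(mu, costs, mu_tilde(Z_star)))
```

At the minimizer, `Z_star` and `Z_rhs` coincide up to the precision of the search.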
![Page 15: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/15.jpg)
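Derivative of the discriminator

$$L_{discriminator}(D_\theta) = \log Z + E_{\tau \sim p}[c_\theta(\tau)] + E_{\tau \sim p}[\log \tilde{\mu}(\tau)] - E_{\tau \sim q}[\log q(\tau)] + E_{\tau \sim q}[\log \tilde{\mu}(\tau)] \tag{13}$$

Differentiate $L_{discriminator}(D_\theta)$ with respect to $\theta$.
Derivative of the discriminator loss with respect to $\theta$:

$$\partial_\theta L_{discriminator}(D_\theta) = E_{\tau \sim p}[\partial_\theta c_\theta(\tau)] - E_{\tau \sim \mu}\left[\frac{\frac{1}{Z}\exp(-c_\theta(\tau))\, \partial_\theta c_\theta(\tau)}{\tilde{\mu}(\tau)}\right] \tag{17}$$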
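The gradient formula can be compared against a finite-difference derivative of the exact discriminator loss. The sketch below assumes a linear cost $c_\theta(\tau) = \theta f(\tau)$ with a scalar $\theta$ and a hypothetical feature `f` per trajectory, and it holds $Z$ fixed at an arbitrary value.

```python
import math

p = [0.5, 0.3, 0.15, 0.05]        # assumed demonstration distribution
q = [0.25, 0.25, 0.25, 0.25]      # assumed sampler distribution
f = [0.2, 0.7, 1.5, 2.5]          # hypothetical features: c_theta(tau) = theta * f(tau)
Z = 1.3                           # arbitrary fixed value of Z
mu = [0.5 * pi + 0.5 * qi for pi, qi in zip(p, q)]

def mu_tilde(theta):
    return [math.exp(-theta * fi) / (2.0 * Z) + 0.5 * qi for fi, qi in zip(f, q)]

def disc_loss(theta):
    """Exact cross-entropy loss E_p[-log D] + E_q[-log(1 - D)]."""
    loss = 0.0
    for pi, qi, fi, m in zip(p, q, f, mu_tilde(theta)):
        d = math.exp(-theta * fi) / (2.0 * Z) / m
        loss += -pi * math.log(d) - qi * math.log(1.0 - d)
    return loss

def disc_grad(theta):
    """E_p[dc] - E_mu[(1/Z) exp(-c) dc / mu_tilde], with dc = f for a linear cost."""
    first = sum(pi * fi for pi, fi in zip(p, f))
    second = sum(m * math.exp(-theta * fi) * fi / (Z * mt)
                 for m, fi, mt in zip(mu, f, mu_tilde(theta)))
    return first - second

theta = 0.8
h = 1e-5
numeric = (disc_loss(theta + h) - disc_loss(theta - h)) / (2.0 * h)
analytic = disc_grad(theta)
```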
![Page 16: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/16.jpg)
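Derivative of the IRL cost

$$L_{cost}(\theta) = E_{\tau \sim p}[c_\theta(\tau)] + \log\left(E_{\tau \sim \mu}\left[\frac{\exp(-c_\theta(\tau))}{\tilde{\mu}(\tau)}\right]\right) \tag{7}$$

Differentiate the IRL loss $L_{cost}(\theta)$ with respect to $\theta$, holding the importance weights $1/\tilde{\mu}(\tau)$ fixed, and compare the result with (17).
Derivative of the cost loss with respect to $\theta$:

$$\partial_\theta L_{cost}(\theta) = E_{\tau \sim p}[\partial_\theta c_\theta(\tau)] + \partial_\theta \log E_{\tau \sim \mu}\left[\frac{\exp(-c_\theta(\tau))}{\tilde{\mu}(\tau)}\right] \tag{18}$$

$$= E_{\tau \sim p}[\partial_\theta c_\theta(\tau)] - \left(E_{\tau \sim \mu}\left[\frac{\exp(-c_\theta(\tau))\, \partial_\theta c_\theta(\tau)}{\tilde{\mu}(\tau)}\right] \Big/ E_{\tau \sim \mu}\left[\frac{\exp(-c_\theta(\tau))}{\tilde{\mu}(\tau)}\right]\right) \tag{20}$$

$$= E_{\tau \sim p}[\partial_\theta c_\theta(\tau)] - \left(E_{\tau \sim \mu}\left[\frac{\exp(-c_\theta(\tau))\, \partial_\theta c_\theta(\tau)}{\tilde{\mu}(\tau)}\right] \Big/ Z\right) \tag{21}$$

The last step uses $Z = E_{\tau \sim \mu}\left[\exp(-c_\theta(\tau)) / \tilde{\mu}(\tau)\right]$ from (16).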
![Page 17: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/17.jpg)
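Conclusion: IRL cost and GAN discriminator
Derivative of the IRL cost = derivative of the GAN discriminator

$$\partial_\theta L_{discriminator}(D_\theta) = E_{\tau \sim p}[\partial_\theta c_\theta(\tau)] - E_{\tau \sim \mu}\left[\frac{\frac{1}{Z}\exp(-c_\theta(\tau))\, \partial_\theta c_\theta(\tau)}{\tilde{\mu}(\tau)}\right] \tag{17}$$

$$\partial_\theta L_{cost}(\theta) = E_{\tau \sim p}[\partial_\theta c_\theta(\tau)] - \left(E_{\tau \sim \mu}\left[\frac{\exp(-c_\theta(\tau))\, \partial_\theta c_\theta(\tau)}{\tilde{\mu}(\tau)}\right] \Big/ Z\right)$$

$$= E_{\tau \sim p}[\partial_\theta c_\theta(\tau)] - E_{\tau \sim \mu}\left[\frac{\frac{1}{Z}\exp(-c_\theta(\tau))\, \partial_\theta c_\theta(\tau)}{\tilde{\mu}(\tau)}\right] \tag{22}$$

$$= \partial_\theta L_{discriminator}(D_\theta) \tag{23}$$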
![Page 18: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/18.jpg)
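Conclusion: IRL sampler and GAN generator
IRL sampler, with the term $\log Z$ dropped because it does not depend on $q$:

$$L_{sampler}(q) = E_{\tau \sim q}[c_\theta(\tau)] + E_{\tau \sim q}[\log q(\tau)] \tag{6}$$

GAN generator = IRL sampler + constant

$$L_{generator}(q) = E_{\tau \sim q}[\log(1 - D(\tau)) - \log D(\tau)] \tag{24}$$

$$= E_{\tau \sim q}\left[\log \frac{q(\tau)}{\tilde{\mu}(\tau)} - \log \frac{\frac{1}{Z}\exp(-c_\theta(\tau))}{\tilde{\mu}(\tau)}\right] \tag{25}$$

$$= E_{\tau \sim q}[\log q(\tau) + \log Z + c_\theta(\tau)] \tag{26}$$

$$= \log Z + E_{\tau \sim q}[c_\theta(\tau)] + E_{\tau \sim q}[\log q(\tau)] \tag{27}$$

$$= \log Z + L_{sampler}(q) \tag{28}$$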
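The identity between the generator loss and the sampler loss can be verified exactly on a finite trajectory set. The sketch below uses assumed toy values for the costs, the sampler `q`, and $Z$. The identity holds for any positive $Z$ because the denominators of $D$ and $1 - D$ cancel.

```python
import math

costs = [0.5, 1.0, 2.0, 3.0]      # assumed toy costs c_theta(tau)
q = [0.4, 0.3, 0.2, 0.1]          # assumed sampler (generator) distribution
Z = 1.7                           # any positive value works for this identity

def discriminator(i):
    """D(tau) = (exp(-c)/(2Z)) / (exp(-c)/(2Z) + q/2) for trajectory index i."""
    a = math.exp(-costs[i]) / (2.0 * Z)
    return a / (a + 0.5 * q[i])

# Generator loss E_q[log(1 - D) - log D].
gen_loss = sum(
    qi * (math.log(1.0 - discriminator(i)) - math.log(discriminator(i)))
    for i, qi in enumerate(q)
)

# Sampler loss E_q[c] + E_q[log q], without the log Z term.
sampler_loss = (sum(qi * c for qi, c in zip(q, costs))
                + sum(qi * math.log(qi) for qi in q))
```

Since `gen_loss` and `sampler_loss` differ only by `log Z`, which does not depend on `q`, both losses give the same update for `q`.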
![Page 19: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/19.jpg)
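Conclusion
IRL: $L_{cost}$ trains $p_\theta$ and $L_{sampler}$ trains $q$, with reward $-c_\theta(\tau)$.
GAN: $L_{generator}$ and $L_{discriminator}$, with $q(\tau) = p(\tau)$ at the optimum.
IRL and GAN:
$\partial_\theta L_{cost} = \partial_\theta L_{discriminator}$
$L_{sampler}(q) + \log Z = L_{generator}(q)$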
![Page 20: Irs gan doc](https://reader033.vdocuments.pub/reader033/viewer/2022050812/5a65d07c7f8b9aaf638b4ab1/html5/thumbnails/20.jpg)
!"#"$%&
!"
!"#$%&
$$p_\theta(\tau) = \frac{1}{Z(\theta)} \exp(-c_\theta(\tau))$$

with partition function $Z(\theta)$ and cost $c_\theta$.