moment closure based parameter inference of stochastic kinetic models
Post on 07-Jul-2015
Moment Closure Based Parameter Inference of Stochastic Kinetic Models
Colin Gillespie
School of Mathematics & Statistics
Overview
Talk outline
- An introduction to moment closure
- Parameter inference
- Conclusion
Birth-death process
Birth-death model
$$X \longrightarrow 2X \quad \text{and} \quad 2X \longrightarrow X$$
which has the propensity functions $\lambda X$ and $\mu X$.
Deterministic representation
The deterministic model is
$$\frac{dX(t)}{dt} = (\lambda - \mu) X(t),$$
which can be solved to give $X(t) = X(0) \exp[(\lambda - \mu) t]$.
Stochastic representation
- In the stochastic framework, each reaction has a probability of occurring
- The analogous version of the birth-death process is the difference equation
$$\frac{dp_n}{dt} = \lambda (n - 1) p_{n-1} + \mu (n + 1) p_{n+1} - (\lambda + \mu) n p_n$$
where $p_n(t)$ is the probability that the population is $n$ at time $t$. This is usually called the forward Kolmogorov equation or chemical master equation (CME)
[Figure: population against time]
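A single stochastic realisation of the birth-death process can be sketched with Gillespie's direct method. This is a minimal illustration rather than the code behind the talk's figure: the function name `simulate_birth_death`, the starting population and the rate values are all assumptions chosen for the example.

```python
import random


def simulate_birth_death(x0, lam, mu, t_max, seed=None):
    """Simulate one birth-death path with Gillespie's direct method.

    Returns two lists: the jump times and the population after each jump.
    Birth has propensity lam * x and death has propensity mu * x.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, states = [t], [x]
    while x > 0:
        birth_rate = lam * x
        death_rate = mu * x
        total_rate = birth_rate + death_rate
        if total_rate <= 0.0:
            break  # no reaction can fire
        # The waiting time to the next reaction is exponential(total_rate).
        t += rng.expovariate(total_rate)
        if t > t_max:
            break
        # Pick the reaction in proportion to its propensity.
        if rng.random() * total_rate < birth_rate:
            x += 1
        else:
            x -= 1
        times.append(t)
        states.append(x)
    return times, states


# Illustrative values only (not taken from the talk).
times, states = simulate_birth_death(x0=10, lam=1.0, mu=0.9, t_max=4.0, seed=1)
```

Once the population reaches zero both propensities vanish, so the loop stops and the path stays extinct.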
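Moment equations
- Multiply the CME by $e^{n\theta}$ and sum over $n$, to obtain
$$\frac{\partial M}{\partial t} = [\lambda (e^{\theta} - 1) + \mu (e^{-\theta} - 1)] \frac{\partial M}{\partial \theta}$$
where
$$M(\theta; t) = \sum_{n=0}^{\infty} e^{n\theta} p_n(t)$$
- If we differentiate this p.d.e. w.r.t. $\theta$ and set $\theta = 0$, we get
$$\frac{dE[N(t)]}{dt} = (\lambda - \mu) E[N(t)]$$
where $E[N(t)]$ is the mean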
The mean equation
$$\frac{dE[N(t)]}{dt} = (\lambda - \mu) E[N(t)]$$
- This ODE is solvable; the associated forward Kolmogorov equation is also solvable
- The equation for the mean and the deterministic ODE are identical
- When the rate laws are linear, the stochastic mean and deterministic solution always correspond
The variance equation
- If we differentiate the p.d.e. w.r.t. $\theta$ twice and set $\theta = 0$, we get:
$$\frac{dE[N(t)^2]}{dt} = (\lambda + \mu) E[N(t)] + 2 (\lambda - \mu) E[N(t)^2]$$
and hence the variance $\mathrm{Var}[N(t)] = E[N(t)^2] - E[N(t)]^2$.
- Differentiating three times gives an expression for the skewness, etc.
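As a numerical check on the first two moment equations, the sketch below integrates the mean equation together with the second-moment equation, taken here as $dE[N^2]/dt = (\lambda + \mu) E[N] + 2(\lambda - \mu) E[N^2]$ (the result of differentiating the p.d.e. twice at $\theta = 0$), and compares the result with the closed-form mean and variance of the linear birth-death process. The function names, the RK4 integrator and the parameter values are assumptions made for illustration.

```python
import math


def birth_death_moments(n0, lam, mu, t_end, steps=2000):
    """Integrate the mean and second-moment ODEs with classical RK4."""

    def rhs(m, s):
        dm = (lam - mu) * m
        ds = (lam + mu) * m + 2.0 * (lam - mu) * s
        return dm, ds

    h = t_end / steps
    m, s = float(n0), float(n0) ** 2  # N(0) = n0 is known exactly
    for _ in range(steps):
        k1 = rhs(m, s)
        k2 = rhs(m + 0.5 * h * k1[0], s + 0.5 * h * k1[1])
        k3 = rhs(m + 0.5 * h * k2[0], s + 0.5 * h * k2[1])
        k4 = rhs(m + h * k3[0], s + h * k3[1])
        m += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        s += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return m, s - m * m  # mean and variance


def birth_death_exact(n0, lam, mu, t):
    """Closed-form mean and variance, valid for lam != mu."""
    r = lam - mu
    growth = math.exp(r * t)
    mean = n0 * growth
    var = n0 * (lam + mu) / r * growth * (growth - 1.0)
    return mean, var


# Illustrative values only.
numeric = birth_death_moments(n0=10, lam=1.0, mu=0.9, t_end=4.0)
exact = birth_death_exact(n0=10, lam=1.0, mu=0.9, t=4.0)
```

Because the rate laws are linear, the two moment equations are already closed and no closure assumption is needed for this model.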
Simple dimerisation model
Dimerisation
$$2X_1 \longrightarrow X_2 \quad \text{and} \quad X_2 \longrightarrow 2X_1$$
with propensities $0.5 k_1 X_1 (X_1 - 1)$ and $k_2 X_2$.
Dimerisation moment equations
- We formulate the dimer model in terms of moment equations
$$\frac{dE[X_1]}{dt} = 0.5 k_1 (E[X_1^2] - E[X_1]) - k_2 E[X_1]$$
$$\frac{dE[X_1^2]}{dt} = k_1 (E[X_1^2 X_2] - E[X_1 X_2]) + 0.5 k_1 (E[X_1^2] - E[X_1]) + k_2 (E[X_1] - 2 E[X_1^2])$$
where $E[X_1]$ is the mean of $X_1$ and $E[X_1^2] - E[X_1]^2$ is the variance
- The $i$th moment equation depends on the $(i+1)$th moment, so the system of equations is not closed
Deterministic approximates stochastic
Rewriting
$$\frac{dE[X_1]}{dt} = 0.5 k_1 (E[X_1^2] - E[X_1]) - k_2 E[X_1]$$
in terms of its variance, i.e. $E[X_1^2] = \mathrm{Var}[X_1] + E[X_1]^2$, we get
$$\frac{dE[X_1]}{dt} = 0.5 k_1 E[X_1] (E[X_1] - 1) + 0.5 k_1 \mathrm{Var}[X_1] - k_2 E[X_1] \qquad (1)$$
- Setting $\mathrm{Var}[X_1] = 0$ in (1) recovers the deterministic equation
- So we can consider the deterministic model as an approximation to the stochastic
- When we have polynomial rate laws, setting the variance to zero results in the deterministic equation
Simple dimerisation model
- To close the equations, we assume an underlying distribution
- The easiest option is to assume an underlying Normal distribution, i.e.
$$E[X_1^3] = 3 E[X_1^2] E[X_1] - 2 E[X_1]^3$$
- But we could also use the Poisson
$$E[X_1^3] = E[X_1] + 3 E[X_1]^2 + E[X_1]^3$$
or the Log normal
$$E[X_1^3] = \left( \frac{E[X_1^2]}{E[X_1]} \right)^3$$
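The three closure relations can be written as small functions that return the third raw moment from the lower-order moments. The sketch below is illustrative only; the function names are assumptions. Each closure is exact when the distribution really is of the assumed type, which is what the final lines check against textbook moment formulas.

```python
import math


def third_moment_normal(m1, m2):
    """E[X^3] from E[X] and E[X^2], assuming X is Normal."""
    return 3.0 * m2 * m1 - 2.0 * m1 ** 3


def third_moment_poisson(m1):
    """E[X^3] from E[X], assuming X is Poisson."""
    return m1 + 3.0 * m1 ** 2 + m1 ** 3


def third_moment_lognormal(m1, m2):
    """E[X^3] from E[X] and E[X^2], assuming X is Log normal."""
    return (m2 / m1) ** 3


# Normal with mean 2 and variance 3: E[X^2] = 7, E[X^3] = mu^3 + 3*mu*var.
normal_closed = third_moment_normal(2.0, 7.0)
normal_exact = 2.0 ** 3 + 3.0 * 2.0 * 3.0

# Log normal with log-mean 0 and log-variance 0.25: E[X^k] = exp(k^2 * 0.25 / 2).
ln_m1, ln_m2 = math.exp(0.125), math.exp(0.5)
lognormal_closed = third_moment_lognormal(ln_m1, ln_m2)
lognormal_exact = math.exp(9 * 0.25 / 2)
```

Substituting one of these expressions for the third moment leaves a finite set of ODEs in the first two moments only.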
Heat shock model
- Proctor et al, 2005. Stochastic kinetic model of the heat shock system
  - twenty-three reactions
  - seventeen chemical species
- A single stochastic simulation up to t = 2000 takes about 35 minutes.
- If we convert the model to moment equations, we get 139 equations
[Figure: population against time for ADP (left) and Native Protein (right). Gillespie, CS, 2009]
Density plots: heat shock model
[Figure: density of the ADP population at time t = 200 and at time t = 2000]
P53-Mdm2 oscillation model
- Proctor and Gray, 2008
  - 16 chemical species
  - Around a dozen reactions
- The model contains an event
  - At t = 1, set X = 0
- If we convert the model to moment equations, we get 139 equations.
- However, in this case the moment closure approximation doesn't do too well!
[Figure: population against time]
What went wrong?
- The moment closure tends to fail when there is a large difference between the deterministic and stochastic formulations
- In this particular case, the species are strongly correlated
- Typically when the MC approximation fails, it gives a negative variance
- The MC approximation does work well for other parameter values for the p53 model
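Since a negative variance is the typical symptom of a failed closure, one simple safeguard is to inspect the second-order moments that the closure ODEs return before using them. The sketch below is a suggested diagnostic, not something described in the talk; the function name `closure_warnings` and the species labels in the example are hypothetical. The covariance check uses the Cauchy-Schwarz bound, which is necessary but not sufficient for a valid covariance matrix.

```python
def closure_warnings(variances, covariances=None, tol=1e-9):
    """List problems with the second-order moments from a moment closure.

    variances:   dict mapping species name -> variance
    covariances: optional dict mapping (name_a, name_b) -> covariance
    """
    problems = []
    for name, var in variances.items():
        if var < -tol:
            problems.append(f"negative variance for {name}: {var}")
    for (a, b), cov in (covariances or {}).items():
        bound = variances[a] * variances[b]
        # Cauchy-Schwarz: cov^2 <= var_a * var_b for any valid distribution.
        if bound >= 0 and cov * cov > bound + tol:
            problems.append(f"|correlation| > 1 for {a} and {b}")
    return problems


# Hypothetical numbers, for illustration only.
ok = closure_warnings({"A": 4.0, "B": 9.0}, {("A", "B"): 5.0})
bad = closure_warnings({"A": -1.0, "B": 9.0})
```

An empty list does not show that the approximation is accurate; it only rules out the most obvious failure.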
Parameter inference
- Simple immigration-death process
  - $R_1: \emptyset \xrightarrow{k_1} X$
  - $R_2: X \xrightarrow{k_2} \emptyset$
- The CME can be solved
- Discrete time course data
- The likelihood can be very flat
[Figure: population against time (left); $k_2$ against $k_1$ (right)]
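For the immigration-death process the CME solution has a standard closed form: given $X(0) = x_0$, the population at time $t$ is the sum of a Binomial($x_0$, $e^{-k_2 t}$) count of survivors and an independent Poisson($(k_1/k_2)(1 - e^{-k_2 t})$) count of new arrivals. The sketch below uses that result to evaluate the exact likelihood of discrete time course data. The function names and the data in the example are assumptions for illustration, and the code requires $k_1 > 0$, $k_2 > 0$ and strictly increasing observation times.

```python
import math


def transition_pmf(x_new, x_old, k1, k2, dt):
    """Pr[X(t + dt) = x_new | X(t) = x_old] for the immigration-death process."""
    p = math.exp(-k2 * dt)          # survival probability of one individual
    nu = (k1 / k2) * (1.0 - p)      # mean number of surviving immigrants
    total = 0.0
    for survivors in range(min(x_old, x_new) + 1):
        binom = (math.comb(x_old, survivors)
                 * p ** survivors * (1.0 - p) ** (x_old - survivors))
        arrivals = x_new - survivors
        log_pois = -nu + arrivals * math.log(nu) - math.lgamma(arrivals + 1)
        total += binom * math.exp(log_pois)
    return total


def log_likelihood(data, times, k1, k2):
    """Exact log-likelihood of observations `data` taken at `times`."""
    ll = 0.0
    for i in range(1, len(data)):
        dt = times[i] - times[i - 1]
        ll += math.log(transition_pmf(data[i], data[i - 1], k1, k2, dt))
    return ll


# Hypothetical observations, for illustration only.
example = log_likelihood([3, 2, 4], [0.0, 0.5, 1.0], k1=2.0, k2=1.0)
```

As the gap between observations grows, the transition distribution approaches Poisson($k_1/k_2$), which depends on the parameters only through their ratio. That is one plausible reason for a flat likelihood along a ridge in the ($k_1$, $k_2$) plane, offered here as an interpretation rather than the speaker's stated explanation.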
Lotka-Volterra model
The Lotka-Volterra predator-prey system describes the time evolution of two species, $Y_1$ and $Y_2$
- Prey birth: $Y_1 \to 2Y_1$
- Interaction: $Y_1 + Y_2 \to 2Y_2$
- Predator death: $Y_2 \to \emptyset$
- Since the Lotka-Volterra model contains a non-linear rate law, the $i$th moment equation depends on the $(i+1)$th moment.
[Figure: predator and prey populations against time]
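To make the closure concrete for this model, the sketch below writes out the five ODEs (two means, two variances, one covariance) obtained by assuming mass-action hazards $c_1 Y_1$, $c_2 Y_1 Y_2$ and $c_3 Y_2$ and setting third central moments to zero (a Normal closure). The hazard forms, the function names, the RK4 integrator and the example values are assumptions made for illustration; this is not claimed to be the implementation used for the talk's results.

```python
def lv_closure_rhs(state, c1, c2, c3):
    """Right-hand side of the Normal-closure moment ODEs for Lotka-Volterra.

    state = [m1, m2, v11, v12, v22]: prey mean, predator mean,
    prey variance, covariance, predator variance.
    """
    m1, m2, v11, v12, v22 = state
    h1 = c1 * m1                     # E[prey birth hazard]
    h2 = c2 * (m1 * m2 + v12)        # E[interaction hazard]
    h3 = c3 * m2                     # E[predator death hazard]
    g1 = c2 * (m1 * v12 + m2 * v11)  # E[(Y1 - m1) c2 Y1 Y2] under the closure
    g2 = c2 * (m1 * v22 + m2 * v12)  # E[(Y2 - m2) c2 Y1 Y2] under the closure
    return [
        h1 - h2,
        h2 - h3,
        2.0 * (c1 * v11 - g1) + h1 + h2,
        (g1 - c3 * v12) + (c1 * v12 - g2) - h2,
        2.0 * (g2 - c3 * v22) + h2 + h3,
    ]


def lv_closure_moments(y1, y2, c1, c2, c3, dt, steps=1000):
    """Moments at time dt, starting from an exactly observed state (y1, y2)."""
    state = [float(y1), float(y2), 0.0, 0.0, 0.0]
    h = dt / steps
    for _ in range(steps):
        ka = lv_closure_rhs(state, c1, c2, c3)
        kb = lv_closure_rhs([s + 0.5 * h * k for s, k in zip(state, ka)], c1, c2, c3)
        kc = lv_closure_rhs([s + 0.5 * h * k for s, k in zip(state, kb)], c1, c2, c3)
        kd = lv_closure_rhs([s + h * k for s, k in zip(state, kc)], c1, c2, c3)
        state = [s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, ka, kb, kc, kd)]
    return state


# Illustrative rate constants and starting populations.
m1, m2, v11, v12, v22 = lv_closure_moments(100, 100, 0.5, 0.0025, 0.3, dt=1.0)
```

With `c2 = 0` the two species decouple into a pure birth and a pure death process, whose means and variances are known in closed form; that limit is a convenient sanity check on the equations.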
Parameter estimation
- Let $\mathbf{Y}(t_u) = (Y_1(t_u), Y_2(t_u))'$ be the vector of the observed prey and predator populations
- To infer $c_1$, $c_2$ and $c_3$, we need to estimate
$$\Pr[\mathbf{Y}(t_u) \mid \mathbf{Y}(t_{u-1}), \mathbf{c}]$$
i.e. the solution of the forward Kolmogorov equation
- We will use moment closure to estimate this distribution:
$$\mathbf{Y}(t_u) \mid \mathbf{Y}(t_{u-1}), \mathbf{c} \sim N(\psi_{u-1}, \Sigma_{u-1})$$
where $\psi_{u-1}$ and $\Sigma_{u-1}$ are calculated using the moment closure approximation
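Once $\psi_{u-1}$ and $\Sigma_{u-1}$ are available, each transition contributes a bivariate Normal log-density to the likelihood. A minimal sketch of that term is given below. The function name is an assumption, and returning minus infinity for a covariance matrix that is not positive definite is a design choice made here (it makes a sampler reject parameter values where the closure has failed), not something stated in the talk.

```python
import math


def bivariate_normal_logpdf(y, psi, sigma):
    """Log-density of a bivariate Normal N(psi, sigma) evaluated at y.

    y, psi: sequences of length 2; sigma: 2x2 nested sequence.
    """
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    if det <= 0.0 or s11 <= 0.0:
        return float("-inf")  # not a valid covariance matrix
    d1, d2 = y[0] - psi[0], y[1] - psi[1]
    # Quadratic form d' sigma^{-1} d, using the explicit 2x2 inverse.
    quad = (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


# Hypothetical moments and observation, for illustration only.
term = bivariate_normal_logpdf([102.0, 53.0], [100.0, 50.0], [[4.0, 0.0], [0.0, 9.0]])
```

Summing such terms over consecutive observation pairs gives the approximate log-likelihood for a fully observed data set.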
Bayesian parameter inference
- We summarise our beliefs about $\mathbf{c}$ and the unobserved predator population $Y_2(0)$ via uninformative priors
- The joint posterior for parameters and unobserved states (for a single data set) is
$$p(\mathbf{y}_2, \mathbf{c} \mid \mathbf{y}_1) \propto p(\mathbf{c}) \, p(y_2(0)) \prod_{u=1}^{40} p(\mathbf{y}(t_u) \mid \mathbf{y}(t_{u-1}), \mathbf{c})$$
- For the results shown, we used a vanilla Metropolis-Hastings step to explore the parameter and state spaces
- For more complicated models, we can use a Durham & Gallant style bridge (Milner, G & Wilkinson, 2012)
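A vanilla Metropolis-Hastings step can be sketched as a Gaussian random walk that accepts or rejects each proposal according to the ratio of posterior densities. The code below is a generic illustration under stated assumptions: the function name, the step size and the toy target are invented for the example, and the target is a stand-in for the moment-closure posterior rather than the model from the talk.

```python
import math
import random


def metropolis_hastings(log_post, start, n_iter, step=0.1, seed=None):
    """Random-walk Metropolis-Hastings.

    log_post: function mapping a list of parameters to a log-density
              (up to an additive constant); start must have finite log_post.
    Returns the list of samples and the acceptance rate.
    """
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_post(current)
    samples, accepted = [], 0
    for _ in range(n_iter):
        proposal = [x + rng.gauss(0.0, step) for x in current]
        proposal_lp = log_post(proposal)
        # The proposal is symmetric, so the acceptance ratio is the posterior ratio.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        samples.append(list(current))
    return samples, accepted / n_iter


def toy_log_post(theta):
    """Toy target: a Normal density with mean 3 and unit variance."""
    return -0.5 * (theta[0] - 3.0) ** 2


samples, rate = metropolis_hastings(toy_log_post, [0.0], 20000, step=1.0, seed=42)
posterior_mean = sum(s[0] for s in samples[5000:]) / len(samples[5000:])
```

For rate constants, which must be positive, a common variant is to run the random walk on the log scale; whether the talk did so is not stated.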
Results
[Figure: parameter values for $c_1$, $c_2$ and $c_3$ under the moment closure, diffusion and exact methods; panels: fully observed and partially observed]
Auto regulation system
- This system contains twelve reactions and six species
- The species populations range from zero (for species i) to around 65,000 for species G
- The moment closure approximation yields a closed set of twenty-seven ODEs
  - Six ODEs for the means
  - Six ODEs for the variances
  - Fifteen ODEs for the covariance terms
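The count of twenty-seven follows from simple combinatorics: a closure at second order needs one ODE per mean, one per variance and one per distinct pair of species. A small sketch of that arithmetic (the function name is an assumption):

```python
def closure_ode_count(n_species):
    """Number of ODEs in a second-order moment closure with no reductions."""
    means = n_species
    variances = n_species
    covariances = n_species * (n_species - 1) // 2  # unordered pairs
    return means + variances + covariances


auto_regulation = closure_ode_count(6)  # 6 + 6 + 15
lotka_volterra = closure_ode_count(2)   # 2 + 2 + 1
```

The count grows quadratically with the number of species, so the ODE system stays modest even for models where exact simulation is slow.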
Stochastic realisation
[Figure: population against time for species g, i, r_g and r_i, with separate panels for I and G]
Parameter inference
[Figure: parameter value for each of $c_1$ to $c_8$; panels: fully observed and partially observed]
- Posterior distributions for $c_1$ to $c_8$: mean ± 2 sd. True values in red
- Given information on all species, inference is reasonable
- For most of the parameters, fewer data points result in larger credible regions
- But not in all cases!
Future work
- Techniques for assessing the moment closure approximation
- Better closure techniques
- Computer emulation for moments
- Using the moment closure approximation as a proposal distribution in an MCMC algorithm
  - The proposal can be (almost) anything we want
  - The likelihood can be calculated using anything we want
Acknowledgements
- Peter Milner
- Darren Wilkinson
References
- Gillespie, CS. Moment closure approximations for mass-action models. IET Systems Biology, 2009.
- Gillespie, CS, Golightly, A. Bayesian inference for generalized stochastic population growth models with application to aphids. Journal of the Royal Statistical Society, Series C, 2010.
- Milner, P, Gillespie, CS, Wilkinson, DJ. Moment closure approximations for stochastic kinetic models with rational rate laws. Mathematical Biosciences, 2011.
- Milner, P, Gillespie, CS, Wilkinson, DJ. Moment closure based parameter inference of stochastic kinetic models. Statistics and Computing, 2012.