Birth-death process

  • Single species and two reactions \[ X \longrightarrow 2X \quad\text{and}\quad 2X \longrightarrow X \]
  • Propensity functions \(\lambda X\) and \(\mu X\)

Deterministic representation

The deterministic model is \[ \frac{dX(t)}{dt} = (\lambda -\mu) X(t) \] which can be solved to give \(X(t) = x_0 \exp[ (\lambda-\mu) t]\).

  • Also known as the macroscopic rate equation
  • Mean equation from the linear noise approximation

Stochastic representation

  • In the stochastic framework, each reaction has a probability of occurring
  • The analogous version of the birth-death process is the difference equation

\[ \frac{dp_x}{dt} = \lambda (x-1)p_{x-1} + \mu (x+1)p_{x+1} - (\lambda + \mu) x p_x \]

  • Usually called the forward Kolmogorov equation or chemical master equation

plot of chunk unnamed-chunk-1

Moment equations

  • Multiplying the CME by \(e^{n\theta}\) and summing over \(x\) gives \[ \frac{\partial M(\theta;t)}{\partial t} = [\lambda (e^{\theta} - 1) + \mu (e^{-\theta} -1 )] \frac{\partial M(\theta;t)}{\partial \theta} \] where \[ M(\theta;t ) = \sum_{x=0}^{\infty} e^{x\theta} p_x(t) \]
  • Differentiating this pde w.r.t \(\theta\) and setting \(\theta=0\) yields \[ \frac{d E[X(t)]}{dt} = (\lambda - \mu)E[X(t)] \] where \(E[X(t)]\) is the stochastic mean

The mean equation

\[ \frac{dE[X(t)]}{dt} = (\lambda - \mu)E[X(t)] \]

  • This ODE is solvable - the associated CME is also tractable
    • When we can't solve the CME, the associated mean ODE is also intractable
  • The equation for the mean and deterministic ODE are identical
    • When the rate laws are linear, the stochastic mean and deterministic solution always correspond

The variance equation

  • If we differentiate the pde w.r.t \(\theta\) twice and set \(\theta=0\), we get \[ \frac{dE[X(t)^2]}{dt} = (\lambda + \mu)E[X(t)] + 2(\lambda - \mu)E[X(t)^2] \] and hence the variance \(Var[X(t)] = E[X(t)^2] - E[X(t)]^2\)
  • Differentiating three times gives an expression for the skewness, etc

Dimerisation system

\[ 2X_1 \longrightarrow X_2 \quad \mbox{and} \quad X_2 \longrightarrow 2 X_1 \] with propensities \(h_1(X_1, X_2) = 0.5 k_1 X_1(X_1-1)\) and \(h_2(X_2) = k_2 X_2\)

plot of chunk unnamed-chunk-2

Dimerisation moment equations

We formulate the dimer model in terms of moment equations \[ \begin{align*} \frac{dE[X_1]}{dt} &= 0.5 k_1 (E[X_1^2] - E[X_1]) -k_2 E[X_1] \\ \frac{dE[X_1^2]}{dt} &= k_1 (E[X_1^2X_2] - E[X_1 X_2]) + 0.5 k_1(E[X_1^2] - E[X_1]) \\ &\quad+k_2 (E[X_1] - 2 E[X_1^2]) \end{align*} \]

where \(E[X_1]\) is the mean of \(X_1\) and \(E[X_1^2] - E[X_1]^2\) is the variance

The \(i^{{\small\text{th}}}\) moment equation depends on the \((i+1)^{{\small \text{th}}}\) equation

Deterministic approximates stochastic

Rewriting \[ \frac{dE[X_1]}{dt} = 0.5 k_1 ( E[X_1^2] - E[X_1]) -k_2 E[X_1] \] in terms of its variance, i.e. \(E[X_1^2] = Var[X_1] + E[X_1]^2\), we get \[ \frac{dE[X_1]}{dt} = 0.5 k_1 E[X_1](E[X_1]-1) + 0.5 k_1 Var[X_1]-k_2 E[X_1] \]

  • Setting \(Var[X_1] = 0\), recovers the deterministic equation
  • The deterministic model approximates the stochastic
  • When we have polynomial rate laws, setting the variance to zero results in the deterministic equation

Moment Closure

Moment closure

  • Typically use Gaussian or Log-normal, since it's easy to generalise to the multivariate case
Distribution Moments Cumulants
Gaussian \(\mu_3 = 3 \mu_2 \mu_1 - 2 \mu_1^3\) \(\kappa_3=0\)
Poisson \(\mu_3 = \mu_1+3\mu_1^2 + \mu_1^3\) \(\kappa_3=\kappa_2=\kappa_1\)
Log-Normal \(\mu_3 = (\mu_2/\mu_1)^3\) \(\kappa_3=(\kappa_2/\kappa_1)^3 + 3\kappa_2^2/\kappa_1\)

Rational rate laws

\[ X \rightarrow \emptyset \] with propensity function \[ h(X) = \frac{k_1 X}{k_2 + X} \] The associated forward Kolmogorov equation is \[ \frac{dp_x(t)}{dt} = p_{x+1}(t)\frac{k_1(x+1)}{k_2 +x +1} - p_x(t)\frac{k_1x}{k_2 +x}, \] where \(p_x(t)\) is the probability of observing \(x\) individuals at time \(t\)

Moment equations

To find the associated moment equations of we multiply by \((k_2+x+1)(k_2+x)\) to give \[ \frac{dp_x(t)}{dt}(k_2+n+1)(k_2+n) = p_{x+1}(t)k_1(n+1)(k_2+n) - p_x(t) k_1 x (k+x+1). \] which yields the equation for the (stochastic) mean \[ \frac{dE(X)}{dt}=-\frac{k_1 E(X)}{k_2+1/2+E(X)}-\frac{dVar(X)}{dt} \]

plot of chunk unnamed-chunk-3plot of chunk unnamed-chunk-3

Rate laws involving powers

If we consider a slightly more complicated propensity function \[ h(X) = \frac{k_1 X^2}{k_2 + X^2} \] we get \[ \frac{dE(X)}{ dt} =\frac{1}{2k_2}\left[-4k_1 E(X^3) - (2k_2+1)\frac{d E(X^2)}{dt} - 2\frac{d E(X^3)}{dt} - \frac{d E(X^4)}{dt}\right]. \]

Approximation levels

  1. All terms for the means, variances and covariance in the moment equations are included
  2. Assume the variance and covariance are zero in the moment equations for the means
  3. Assume all differentials on the RHS are zero

General moment expansion

Suppose we have \(N\) chemical species \(\{X_1, \ldots, X_N\}\) and \(L\) reactions \(\{R_1, \ldots, R_L\}\). Reaction \(R_l\) corresponds to \[ \underline s_{l1} X_1 + \ldots + \underline s_{lN} X_N \overset{k_l}{\longrightarrow} \overline s_{l1} X_1 + \ldots + \overline s_{lN} X_n, \] where \(\mathbf{ \underline{s}_{l}}\) and \(\mathbf{ \overline s_{l}}\) are the number of reactants and the products in each species involved in reaction \(l\). The associated moment pde is \[ \frac{\partial M(\theta; t)}{\partial t} = \sum_{{\bf n}={\bf 0}}^{\infty} {\bf \theta}^{{\bf n}} \sum_{l=1}^L \sum_{{\bf i}} a_{l, {\bf i}} \sum_{{\bf k}={\bf 0}}^{{\bf n}} {\bf s_l}^{\bf k} {\bf n \choose \bf k} \mu_{{\bf n} - {\bf k} + {\bf i}}(t) \] where \(k_1+\ldots + k_N \ne 0\), \[ {\bf s_l}^{\bf k} {\bf n \choose \bf k} = s_{l1}^{k_1} {n_1 \choose k_1}\times \ldots \times s_{lN}^{k_N} {n_N \choose k_N} \] and \[ \mu_{{\bf n} - {\bf k} + {\bf j}}(t) = \mu_{n_1 - k_1 +j_1, \ldots, n_N- k_N+j_N}(t) \]

Single species

In particular, when there is only a single species in the population, we get \[ \frac{\partial M(t)}{\partial t} = \sum_{n=1}^{\infty} \theta_1^{n} \sum_{l=1}^L \sum_{i=0}^{\infty} a_{l,i} \sum_{k=1}^{n} s_{l,1}^{k} {n \choose k} \mu_{n-k+i}(t) \;. \] So the equation for the first moment is \[ \frac{d\mu_1(t)}{dt} = \sum_{l=1}^L \sum_{i=0}^{\infty} a_{l,i} \;s_{l,1} \;\mu_{i}(t) \] whilst for the \(n^{\text{th}}\) moment we have \[ \frac{d\mu_{n}(t)}{dt} = \sum_{l=1}^L \sum_{i=0}^{\infty} a_{l, i} \sum_{k=1}^{n} {n \choose k} \;s_{l,i}^{k}\; \mu_{n-k+i}(t) \;. \]

Heat shock model (Proctor et al, 2005)

  • Stochastic kinetic model of the heat shock system
    • twenty-three reactions
    • seventeen chemical species
  • A single stochastic simulation up to \(t=2000\) takes about \(35\) minutes.
  • If we convert the model to moment equations, we get \(139\) equations

Heat shock model: results

plot of chunk unnamed-chunk-4plot of chunk unnamed-chunk-4
plot of chunk unnamed-chunk-5

Why use MC/LN approximations?

  • Model building
    • a quick method for estimating the mean/variance of system
  • Steady-state analysis
    • set the differentials to zero
  • Parameter inference
    • instead of multiple forward simulations (per MCMC iteration), solve a set of ODEs - can be orders of magnitude faster
    • Lotka-Volterra model - \(10^4\) iterations in less that 20 seconds


  • When modelling biological systems, it's unusual to have detailed knowledge about all rate constants
    • Vague parameters over many orders of magnitude
    • Uncertain initial conditions
    • Unobserved species
  • How can we assess the appropriateness of the MC/LN approximation?

P53-Mdm2 oscillation model

  • Proctor and Grey, 2008
    • 16 chemical species
    • Around a dozen reactions
  • The model contains a single event
    • At \(t = 1\), set \(X = 0\)
  • If we convert the model to moment equations, we get 139 equations.

SIR model

  • Immigration: \(\emptyset \longrightarrow X + 1\) and \(\emptyset \xrightarrow{k_2 sf} Y + 1\)
  • Death: \(X \xrightarrow{k_3 sf} \emptyset\) and \(Y \longrightarrow \emptyset\)
  • Interaction: \(X +Y \longrightarrow 2 Y\)
plot of chunk unnamed-chunk-6

Assessing the moment closure approximations

  • Accuracy of the mean and variance estimates
    • For any system with a second-order reaction, we can prove that the MC/LN approximations are not exact
  • Suitability of the normality assumption
    • Continuous distribution for a discrete state space
  • The question is, are the approximations good enough?

Assessing moment accuracy

  • If the system contains only zero and first order reactions, the MC/LN moment estimates are exact
  • Intuitively, if the MC/LN produce different moment estimates, this suggests an issue with the approximation
    • Note - if the MC/LN approximations agree, this does not imply accuracy
  • For a particular set of parameters, \(\gamma_i = (c_1, \ldots, c_L)\), calculate the standardised difference \[ D_{j}^{LM}(\mathbf{\gamma}_i) = \frac{\psi_j^l(\mathbf{\gamma}_i) - \psi_j^m(\mathbf{\gamma}_i)}{\sqrt{\Sigma_{jj}^l(\mathbf{\gamma}_i)}} \] where
    • \(\psi_j(\mathbf{\gamma}_i)\) is the solution to the equation for species \(j\) for parameters \(\gamma_i\) using the LN or MC approximation
    • \(\Sigma_{jj} (\mathbf{\gamma}_i)\) is the variance estimate
  • Large values of \(D_j^{LM}\) indicate a potential issues with the approximation

Efficiently explore the parameter space

  • Typically have vague priors on parameters

Lotka Volterra system

  • Simple system with non-linear dynamics \[ X \rightarrow X + 1, \quad X + Y \rightarrow 2Y \quad \text{and} \quad Y \rightarrow \emptyset \]
  • Under certain parameter combinations, deterministic model predicts sustained oscillations
    • But the stochastic system can become extinct
  • Set-up:
    • Fix \(c_3=0.3\) and \(X(0) = Y(0)=100\)
    • Maximum simulation time \(t=30\)
  • Assess the moment closure and the linear noise approximations over the parameter ranges \[ c_1 \sim (10^{-6}, 10^0) \quad \text{and} \quad c_2 \sim (10^{-6}, 10^0) \] on the log scale

LV moment assessment

  • Form a Latin hypercube on the log scale
  • At each point, \(\gamma_i\) on the hyper cube calculate the standardised difference \(D_{j}^{LM}(\mathbf{\gamma}_i)\)
    • Values of \(D_{j}^{LM}(\mathbf{\gamma}_i) > 5\) are shown as red triangles
plot of chunk unnamed-chunk-7 plot of chunk unnamed-chunk-8

Assessing the normality

  • We can borrow ideas from Gaussian process assessment
  • At each point, \(\gamma_i\), on the design space
    • estimate the mean, \(\psi\), and co-variance, \(\Sigma\), using the LN/MC approximations
    • simulate using an exact simulator
  • Prediction error for species \(j\),
    \[ e_{i,j} = y_j^*(\mathbf{\gamma}_i) - \psi_j^{l}(\mathbf{\gamma}_i) \]
  • Easier to work with the standardised prediction errors \[ e_{i,j}^* = \frac{y_j^*(\mathbf{\gamma}_i) - \psi_j^l(\mathbf{\gamma}_i)}{\sqrt{\Sigma_{jj}^l(\mathbf{\gamma}_i)}} \;. \]
  • If the assumptions are correct, then \(e_{i,j}^* \sim N(0, 1)\)

Interval diagnostics

  • Construct \(100\alpha\)% confidence interval using the approximate mean \(\psi\) and variance, \(\Sigma\)
  • Denote a particular confidence region at design point \(i\), for species \(j\), as \(CI_{i,j} (\alpha)\)
  • The proportion of simulated values that land within the confidence region is given by \[ D_j^{CR} = \frac{1}{n_d} \sum_{i=1}^{n_d} \mathbf{1}[y_j^*(\mathbf{\gamma}_i) \in CI_{i,j}(\alpha)], \] where \(\mathbf{1}[\cdot]\) is the indicator function. The value of \(D_{j}^{CR}\) should be approximately equal to \(\alpha\)

Schlogl model

  • A well known test system that exhibits bi-modal and non-linear characteristics at certain parameter combinations
  • The system contains four reactions \[ 2X_1 + X_2 \xrightarrow{\phantom{a}c_{1}\phantom{a}} 3X_1, \quad 3X_1 \xrightarrow{\phantom{a}c_{2}\phantom{a}} 2X_1 + X_2, \quad X_3 \xrightarrow{\phantom{a}c_{3}\phantom{a}} X_1 \\ X_1 \xrightarrow{\phantom{a}c_{4}\phantom{a}} X_3 \] where the distribution of \(X_1\) is bi-modal, the Gaussian assumption would clearly be inappropriate
  • Parameters \(\{c_1, c_2\}\) are fixed at \(\{3\times 10^{-7}, 10^{-4}\}\) and the initial conditions are held constant at \[ \{X_1(0), X_2(0), X_3(0)\} = \{250, 10^5, 2\times 10^5\} \;. \]
  • Two dimensional hypercube size \(n_d=1,000\) on the \(\log_{10}\) space for the parameters \(\{c_3, c_4\}\)
  • At each of the \(n_d\) points on the hypercube, we simulate using the direct method and the LNA to a maximum time of \(t=5\).

Schlogl model: diagnostics

  • Latin hyper-cube
  • Standardised errors
  • Q-Q plot
  • \(X_1\) vs \(e_i^*\)

Schlogl model: simulations

  • Investigate large standardised errors more carefully


  • Moment closure/linear approximations provide a quick, easy to construct approximation to the system of interest
    • Intelligent emulator
  • Care needs to be taken to assess the suitability of the approximation


  • Gillespie, CS. Moment closure approximations for mass-action models. IET Systems Biology 2009
  • Milner, P, Gillespie, CS and Wilkinson, DJ. Moment closure approximations for stochastic kinetic models with rational rate laws. Mathematical Biosciences 2011
  • Gillespie, C.S. Diagnostics for assessing the accuracy of approximate stochastic simulators, submitted