Prof Darren Wilkinson Professor of Stochastic Modelling
School of Mathematics & Statistics
Newcastle University

Gibbs sampling

This week we will look at how to construct a Gibbs sampler for sampling from full conditionals of multivariate distributions, and how to carry out a basic analysis of the output. First read carefully through the following example, trying it out as you go along, then tackle the exercises below.

Exercises

1.
Try simulating bivariate normal random vectors using the direct method and the Gibbs sampling approach. Try some different values of the correlation coefficient. Notice how the Gibbs sampler mixes more slowly the closer the correlation coefficient gets to one. Large sample sizes are required if the components are highly correlated.
2. (for postgrads)
Try out the C code, and make sure you understand how to link the C code into R under Linux.

Project 2

Implement in R a Gibbs sampler for sampling from the posterior distribution for the Normal model with unknown mean and precision and independent normal and gamma priors. We have seen in the lectures that the Gibbs sampler is the most straightforward way to sample from this posterior distribution. Use the full conditionals from your notes to implement the sampler. Initialise the sampler at the prior expectation for the mean and precision. Your main loop should take as arguments the number of samples required, the sufficient statistics for the data (the sample size, the mean, and sample variance) and all the hyper-parameters which specify the prior. Once you have the function working, you should also write a "wrapper" which takes a data vector as an argument instead of the sufficient statistics. The wrapper will calculate the required sufficient statistics, and then call the original function.

Suppose that your beliefs about the mean can be described by a N(10,100) distribution and your beliefs about the precision by a Gamma(3,11) distribution. Suppose further that you observe a sample of size 15 with mean 25 and sample variance 20. Obtain samples from the posterior, plot histograms of the posterior marginals, and obtain summary statistics for them. Remember that you can transform your samples for the posterior precision into samples for the posterior standard deviation by square-rooting the reciprocal of the precision.

Your report should be 4-6 sides of A4, and should give a coherent description of the sampler and its implementation, before going on to analyse the example problem. The hand-in deadline is 12pm on the last day of term, but remember that the computer clusters are very busy at the end of term.

Note that the old version of R parameterised the gamma distribution differently to us, but the new version doesn't! The functions in question are dgamma() and rgamma() - check the help pages for these functions and make sure you understand how they work. Also remember that R uses standard deviation rather than variance for the normal generator. And don't get mixed up between variances and precisions... :-)

Darren J Wilkinson Book: Stochastic Modelling for
	      Systems Biology  
darren.wilkinson@ncl.ac.uk
http://www.staff.ncl.ac.uk/d.j.wilkinson/