To develop and test effective parallel computing strategies for
computationally-intensive Bayesian inference for very large
lattice Markov spatio-temporal models.
Objectives
An 8 CPU Linux Beowulf cluster will be
built, and used to test parallel algorithms for Bayesian
computation. Different algorithms will be developed and tested for
efficiency in terms of performance increase per processor, taking into
account computational overheads associated with process
synchronisation, scheduling, and inter-process communication. Once
efficient strategies have been identified, these will be used both for
on-going spatio-temporal model research on the 8 CPU cluster, and also
for evidence supporting a future research council grant proposal for a
large parallel-computing facility for Bayesian computation in large
highly-structured stochastic systems.
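As an illustration of how "performance increase per processor" might be quantified, the following minimal sketch computes speedup and parallel efficiency from wall-clock timings. The function names and the timings are hypothetical, chosen only for illustration; they are not measurements from the cluster described here.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-processor run time to parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Speedup per processor; 1.0 would mean perfect linear scaling."""
    return speedup(t_serial, t_parallel) / n_procs


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by processor count.
    # These are made-up numbers, not results from this project.
    timings = {1: 800.0, 2: 420.0, 4: 230.0, 8: 140.0}
    t1 = timings[1]
    for p in sorted(timings):
        s = speedup(t1, timings[p])
        e = efficiency(t1, timings[p], p)
        print(f"{p} CPUs: speedup {s:.2f}, efficiency {e:.2f}")
```

Efficiency below 1.0 reflects the overheads named above (synchronisation, scheduling and inter-process communication), so comparing it across algorithms gives one simple way to rank them.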
Description
Bayesian inference for very large highly-structured models is usually
carried out using Markov Chain Monte Carlo (MCMC) techniques. For very
large models with many latent (unobserved) variables, the performance
of standard MCMC algorithms based on updating each latent variable in
turn is very poor. One strategy for improving performance is to use
blocking or local computation techniques for updating many variables
simultaneously. Such techniques have been used successfully by the
applicant in Boys, Henderson and Wilkinson (2000), Goldstein and
Wilkinson (2000), and Wilkinson and Yeung (2002, 2003). They are
very effective at improving the performance of the
MCMC scheme, but carry a large computational overhead. One particularly
effective scheme for blocking explored in Wilkinson
and Yeung (2003) is based on the use of sparse matrix
techniques. These techniques have been implemented in a (sequential)
software library, GDAGsim.
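To make the idea of a sparse-matrix block update concrete, here is a minimal sketch that draws a whole block of Gaussian variables jointly, assuming the block has a tridiagonal precision matrix Q (the simplest sparse case, e.g. a chain). It factorises Q = L L^T and solves L^T x = z for standard normal z. The function names are hypothetical, and this is not the GDAGsim interface or necessarily the algorithm it uses.

```python
import math
import random


def cholesky_tridiagonal(diag, off):
    """Factorise a tridiagonal precision matrix Q = L L^T.

    diag holds the n diagonal entries of Q, off the n-1 sub-diagonal entries.
    Returns the diagonal and sub-diagonal of the lower-bidiagonal factor L.
    """
    n = len(diag)
    l_diag = [0.0] * n
    l_off = [0.0] * (n - 1)
    l_diag[0] = math.sqrt(diag[0])
    for i in range(n - 1):
        l_off[i] = off[i] / l_diag[i]
        l_diag[i + 1] = math.sqrt(diag[i + 1] - l_off[i] ** 2)
    return l_diag, l_off


def sample_block(diag, off, rng):
    """Draw x ~ N(0, Q^{-1}) in one joint update by solving L^T x = z."""
    l_diag, l_off = cholesky_tridiagonal(diag, off)
    n = len(diag)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [0.0] * n
    # Back substitution on the upper-bidiagonal system L^T x = z.
    x[n - 1] = z[n - 1] / l_diag[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (z[i] - l_off[i] * x[i + 1]) / l_diag[i]
    return x


if __name__ == "__main__":
    # Illustrative precision matrix: 2 on the diagonal, -1 off the diagonal.
    print(sample_block([2.0] * 5, [-1.0] * 4, random.Random(1)))
```

Because L inherits the sparsity of Q, the cost here is linear in the block size; for general sparse structures the factorisation is the expensive step.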
The performance of this library could be dramatically improved if
parallel sparse direct solution algorithms, such as those provided by the
PSPASES library, were used. Alternatively, updating schemes based on many
blocks can be speeded up if the updating scheme is chosen carefully, and
conditionally independent blocks are updated in parallel using a
message-passing scheme based on the Message-Passing Interface
(MPI). The relative efficiency of different schemes will depend on the
precise topology of the underlying conditional independence structure
for the model, and a major goal of the project is to identify
parallelisation techniques that are particularly effective in the
context of lattice Markov spatio-temporal models.
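One simple way to find conditionally independent blocks on a lattice can be sketched as follows, assuming a first-order (four-nearest-neighbour) Markov structure on a rectangular grid. Under that assumption a checkerboard colouring gives two sets of sites, each with no neighbouring pairs inside it, so all sites of one colour could be updated at the same time given the other colour. The function names are hypothetical and this is an illustration only, not the scheme developed in the project.

```python
def checkerboard_blocks(n_rows, n_cols):
    """Split lattice sites into two colour classes by parity of row + column."""
    blocks = ([], [])
    for r in range(n_rows):
        for c in range(n_cols):
            blocks[(r + c) % 2].append((r, c))
    return blocks


def neighbours(site, n_rows, n_cols):
    """Four-nearest-neighbour sites that lie inside the lattice."""
    r, c = site
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < n_rows and 0 <= j < n_cols]


def has_no_internal_edges(block, n_rows, n_cols):
    """True if no two sites in the block are lattice neighbours."""
    members = set(block)
    return all(
        nb not in members
        for site in block
        for nb in neighbours(site, n_rows, n_cols)
    )


if __name__ == "__main__":
    black, white = checkerboard_blocks(4, 5)
    print(len(black), len(white))
    print(has_no_internal_edges(black, 4, 5), has_no_internal_edges(white, 4, 5))
```

In a message-passing implementation each colour class would typically be divided among the processes, with values on the boundaries between processes exchanged after each half-sweep; the cost of that exchange is what ties efficiency to the topology of the model.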
1/7/02: Testing of parallel strategies for lattice-Markov
spatio-temporal models well under way.
20/8/02: Some code (gsl-sprng) released to
ease the development of parallel stochastic simulation
codes.
31/1/03: Completed chapter on Parallel Bayesian computation
for the Handbook of Parallel Computing and Statistics (see my publication
list for full citation details). This publication details many of the
research findings facilitated by this grant. It includes a case study
examining parallelisation of an MCMC algorithm for fitting stochastic
volatility models. The source code is freely available.