Prof Darren Wilkinson
Professor of Stochastic Modelling
School of Mathematics & Statistics
Newcastle University

Independence sampling and transformation methods

This week we will look at writing software to simulate random variables from standard distributions. First read carefully through the following two examples, trying them out as you go along, then tackle the exercises below.

Exercises

1.
Download the R source code for generation of exponential random variables, and test for yourself the difference in execution times of the different implementations. (If you get stuck, a rough sketch of the idea is given just after this exercise list.)
2.
Download and test the R code for generation of Normal random variables. Write your own non-vectorised version of the Box-Muller algorithm in R, using an explicit "for" loop, to generate a vector of standard Normal random variables. Note how much slower it is than the vectorised version (a sketch of which is given after this exercise list).
3.
Write a "wrapper" function which generates vectors of Normal random variables with arbitrary mean and variance. A "wrapper" function is just a very short function which calls another function to do most of its work. So, your function should just call your standard Normal generator to get a vector of standard Normal variates, and then re-scale the elements appropriately to give a vector with the correct mean and variance. (A sketch is given after this exercise list.)
4.
If X(1), X(2), ..., X(p) are independent standard Normal random variables, then Z = X(1)*X(1) + X(2)*X(2) + ... + X(p)*X(p) is a Chi-squared random variable with p degrees of freedom. Write some R code to generate vectors of Chi-squared random variables. See the hints below!
5.
The Cauchy distribution has density function f(x) = 1/(Pi*(1+x*x)). Use the transformation method to generate variates from this distribution. Hint: in both R and C, the tangent function is tan(). Also note that the Cauchy distribution is long-tailed, so you should expect a few very large or very small values in your sample. As a result, if you do a histogram of a sample of Cauchy variables, you often get a big spike in the middle. One way around this is to do a histogram of only the values which aren't too extreme. So, if your sample is in x, rather than doing hist(x,30), do something like hist(x[abs(x)<10],30). (A sketch of the transformation is given after this exercise list.)
6. (for postgrads)
See if you can download, compile, and run the C code for generation of exponential and Normal random variables (best done on a Unix box). Verify that it offers little in the way of performance gains over the vectorised R code. Extend the C code to generate Chi-squared and Cauchy random variables.
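If you cannot obtain the downloadable exponential code, the following is a minimal sketch of the sort of comparison exercise 1 has in mind, assuming the inversion method X = -log(U)/lambda; the function names rexp.loop and rexp.vec are purely illustrative and need not match the names used in the downloadable file.

rexp.loop <- function(n, lambda=1)
{
        x <- numeric(n)
        for (i in 1:n) {
                x[i] <- -log(runif(1))/lambda   # inverse CDF of Exp(lambda), one variate at a time
        }
        x
}
rexp.vec <- function(n, lambda=1)
{
        -log(runif(n))/lambda                   # the same transformation, vectorised
}
system.time(rexp.loop(100000))
system.time(rexp.vec(100000))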
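For exercises 2 and 3, a minimal sketch of a vectorised Box-Muller generator and a mean/variance wrapper might look like this (the names rnorm.bm and rnorm.general are again just illustrative):

rnorm.bm <- function(n)
{
        u1 <- runif(ceiling(n/2))
        u2 <- runif(ceiling(n/2))
        r <- sqrt(-2*log(u1))
        z <- c(r*cos(2*pi*u2), r*sin(2*pi*u2))  # each (u1,u2) pair yields two N(0,1) variates
        z[1:n]                                  # discard the spare variate if n is odd
}
rnorm.general <- function(n, mu=0, sigma2=1)
{
        mu + sqrt(sigma2)*rnorm.bm(n)           # re-scale to mean mu and variance sigma2
}

The non-vectorised version asked for in exercise 2 simply replaces the vector operations with an explicit "for" loop that fills the output vector a pair of variates at a time.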
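For exercise 5, the Cauchy distribution function is F(x) = 1/2 + arctan(x)/Pi, so inverting F gives x = tan(Pi*(u - 1/2)). A minimal sketch (the name rcauchy.inv is illustrative):

rcauchy.inv <- function(n)
{
        tan(pi*(runif(n) - 0.5))        # inverse CDF of the standard Cauchy
}
x <- rcauchy.inv(10000)
hist(x[abs(x) < 10], 30)                # trim the extreme values before plotting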

Hints and Tips

Read the hints and tips page!

For the Chi-squared question, the following may be useful!

n<-10000                       # e.g. number of Chi-squared variates required
p<-3                           # e.g. degrees of freedom
m<-matrix(rnorm(n*p),ncol=p)   # creates a matrix with n rows and p columns
mat<-m*m                       # "*" is element-wise product
mat %*% rep(1,p)               # "%*%" is matrix multiplication (here, row sums)
An alternative to the last line is
apply(mat,1,sum)
Try both and see which is faster.
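
Putting the hint together into a single function might give something like the following (the name rchisq.sum is just illustrative; note that R already has a built-in rchisq):

rchisq.sum <- function(n, p)
{
        m <- matrix(rnorm(n*p), ncol=p)         # n rows of p standard Normals
        as.vector((m*m) %*% rep(1, p))          # row sums of the squared entries
}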

Also, in the lecture on transformation methods we mentioned summing uniform random variables in order to generate approximately Normal random variables, exploiting the central limit theorem. The following R function demonstrates the convergence: each panel shows a histogram of 10,000 sums of n independent U(0,1) variates, for increasing n.

clt <- function(r = 3, cc = 2)
{
        op <- par(mfrow = c(r, cc))     # set up an r x cc grid of plots
        n <- r * cc
        for (i in 1:n) {
                # each row sum of an i-column matrix of uniforms is a sum of i U(0,1) variates
                z <- matrix(runif(10000 * i), ncol = i) %*% rep(1, i)
                hist(z, breaks = 30, main = paste("n=", i), xlab = "")
        }
        par(op)                         # restore the previous plot layout
}
clt()

Darren J Wilkinson
Book: Stochastic Modelling for Systems Biology
darren.wilkinson@ncl.ac.uk
http://www.staff.ncl.ac.uk/d.j.wilkinson/