Independence sampling and transformation methods
This week we will look at writing software to simulate random
variables from standard distributions. First read carefully
through the following two examples, trying them out as you go
along, then tackle the exercises below.
Exercises
- 1.
- Download the R source code for generation of exponential
random varaibles, and test for yourself the difference in execution
times for the different implementations.
- 2.
- Download and test the R code for generation of Normal
random variables. Write your own non-vectorised version of the
Box-Muller algorithm in R using an explicit "for" loop for the
generation of a vector of standard Normal random variables. Note how
much slower it is than the vectorised version.
- 3.
- Write a "wrapper" function which generates vectors of Normal
random variables with arbitrary mean and variance. A "wrapper"
function is just a very short function which calls another
function in order to do most of its work. So, your function
should just call your standard normal generator to get a
vector of standard normal variates, and then re-scale the
elements appropriately to give a vector with the correct mean
and variance.
- 4.
- If X(1), X(2), ..., X(p) are independent standard
Normal random variables, then Z=X(1)*X(1)+X(2)*X(2)+ .... +
X(p)*X(p) is a Chi-squared random variable with p degrees
of freedom. Write some R code to generate vectors of Chi-squared
random variables. See the hints below!
- 5.
- The Cauchy distribution has density function
f(x)=1/(Pi*(1+x*x)). Use the transformation method to
generate variates from this distribution. Hint: in R and C,
the tan function is tan(). Also note that the
Cauchy distribution is a long-tailed distribution, so you
would expect to get a few very large or small values in your
sample. The effect of this is that if you do a histogram of
a sample of Cauchy variables, you often get a big spike in the
middle. One way around this is to just do a histogram of the
values which aren't too extreme. So, if your sample is in
x, rather than doing, hist(x,30), do
something like hist(x[abs(x)<10],30).
- 6. (for postgrads)
- See if you can download, compile, and run the C code for
generation of exponential and Normal random variables (best done on a
Unix box). Verify that it
offers little in the way of performance gains over the vectorised R
code. Extend the C code to generate Chi-squared and Cauchy random variables.
Hints and Tips
Read the hints and tips page!
For the Chi-squared question, the following may be useful!
m<-matrix(rnorm(n*p),ncol=p) # creates a matrix with p columns
mat<-m*m # "*" is element-wise product
mat %*% rep(1,p) # "%*%" is matrix multiplication
An alternative to the last line is
apply(mat,1,sum)
Try both and see which is faster.
Also, in the lecture on transformation methods we mentioned sums of
uniforms in order to generate normal random variables,
exploiting the central limit theorem. The following R function
demonstrates that convergence.
clt<-function (r = 3, cc = 2)
{
op <- par(mfrow = c(r, cc))
n <- r * cc
for (i in 1:n) {
hist(matrix(runif(10000 * i), ncol = i) %*% rep(1,
i), breaks = 30, main = paste("n=", i),
xlab = "")
}
par(op)
}
clt()