Prof Darren Wilkinson Professor of Stochastic Modelling
School of Mathematics & Statistics
Newcastle University

An example of rejection sampling

We are first going to look at a simple example of rejection sampling of the random variable Z which has pdf f(z)=6z(1-z) on [0,1]. Note that Z has a Beta(2,2) distribution. Also note that f(z) has a maximum of 3/2 at 1/2. So to simulate Z we can simulate X from Unif(0,1) and Y from Unif(0,3/2) and accept X if Y<f(X).

Implementation in R

We can simulate a single observation of Z as follows.
rz1<-function () 
{
        repeat {
                x <- runif(1, 0, 1)
                y <- runif(1, 0, 3/2)
                fx <- 6 * x * (1 - x)
                if (y < fx) 
                        return(x)
        }
}
Note that we repeat the simulation block indefinitely, until the acceptance criterion is satisfied, at which point we break the loop by returning the accepted value.

We can now use this to simulate a vector of n such quantities as follows.

rz2<-function (n) 
{
        zvector <- vector("numeric", n)
        for (i in 1:n) {
                zvector[i] <- rz1()
        }
        zvector
}
Of course, this uses explicit looping, and hence will be slow in R. However, if we are a little bit cunning, we can vectorise this procedure too. We do this by simulating vectors of X and Y and keep only those which satisfy the acceptance criterion. Consider the following function.
rz3<-function (n) 
{
        xvec <- runif(n, 0, 1)
        yvec <- runif(n, 0, 3/2)
        fvec <- 6 * xvec * (1 - xvec)
        xvec[yvec < fvec]
}
The last line of the function body selects and returns those elements of the X vector satisfying the acceptance criterion. Consequently, the length of the returned vector will be considerably less than n. How much less will depend on the acceptance probability. Note that the length of the returned vector divided by n is an estimate of the acceptance probability. We can use this estimate in order to get an idea of how many simulations we should start off with in order to end up with a vector of about the length we want. Consider the following function.
rz4<-function (n) 
{
        x <- rz3(n)
        len <- length(x)
        aprob <- len/n
        shortby <- n - len
        n2 <- round(shortby/aprob)
        x2 <- rz3(n2)
        x3 <- c(x, x2)
        x3
}
This function calls rz3 to generate a vector of Zs. aprob is an estimate of the acceptance probability. shortby is how many more observations we would like, and n2 is an estimate of how many observations we need to start will to end up with shortby. The two resulting vectors are joined together and returned. So, this function will return a vector whose length is approximately n. This is good enough for most purposes, and because it is vectorised, is much faster than rz2.

The R source code is available here: rcirc.r

Implementation in C

The generation of a realisation of Z can be implemented as follows.
float genz()
{
  float x,y,fx;
  while (1)
    {
      x=urand(0,1);
      y=urand(0,1.5);
      fx=6.0*x*(1-x);
      if (y<fx)
        {
          break;
        }
     
    }
  return(x);
}

The full C source for this is here: circ.c

Other implementations

Those who are interested may also wish to look at corresponding code for LISP-STAT: circ.lsp,Python: circ.py and Sather: circ.sa.

Darren J Wilkinson Book: Stochastic Modelling for
	      Systems Biology  
darren.wilkinson@ncl.ac.uk
http://www.staff.ncl.ac.uk/d.j.wilkinson/