Prof Darren Wilkinson Professor of Stochastic Modelling

School of Mathematics & Statistics

Newcastle University

Exponential random variables via inverse CDF transformation

We know that if U is a Unif(0,1) random variable, then the random variable X=(-1/p)log(U) is an Exp(p) random variable. Since most computer languages come with a method of generating uniform random numbers, we can use these to generate exponential random quantities.

Implementation in R

R is the interactive language for statistical computing we are most interested in using in this course. In R, the function runif(1) returns a single Unif(0,1) random quantity. We can therefore write a function to generate a single exponential random quantity as follows.

myrexp1<-function (lambda) 
{
        u <- runif(1)
        x <- (-1/lambda) * log(u)
        x
}

The first line says that we are defining a function myrexp1 which has one argument: lambda. A variable, u is defined which is a Unif(0,1) random quantity. <- is the assignment operator in R (and so is _, which is slightly easier to type - you can use either). Next x is defined to be our exponential random variable, and the last line ensures that the function returns the value x. Now for example, typing myrexp1(2) will return a single Exp(2) random quantity.

Now suppose we would like a function which returns a vector of exponential random quantities. We could do this as follows.

myrexp2<-function (n, lambda) 
{
        xvector <- vector("numeric", n)
        for (i in 1:n) {
                xvector[i] <- myrexp1(lambda)
        }
        xvector
}

The first line of the function body creates xvector, a numeric vector of length n in which to store our random quantities. There is then a for loop which sets each element of our vector to be an exponential random quantity, using the function we have already defined. Finally, xvector is returned.

Logically, this is exactly how to create vectors of exponentials, but in R, you would never actually do it this way! The reason is that for loops are very slow in R. This often isn't a problem, because R is a vectorised language, and handles many vector operations very quickly. The way you would actually code it is as follows.

myrexp3<-function (n, lambda) 
{
        uvector <- runif(n)
        xvector <- (-1/lambda) * log(uvector)
        xvector
}

The first line of the function body creates a vector of n uniform random quantities. The next line forms the vector of logs, and then scales the resulting vector accordingly in order to obtain a vector exponentials, which the function then returns.

The two functions myrexp2 and myrexp3 both do the same job, but the second is much faster than the first. We can verify that these functions seem to work using the following R commands.

x <- myrexp3(10000,3)
mean(x)
var(x)
hist(x,30)

This creates a vector, x of 10,000 Exp(3) random quantities and looks at the mean and variance of the quantities and then produces a histogram of the quantities with approximately 30 bars. The mean should be about 1/3, the variance should be about 1/9, and the histogram should show exponential decay. We can do the same with myrexp2, but it will be much slower.

R source code for the functions is available here for downloading in the file: myrexp.r (shift-click to download). It can be loaded into R using the command source("myrexp.r"). If you've saved to drive H:\, you will probably want something like source("h:myrexp.r"). Alternatively, you can just copy-and-paste commands into R from another window (such as a Netscape or Notepad window).

R is a nice friendly interactive environment for developing statistical algorithms. You should always use this in preference to a compiled language for developing, testing and debugging your algorithms. If the algorithm can be vectorised (as here), so that explicit looping is not required, then there will be little to be gained from re-writing the code in a compiled language such as C. However, for the Markov chain algorithms we will develop later in the course, vectorisation will not be possible, and so the R code we develop will be very slow. In this case, once we have got our algorithms working in R, it may be worth re-writing the slow bits in C.

Implementation in C

Although there is nothing to be gained here from re-writing the code in C, it will be useful to get a feel for C with a nice simple example. C has a built in function, rand() which generates a random integer, so the first thing we need is a function to generate a Unif(0,1). We can write this as follows.

float urand()
{
  return( (float) rand()/RAND_MAX );
}

The first line says that we are defining a new function called urand which has no arguments, and which will return a real number (float). The function body simply returns a uniform random integer divided by its largest possible value, giving us a uniform number on (0,1).

We can now define a function which uses this to generate an exponential random quantity.

float genexp(float lambda)
{
  float u,x;
  u=urand();
  x=(-1/lambda)*log(u);
  return(x);
}

This function is has one argument which is a float called lambda. The function starts by declaring two local variables, which are both floats. u is assigned a Unif(0,1) (= is the assignment operator in C), then x is our exponential random quantity, which is returned.

This function can then be used by the main procedure.

void main(int argc, char *argv[])
{
  float e,lambda;
  int i,n;
  n=atoi(argv[1]);
  lambda=atof(argv[2]);
  for (i=0;i<n;i++)
    {
      e=genexp(lambda);
      printf(" %2.4f \n",e);
    }
}

void simply means that it is a procedure rather than a function, which does not return any value. Arguments n and lambda are passed on the command line. Command line argument are stored in argv[1], argv[2],... as strings. atoi converts a string to an integer, and atof converts a string to a float. There is then a for loop which executes n times, generating an exponential and printing it out each time through the loop.

If this code is compiled as an executable called a.out then calling it from the command line as a.out 1000 2 will cause it to print 1,000 Exp(2) random quantities to the screen. These can be redirected to a file and loaded into R as follows.

% a.out 1000 2 > test.dat
% R
> test_scan("test.dat")
> hist(test)

The full C source file with some additional information is available here: exp.c

Other implementations

Those who are interested may also want to look at corresponding code for LISP-STAT: exp-rand.lsp, Python: exp.py and Sather: exp.sa.

Darren J Wilkinson

darren.wilkinson@ncl.ac.uk

http://www.staff.ncl.ac.uk/d.j.wilkinson/

	Professor of Stochastic Modelling
	School of Mathematics & Statistics
	Newcastle University


		`darren.wilkinson@ncl.ac.uk`
		`http://www.staff.ncl.ac.uk/d.j.wilkinson/`