Why the Normal Function?

The Java applet above illustrates the kind of situation that we often see and that turns out to involve normal probability density functions. The "lightning" crossing the screen from left to right is aimed at the center of the brown target, but many little errors are introduced as it crosses the screen. Some of the errors push the lightning upward and some push it downward. The white bars against the black background keep track of how many lightning strikes have hit "each point" on the target. It is best to wait a few minutes and let the applet run through all 1,000 lightning strikes. Notice that the record of strikes looks roughly like a normal probability density function. This is typical of situations in which the error in each trial is the accumulated result of many small errors.
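The same experiment can be imitated away from the applet. The sketch below is not the applet's code. It assumes, purely for illustration, that each bolt receives 100 independent kicks of size one, up or down with equal chance, and that its landing point is the sum of those kicks. The names strike, simulate, and show are made up for this sketch.

```python
import random
from collections import Counter

def strike(steps, kick, rng):
    """Return where one bolt lands: the sum of many small up/down errors."""
    return sum(rng.choice((-kick, kick)) for _ in range(steps))

def simulate(n_strikes=1000, steps=100, seed=0):
    """Simulate n_strikes bolts and count the hits at each landing position."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    return Counter(strike(steps, 1.0, rng) for _ in range(n_strikes))

def show(hits, scale=2):
    """Print a sideways text histogram, one row per landing position."""
    for position in sorted(hits):
        print(f"{position:6.0f} " + "#" * (hits[position] // scale))

if __name__ == "__main__":
    show(simulate())
```

The rows of # marks play the role of the applet's white bars. Changing steps or the seed changes the details but should leave the bell-like outline in place.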

One of the more surprising facts about situations involving chance is that whenever errors are the result of the accumulation of many small independent errors, the situation can be described, at least approximately, by a normal probability density function. This fact is called the Central Limit Theorem. We won't prove this theorem but we will look at some evidence for it. The first evidence is the simulation done by the Java applet above. You will eventually produce additional evidence in your computer algebra system window, but first we need to develop some tools for working with random numbers. We begin with some examples.

Example

Suppose that a bakery sells donuts by the dozen. The weight of each donut is a random number. If

x1, x2, ... x12

are the weights of twelve donuts in a box then the sum

S = x1 + x2 + ... + x12

is another random number -- the weight of all the donuts in the box.

Example

Suppose that the radius of a pizza is a random number, R. Then its area, pi R^2, is another random number.


  1. Suppose that x is a random number with pdf psi. What is the pdf for the random number

    y = -x


  2. Suppose that x is a random number with pdf psi. What is the pdf for the random number

    y = m x

    where m is a constant?


  3. Suppose that x is a random number with pdf psi. What is the pdf for the random number

    y = x + b

    where b is a constant?


  4. Suppose that x is a random number with mean a. What is the mean of the random number

    y = -x


  5. Suppose that x is a random number with mean a. What is the mean of the random number

    y = m x + b

    where m and b are constants?

  6. Suppose that x is a random number with variance V. What is the variance of the random number

    y = m x + b

    where m and b are constants?
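If you want to check your answers to exercises 4 through 6 experimentally, a short simulation can help. The sketch below assumes, only as an illustration, that x is uniform on [-1, 1]; the function names are hypothetical. It reports the sample mean and variance of x and of y = m x + b so that you can compare them with what your formulas predict.

```python
import random
import statistics

def sample_stats(samples):
    """Return (mean, variance) estimated from a list of samples."""
    return statistics.fmean(samples), statistics.pvariance(samples)

def compare_linear_transform(m, b, n=20_000, seed=1):
    """Draw x uniformly from [-1, 1] (an assumed pdf), form y = m*x + b,
    and return the sample statistics of x and of y."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [m * x + b for x in xs]
    return sample_stats(xs), sample_stats(ys)

if __name__ == "__main__":
    (mean_x, var_x), (mean_y, var_y) = compare_linear_transform(m=3.0, b=2.0)
    print(f"x: mean {mean_x:.3f}, variance {var_x:.3f}")
    print(f"y: mean {mean_y:.3f}, variance {var_y:.3f}")
```

Try several values of m and b, including negative m and b = 0, and see whether the printed numbers agree with your answers.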


We will eventually be interested in the sum of many small random numbers and we begin by looking at the sum of two random numbers. Suppose that x1 and x2 are two random numbers whose pdfs are

\psi_1 \quad \text{and} \quad \psi_2

respectively. We are interested in the sum of independent random errors -- the kinds of errors that come from unrelated sources. So we assume that x1 and x2 are independent. We saw in the module Integration and Probability, II that the pdf for the random point

(x_1, x_2)

is

\psi_1(x_1)\,\psi_2(x_2)

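Therefore the probability that the random number y = x1 + x2 is between a and b -- that is, that

a \le x_1 + x_2 \le b

is

P(a \le y \le b) = \iint_D \psi_1(x_1)\,\psi_2(x_2)\,dx_2\,dx_1

where

D = \{ (x_1, x_2) : a \le x_1 + x_2 \le b \}

So

\begin{aligned}
P(a \le y \le b) &= \iint_D \psi_1(x_1)\,\psi_2(x_2)\,dx_2\,dx_1 \\
&= \int_{-\infty}^{\infty} \int_{a - x_1}^{b - x_1} \psi_1(x_1)\,\psi_2(x_2)\,dx_2\,dx_1 \\
&= \int_{-\infty}^{\infty} \int_a^b \psi_1(s)\,\psi_2(y - s)\,dy\,ds \\
&= \int_a^b \left[ \int_{-\infty}^{\infty} \psi_1(s)\,\psi_2(y - s)\,ds \right] dy
\end{aligned}

where the third line comes from the substitution

s = x_1, \qquad y = x_1 + x_2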

We have just proved the following theorem.

Theorem

Suppose that x1 and x2 are two independent random numbers with pdfs

\psi_1 \quad \text{and} \quad \psi_2

The pdf for y = x1 + x2 is

\psi(y) = \int_{-\infty}^{\infty} \psi_1(s)\,\psi_2(y - s)\,ds

The function psi defined in this way is so important and comes up in so many different situations that it has a name of its own -- the convolution of

\psi_1 \quad \text{and} \quad \psi_2

and is written

\psi = \psi_1 * \psi_2
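To get a feel for the convolution numerically, here is a minimal sketch that approximates the integral of psi1(s) psi2(y - s) over all s with the midpoint rule. The integration range [-10, 10], the number of steps, and the test pdf (uniform on [0, 1]) are assumptions chosen for illustration, and the function names are hypothetical. The range must be wide enough that psi1 is zero, or negligibly small, outside it.

```python
def convolve(psi1, psi2, y, lo=-10.0, hi=10.0, n=20_000):
    """Approximate (psi1 * psi2)(y), the integral of psi1(s) * psi2(y - s) ds,
    with the midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        s = lo + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += psi1(s) * psi2(y - s)
    return total * h

def uniform01(x):
    """pdf of a random number spread uniformly over [0, 1] (illustrative choice)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

if __name__ == "__main__":
    for y in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(y, round(convolve(uniform01, uniform01, y), 3))
```

Because convolve takes the two pdfs as ordinary functions, any pdf that vanishes outside the chosen range can be substituted for uniform01.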

The following example can help us get some feeling for all this.

Example

Suppose that x1 and x2 are independent random numbers between -1 and +1 and suppose their pdfs are uniform -- that is,

\psi_1(x) = \psi_2(x) = \begin{cases} 1/2 & \text{if } -1 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}

We want to look at the pdf for y = x1 + x2.

\psi(y) = \int_{-\infty}^{\infty} \psi_1(s)\,\psi_2(y - s)\,ds

First, notice that because x1 and x2 are between -1 and +1, y will always be between -2 and +2.

The movie below can help us understand the convolution.

Missing movie

  • The blue graph at the top of the movie is the function psi1(s). If these graphs were drawn to scale the height of the blue graph would be 1/2

  • The yellow graph in the middle is the function psi2(y-s). Notice, for example, that when y = 1, s must be between 0 and 2 for this function to be nonzero. If these graphs were drawn to scale the height of the yellow graph would be 1/2.

  • The green graph is the product psi1(s)psi2(y-s). This is the function we integrate to compute psi(y) -- in other words psi(y) is the area of the green rectangle. If these graphs were drawn to scale the height of the green graph would be 1/4


  1. Based on the movie above find

    \psi(y)

    where

    \psi = \psi_1 * \psi_2

  2. Find the same pdf by evaluating the integral.
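One way to check your answers to the two exercises above is to estimate the pdf of y = x1 + x2 by simulation. The sketch below is only a rough check: it draws many pairs of independent uniform random numbers on [-1, 1], counts how often the sum falls in a narrow window around y, and divides by the window width. The window width, sample size, and function name are assumptions made for this sketch.

```python
import random

def estimate_density(y, width=0.1, n=200_000, seed=2):
    """Estimate the pdf of x1 + x2 at y as
    (fraction of simulated sums within width/2 of y) / width."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        total = rng.uniform(-1.0, 1.0) + rng.uniform(-1.0, 1.0)
        if abs(total - y) <= width / 2:
            hits += 1
    return hits / (n * width)

if __name__ == "__main__":
    for y in (-1.5, -1.0, 0.0, 1.0, 1.5):
        print(f"y = {y:+.1f}: estimated density {estimate_density(y):.3f}")
```

The estimates are noisy and slightly blurred by the window, so expect agreement with your formula to about two decimal places rather than exactly.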


The book Multivariable Calculus, Linear Algebra, and Differential Equations in a Real and Complex World contains a proof of the following theorem.

The proof is also available as an Adobe Acrobat PDF file. The proof is simplified by assuming that both pdfs have mean zero and variance one, which makes it notationally easier.

Theorem

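Suppose that x1 and x2 are independent random numbers with normal pdfs

\psi_1(x) = \frac{1}{\sigma_1 \sqrt{2\pi}}\, e^{-(x - \mu_1)^2 / (2\sigma_1^2)} \quad \text{and} \quad \psi_2(x) = \frac{1}{\sigma_2 \sqrt{2\pi}}\, e^{-(x - \mu_2)^2 / (2\sigma_2^2)}

respectively. Then the pdf for y = x1 + x2 is the normal pdf

\psi(y) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-(y - \mu)^2 / (2\sigma^2)}

where

\mu = \mu_1 + \mu_2, \qquad \sigma^2 = \sigma_1^2 + \sigma_2^2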
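The theorem can be spot-checked numerically. The sketch below convolves two normal pdfs with the trapezoid rule and prints the result next to the normal pdf whose mean is the sum of the two means and whose variance is the sum of the two variances. The particular means and standard deviations, the integration range of ten standard deviations, and the function names are all assumptions made for illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Normal probability density with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def convolve_normals(y, mean1, sd1, mean2, sd2, n=4000):
    """Trapezoid-rule approximation of the integral of pdf1(s) * pdf2(y - s) ds.
    The range is mean1 +/- 10*sd1, outside of which pdf1 is negligible."""
    lo, hi = mean1 - 10.0 * sd1, mean1 + 10.0 * sd1
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        s = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0   # endpoints count half
        total += weight * normal_pdf(s, mean1, sd1) * normal_pdf(y - s, mean2, sd2)
    return total * h

if __name__ == "__main__":
    mean1, sd1, mean2, sd2 = 1.0, 1.0, -2.0, 2.0
    mean, sd = mean1 + mean2, math.sqrt(sd1 ** 2 + sd2 ** 2)
    for y in (-4.0, -1.0, 0.0, 3.0):
        print(y, convolve_normals(y, mean1, sd1, mean2, sd2), normal_pdf(y, mean, sd))
```

The two printed columns should agree to many decimal places. This is a check at a few points, not a proof.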

We are now in a position to look at some of the evidence that the normal probability density functions are the right probability density functions to use in situations where random errors result from the accumulation of many independent small random errors. The theorem above supports this claim in two ways.

Next we want to look at some experimental evidence supporting this same claim. The following theorem will be useful as we look at this evidence. Its proof can be found in Multivariable Calculus, Linear Algebra, and Differential Equations in a Real and Complex World.

The proof is also available as an Adobe Acrobat PDF file.

Theorem

If x and y are independent random numbers and

z = x + y

then the variance of z is the sum of the variance of x and the variance of y, and the mean of z is the sum of the mean of x and the mean of y.
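A quick simulation makes this theorem plausible. The sketch below assumes, only for illustration, that x is uniform on [0, 1] and y is normal with mean 2 and standard deviation 3, drawn independently; the function names are hypothetical. It returns the sample mean and variance of x, of y, and of z = x + y.

```python
import random
import statistics

def summarize(samples):
    """Return (sample mean, sample variance) of a list of numbers."""
    return statistics.fmean(samples), statistics.pvariance(samples)

def check_sum(n=20_000, seed=3):
    """Draw independent x and y, form z = x + y, and summarize all three."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]   # assumed pdf for x
    ys = [rng.gauss(2.0, 3.0) for _ in range(n)]     # assumed pdf for y
    zs = [x + y for x, y in zip(xs, ys)]
    return summarize(xs), summarize(ys), summarize(zs)

if __name__ == "__main__":
    for name, (mean, var) in zip("xyz", check_sum()):
        print(f"{name}: mean {mean:.3f}, variance {var:.3f}")
```

The sample means add exactly. The sample variances add only approximately, because a finite sample of independent numbers still shows a small chance correlation.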

You can now look at lots of experimental evidence in support of the claim that normal probability density functions describe random numbers that are the sum of many small independent random numbers. We begin with an example.

Example

Consider a random number x that can be described by the pdf

Missing equation Missing figure

We want to add n independent random numbers like x -- that is, they all have the same pdf but they are chosen independently and then summed. We will compare the pdfs of these sums with normal probability density functions. If you haven't already opened your CAS window do so now.

The kinds of integrals that come up in this situation can be difficult for the standard numerical integration routines used by computer algebra systems like Maple and Mathematica. For that reason your CAS window contains some special purpose procedures for working with sums of random numbers. Even these special purpose routines can be slow.

One way of speeding things up is to start with a pdf psi for a random number. Next we compute the convolution psi2 = psi * psi which is the pdf for the sum of two independent random numbers. Then we compute psi4 = psi2 * psi2 which is the pdf for the sum of four independent random numbers. Then we compute psi8 = psi4 * psi4 which is the pdf for the sum of eight independent random numbers.
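The doubling idea can be sketched outside the CAS as follows. This is not the special purpose procedure supplied in your CAS window. It assumes, for illustration, that psi is the uniform pdf on [-1, 1], stores each pdf as values on an evenly spaced grid, and replaces the convolution integral by a sum. All function names are hypothetical. After each doubling it measures the largest gap between the computed pdf and the normal pdf with the same mean and variance.

```python
import math

def uniform_grid(lo=-1.0, hi=1.0, cells=100):
    """Uniform pdf on [lo, hi] sampled at cell midpoints: (first_x, h, values)."""
    h = (hi - lo) / cells
    return lo + 0.5 * h, h, [1.0 / (hi - lo)] * cells

def self_convolve(pdf):
    """Grid version of psi * psi: value k is h * (sum over i of psi[i] * psi[k - i])."""
    first_x, h, values = pdf
    out = [0.0] * (2 * len(values) - 1)
    for i, a in enumerate(values):
        for j, b in enumerate(values):
            out[i + j] += a * b
    return 2.0 * first_x, h, [h * v for v in out]

def normal_pdf(x, mean, variance):
    """Normal probability density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)

def max_gap_to_normal(pdf, mean, variance):
    """Largest absolute difference between the grid pdf and the normal pdf."""
    first_x, h, values = pdf
    return max(abs(v - normal_pdf(first_x + k * h, mean, variance))
               for k, v in enumerate(values))

def doubling(psi, rounds=3):
    """Return [psi2, psi4, psi8, ...] by convolving each result with itself."""
    results = []
    for _ in range(rounds):
        psi = self_convolve(psi)
        results.append(psi)
    return results

if __name__ == "__main__":
    base_variance = 1.0 / 3.0   # variance of the uniform pdf on [-1, 1]
    for count, pdf in zip((2, 4, 8), doubling(uniform_grid())):
        print(count, max_gap_to_normal(pdf, 0.0, count * base_variance))
```

Three convolutions give the pdf for a sum of eight random numbers, where adding one random number at a time would take seven. The variance passed to the comparison is the number of summands times the variance of a single one, since variances of independent random numbers add.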

Look at the example in your CAS window. Notice that the pdf for the sum of eight, or even four, random numbers already looks like a normal pdf. Next work out the example above and notice the same thing. Finally, try some examples of your own.

You will be able to see visually that the sum of a large number of independent random numbers is a random number whose pdf appears to be a normal function. You can check this more precisely using the work we did above to find the mean and variance of your sums of independent random numbers and then comparing the actual pdf with the normal pdf having the appropriate mean and variance.


Copyright © 1997 by Frank Wattenberg, Department of Mathematics, Montana State University, Bozeman, MT 59717