Into the Exponential Distribution

Age 16 to 18

Get into the exponential distribution through an exploration of its pdf.

Problem

This chart shows the probability density functions for two exponential distributions. Here are some questions to consider concerning these sorts of distributions. You can use a mixture of probability-based reasoning and algebraic reasoning based on the properties of integrals and areas.

1. Which exponential distributions do these curves correspond to? What interesting mathematical properties do they have? Which has the larger mean?

2. Two separate areas are enclosed by the red and blue curves: one to the left of the point of intersection and one to the right. What can you say about the sizes of these two areas? Can you give a good explanation?

3. Can you find the point of intersection of the pdfs? Can you find the area of each of the enclosed parts of the diagram?

4. From the graphs, I visually estimated that $P(0.5 < Red < 0.7) = 0.125$. Can you work out how I made this estimate? Is it an overestimate or an underestimate? How close do you think that estimate is to the correct answer? Work out the correct answer to find out.

5. Try to make, assess and check your own estimates similar to that in question 4.

Student Solutions

Part 1:

The exponential distribution describes the time between independent events which occur continuously at a constant average rate. The probability density function (PDF) of an exponential distribution is given by $f(x) = \lambda e^{-\lambda x}$. This is defined for $x \geq 0$, where $\lambda > 0$ is the parameter of the distribution.

We first note that the larger the value of $\lambda$, the higher the PDF starts at $x = 0$ and the more steeply it falls away from there. The red curve starts higher and falls faster, so the parameter of the red curve, $\lambda_{Red}$, is greater than the parameter of the blue curve, $\lambda_{Blue}$.

To find the value of $\lambda$ for each curve, we can read off the height of the PDF at $x = 0$, since $f(0) = \lambda e^{0} = \lambda$.

At $x = 0$ on the red curve, we can see that $f(0) = 2$, so

$\lambda e^{0} = \lambda = 2$

$f(x) = 2e^{-2x}$

And at $x = 0$ on the blue curve, we can see that $f(0) = 1$, so

$\lambda e^{0} = \lambda = 1$

$f(x) = e^{-x}$

Thus $\lambda_{Red}=2$ and $\lambda_{Blue}=1$, and $ \lambda_{Red}> \lambda_{Blue} $ as expected.

To find the mean of the exponential distribution we use the formula $$ \bar{x}=\int_0^{\infty} xf(x) \,dx $$

This gives $\bar{x}=\frac{1}{\lambda}$. So the mean is larger for smaller values of $\lambda$, which implies the blue curve has the larger mean.
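
The step from the formula to $\frac{1}{\lambda}$ can be filled in with integration by parts. The following is one possible working, taking $u = x$ and $\frac{dv}{dx} = \lambda e^{-\lambda x}$ (a choice made here for illustration, not taken from the submitted solution):

$$ \int_0^{\infty} x \lambda e^{-\lambda x} \,dx = \left[ -x e^{-\lambda x} \right]_0^{\infty} + \int_0^{\infty} e^{-\lambda x} \,dx = 0 + \left[ -\frac{1}{\lambda} e^{-\lambda x} \right]_0^{\infty} = \frac{1}{\lambda} $$

For the two curves here, this gives a mean of $\frac{1}{2}$ for the red curve and $1$ for the blue curve.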

The parameter $\lambda$ is sometimes called the rate parameter, because it is the constant average rate at which the events occur. Thus we can interpret the mean in terms of the rate parameter. For example, consider our variable to be the waiting time for a bus to arrive. If buses arrive on average four times every hour, then $\lambda = 4$ per hour and we expect to wait $\frac{1}{4}$ of an hour, or 15 minutes, for a bus.

Interestingly, the exponential distribution is the only continuous memoryless distribution. How would we show this? First try a simple exercise and see if you can confirm that the red curve is memoryless by estimating the probabilities using the area under the curve. Then consider the general statement, for $x, y \geq 0$: $$ Pr(Z\geq x+y\mid Z\geq x)=Pr(Z\geq y) $$
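
The simple exercise suggested above can be sketched numerically. The snippet below is a rough check rather than a proof: it estimates tail areas under the red curve with a midpoint rule, and the helper names (`area_under`, `tail_probability`), the test values of `x` and `y`, and the cut-off at 20 are all assumptions made for this illustration.

```python
import math

def red_pdf(x):
    """PDF of the red curve, f(x) = 2e^{-2x}."""
    return 2.0 * math.exp(-2.0 * x)

def area_under(f, a, b, strips=100_000):
    """Midpoint-rule estimate of the area under f between a and b."""
    width = (b - a) / strips
    return width * sum(f(a + (i + 0.5) * width) for i in range(strips))

def tail_probability(x, upper=20.0):
    """Estimate P(Z >= x) as the area under the red curve from x to `upper`.

    The area beyond 20 is about e^{-40}, so cutting the tail off there
    (an assumption of this sketch) has a negligible effect.
    """
    return area_under(red_pdf, x, upper)

# Arbitrary illustrative values; any x, y >= 0 could be tried instead.
x, y = 0.3, 0.5

conditional = tail_probability(x + y) / tail_probability(x)  # P(Z >= x+y | Z >= x)
unconditional = tail_probability(y)                          # P(Z >= y)

print(f"P(Z >= x+y | Z >= x) is about {conditional:.6f}")
print(f"P(Z >= y)            is about {unconditional:.6f}")
```

The two printed values should agree closely, which is what memorylessness predicts. Trying other values of `x` and `y` is a useful way to build confidence before attempting the algebraic argument.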

Part 2:

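The two separate areas enclosed between the red and blue curves are of equal magnitude. The total area under any probability density function is the total probability, which must equal 1. Define the common area lying below both the blue and red curves as A. The red curve lies above the blue curve only to the left of the point of intersection, so the area below the red curve is made up of A and the left-hand enclosed region. Similarly, the area below the blue curve is made up of A and the right-hand enclosed region. It can then be seen that:

Area(left region, red above blue) = Area(below red curve) - A = 1 - A

Area(right region, blue above red) = Area(below blue curve) - A = 1 - A

Hence the two areas are equal. The calculation in Part 3 shows that each is equal to 0.25.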

Part 3:

To find the point of intersection we can equate the two PDFs and solve for x.

$2e^{-2x} = e^{-x}$

Dividing both sides by $e^{-2x}$ gives $e^{x} = 2$, so $x = \ln 2$ and $f(\ln 2) = 0.5$.

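Area enclosed between red and blue (left of the intersection) = $\int_0^{\ln 2} \left( 2e^{-2x} - e^{-x} \right) dx = \left[ -e^{-2x} + e^{-x} \right]_0^{\ln 2} = \frac{-1}{4} + \frac{1}{2} = \frac{1}{4}$

Area enclosed between blue and red (right of the intersection) = $\int_{\ln 2}^{\infty} \left( e^{-x} - 2e^{-2x} \right) dx = \left[ -e^{-x} + e^{-2x} \right]_{\ln 2}^{\infty} = \frac{1}{2} - \frac{1}{4} = \frac{1}{4}$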
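
As a cross-check on the two integrals, the enclosed areas can also be estimated numerically. This is only a sketch under stated assumptions: the midpoint-rule helper `area_under`, the number of strips, and the value 40 used as a stand-in for infinity are all choices made for this illustration.

```python
import math

def red_pdf(x):
    """PDF of the red curve, f(x) = 2e^{-2x}."""
    return 2.0 * math.exp(-2.0 * x)

def blue_pdf(x):
    """PDF of the blue curve, f(x) = e^{-x}."""
    return math.exp(-x)

def area_under(f, a, b, strips=200_000):
    """Midpoint-rule estimate of the area under f between a and b."""
    width = (b - a) / strips
    return width * sum(f(a + (i + 0.5) * width) for i in range(strips))

crossing = math.log(2)  # x-coordinate of the point of intersection
upper = 40.0            # assumed stand-in for infinity; the tail beyond is negligible

# Left region: red lies above blue between 0 and ln 2.
left_area = area_under(lambda x: red_pdf(x) - blue_pdf(x), 0.0, crossing)
# Right region: blue lies above red from ln 2 onwards.
right_area = area_under(lambda x: blue_pdf(x) - red_pdf(x), crossing, upper)

print(f"red and blue pdfs at x = ln 2: {red_pdf(crossing):.6f}, {blue_pdf(crossing):.6f}")
print(f"left area  is about {left_area:.6f}")
print(f"right area is about {right_area:.6f}")
```

Both estimates should come out very close to 0.25, in agreement with the exact integrals and with the argument in Part 2.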

Part 4:

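$P(0.5 < Red < 0.7)$ can be estimated by the area of a trapezium whose parallel sides are the heights of the red curve at $x = 0.5$ and $x = 0.7$, and whose width is $h = 0.2$.

$$Area(trapezium)={1\over 2} (a+b) h = {1\over 2} \left( 2e^{-1} +2e^{-1.4} \right) (0.2) = 0.122895281...$$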

Since $f(x)$ is convex, the straight top edge of the trapezium lies above the curve, so this is an overestimate of the probability. We can achieve a closer estimate by splitting the area up into a series of narrower trapeziums and summing their areas. The more trapeziums we divide the area into, the more accurate the estimate becomes.

In the limit, as the number of trapeziums tends to infinity, the sum of their areas becomes an integral, which gives the exact probability: $$P(0.5 < Red < 0.7) = \int_{0.5}^{0.7} 2e^{-2x}dx = e^{-1} -e^{-1.4} \approx 0.12128$$
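
The effect of using more trapeziums can be seen in a short sketch. The function name `trapezium_estimate` and the particular strip counts are assumptions chosen for illustration; the rule itself is the standard composite trapezium rule.

```python
import math

def red_pdf(x):
    """PDF of the red curve, f(x) = 2e^{-2x}."""
    return 2.0 * math.exp(-2.0 * x)

def trapezium_estimate(f, a, b, strips):
    """Composite trapezium rule using `strips` trapeziums of equal width."""
    width = (b - a) / strips
    interior = sum(f(a + i * width) for i in range(1, strips))
    return width * (0.5 * f(a) + interior + 0.5 * f(b))

# Exact probability from the integral: e^{-1} - e^{-1.4}.
exact = math.exp(-1.0) - math.exp(-1.4)

strip_counts = (1, 2, 4, 8, 16)
estimates = [trapezium_estimate(red_pdf, 0.5, 0.7, n) for n in strip_counts]

for n, estimate in zip(strip_counts, estimates):
    # A positive error means the estimate is an overestimate.
    print(f"{n:2d} strips: estimate {estimate:.6f}, error {estimate - exact:+.6f}")
```

With one strip this reproduces the single-trapezium estimate. Every estimate stays above the exact value, as expected for a convex curve, and the error shrinks as the number of strips grows.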
