What's your mean?

Can you work out the means of these distributions using numerical methods?

Age

16 to 18

Problem

The probability density functions for two related, but unknown, distributions are given in the following accurately plotted chart.

It is known that the means of the distributions are whole numbers, and that each of the two PDFs has only a single turning point.

By numerically estimating the required integrals, what can you deduce with certainty about the two means?

Student Solutions

Probability Density Functions

The probability density function, or PDF, is a function which describes the probability of a random variable taking on certain values. For a continuous random variable, the probability that the variable lies between two values is given by the integral of the density function between these values.

We know that the sum of the probabilities of all possible outcomes is 1. So the integral of the PDF over all possible values of the variable is equal to 1.

We also need to know how to calculate the mean of the variable from the PDF. We first recall the definition of the mean of a discrete random variable: $ \bar{x}=\sum x \,\Pr(X=x) $

For a continuous variable, the sum becomes an integral and the probability of each value is replaced by $f(x)\,dx$, so the equation for the mean becomes $$ \bar{x}=\int xf(x) \,dx $$ where $f(x)$ is the density function.
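As a rough illustration of this formula, the sketch below approximates $\int xf(x)\,dx$ by splitting the range into thin strips and adding up midpoint times probability for each strip. The density $f(x)=2x$ on $[0,1]$ and the names `mean_from_pdf` and `example_pdf` are assumptions chosen for the example; they are not taken from the problem.

```python
def mean_from_pdf(pdf, lo, hi, n=1000):
    """Approximate the integral of x * pdf(x) over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * width   # midpoint of the i-th strip
        total += x * pdf(x) * width  # value times approximate probability of the strip
    return total


def example_pdf(x):
    # Hypothetical density f(x) = 2x on [0, 1]; its integral over [0, 1] is 1.
    return 2 * x


estimate = mean_from_pdf(example_pdf, 0.0, 1.0)
print(round(estimate, 4))  # the exact mean of this density is 2/3
```

For this density the integral can be done by hand, which makes it a convenient check before applying the same idea to a curve known only from a chart.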

Integrating using area approximation

We are now ready to find the means of our two PDFs. However, because we do not know their exact form, we will have to approximate the integrals numerically.

For example, consider a variable with the density function shown below.

We wish to calculate $ \Pr(1/2 \leq X \leq 3/4) $, which we can find by calculating the area of the shaded rectangle:

$$ \Pr(1/2 \leq X \leq 3/4) =\int^{3/4}_{1/2} 1 \,dx= \text{base} \times \text{height} = \left({3\over 4} - {1\over 2}\right) \times 1 = {1\over 4} $$
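The same rectangle calculation can be sketched in code. This assumes the density in the example is uniform with height 1 on $[0,1]$, which is what the working above suggests; the function names are illustrative only. Exact fractions are used so that the result is not blurred by rounding.

```python
from fractions import Fraction


def uniform_pdf(x):
    # Assumed density for the example: height 1 on [0, 1], 0 elsewhere.
    return 1 if 0 <= x <= 1 else 0


def rectangle_probability(pdf, lo, hi, strips=4):
    """Add up the areas of `strips` equal-width rectangles, each as tall as the pdf at its midpoint."""
    width = (hi - lo) / strips
    return sum(pdf(lo + (i + Fraction(1, 2)) * width) * width for i in range(strips))


p = rectangle_probability(uniform_pdf, Fraction(1, 2), Fraction(3, 4))
print(p)  # 1/4
```

Because the density is flat, any number of strips gives the same answer; for a curved density the estimate would improve as the strips get thinner.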

Red Line Mean

Applying the same idea to the red line in the problem, we can estimate the area under the curve using rectangles and trapeziums. Two such trapeziums are marked below in green.

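To find the area of a trapezium, we use the result $ \text{Area} = {h \times (a+b) \over 2} $, where $a$ and $b$ are the lengths of the parallel sides (here, the heights of the PDF at the two ends of the interval) and $h$ is the distance between them (the width of the interval).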
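As a small sketch of this formula, the function below computes one trapezium area. The readings used ($h=1$, $a=0.035$, $b=0.025$) are the ones listed in the table further down for the strip from 10 to 11; the function name is an assumption made for the example.

```python
def trapezium_area(h, a, b):
    """Area of a trapezium with parallel sides a and b a distance h apart."""
    return h * (a + b) / 2


area = trapezium_area(1, 0.035, 0.025)
print(round(area, 4))  # 0.03
```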

This gives us the probability that our variable lies within the interval covered by a small trapezium of height $h=1$. To find that interval's contribution to the mean, we multiply this probability by the value of the variable in the interval. The variable is not constant across the interval, so we approximate it by the midpoint of the interval.

Take for example the above trapezium on the right, where the variable ranges from 10 to 11. We approximate by taking the value of the variable as 10.5, and multiply this by the probability of the region to get its contribution to the mean. The table below gives our estimates of these values.

h	a	b	Area	Midpoint	Contribution to mean
0.5	0	0.01	0.005	0.75	0.00375
1	0.015	0.1	0.0575	1.5	0.08625
1	0.1	0.15	0.125	2.5	0.3125
1	0.15	0.155	0.1525	3.5	0.53375
1	0.155	0.135	0.145	4.5	0.6525
1	0.135	0.12	0.1275	5.5	0.70125
1	0.12	0.085	0.1025	6.5	0.66625
1	0.085	0.06	0.0725	7.5	0.54375
1	0.06	0.045	0.0525	8.5	0.44625
1	0.045	0.035	0.04	9.5	0.38
1	0.035	0.025	0.03	10.5	0.315
1	0.025	0.02	0.0225	11.5	0.25875
1	0.02	0.015	0.0175	12.5	0.21875
1	0.015	0.01	0.0125	13.5	0.16875
1	0.01	0.01	0.01	14.5	0.145
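The right-hand column can be reproduced with a few lines of code. This is only a sketch: it takes the Area and Midpoint columns of the table as given, and the variable names are assumptions made for the example. It also adds up the areas, which is a useful check because the total probability should come out close to 1 once the tail beyond 15 is included.

```python
# (area, midpoint) pairs read from the Area and Midpoint columns of the table.
# The third strip uses 0.125 = (0.1 + 0.15) / 2, the area consistent with
# its right-hand entry of 0.3125.
strips = [
    (0.005, 0.75), (0.0575, 1.5), (0.125, 2.5), (0.1525, 3.5), (0.145, 4.5),
    (0.1275, 5.5), (0.1025, 6.5), (0.0725, 7.5), (0.0525, 8.5), (0.04, 9.5),
    (0.03, 10.5), (0.0225, 11.5), (0.0175, 12.5), (0.0125, 13.5), (0.01, 14.5),
]

total_probability = sum(area for area, _ in strips)
partial_mean = sum(area * midpoint for area, midpoint in strips)

print(f"{total_probability:.4f}")  # 0.9725
print(f"{partial_mean:.4f}")       # 5.4325
```

The areas sum to 0.9725, so roughly 0.03 of the probability is still unaccounted for in the region beyond 15.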

The contributions in the right-hand column sum to 5.4325. This is not yet close to a whole number, but so far we have ignored the region from 15 to 20. Because the question tells us the mean is an integer, we should also estimate the contribution from this region.

As the probabilities in this range are so low, it is easier to approximate the area as a very flat rectangle. Remembering that the area under the PDF is the same as the probability of the variable being in that region, we find $$ \Pr(15 \leq X \leq 20) \approx 5 \times 0.005=0.025 $$ Again we use the midpoint approximation, and find that this region contributes approximately $$ 17.5 \times 0.025 = 0.4375 $$

Adding this to the total from the table gives us $ \bar{x} \approx 5.4325 + 0.4375 = 5.87 \approx 6 $. Since the mean is known to be a whole number, we conclude that it is 6.
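One way to see how much the answer depends on the midpoint choice for the tail is to try the two extremes instead: placing all of the tail probability at 15, or all of it at 20. The sketch below does this with the figures quoted above. It varies only the tail, so it says nothing about errors in reading the chart for the other strips, and the variable names are assumptions made for the example.

```python
body = 5.4325                 # sum of the contributions from the table (0.5 to 15)
tail_probability = 5 * 0.005  # flat rectangle over 15 to 20

low = body + 15 * tail_probability    # all tail probability placed at 15
mid = body + 17.5 * tail_probability  # midpoint approximation used above
high = body + 20 * tail_probability   # all tail probability placed at 20

print(f"{low:.4f} {mid:.4f} {high:.4f}")    # 5.8075 5.8700 5.9325
print(round(low), round(mid), round(high))  # 6 6 6
```

All three totals round to 6, which suggests the conclusion does not hinge on exactly where in the tail the probability is placed.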

We leave the grey line for you to compute. You might want to find an even closer estimate of the mean, and then find the relationship between the two PDFs.
