Into the normal distribution
Investigate the normal distribution
Image
This chart accurately shows the probability density functions for three normal distributions.
I estimate that the probability that a variable drawn from the blue distribution is negative is $0.25$. Can you suggest how I made this estimate? Is it likely to be an over- or under-estimate?
How about the probabilities that random variables drawn from the other distributions are negative?
My friend says that the means and variances of these normal distributions are whole numbers. Can you use your estimates to work out the values for these means and variances, using just tables and estimates? Can you specify the values exactly, or can you only specify a certain range of possibilities?
You might like to recall that the pdf of an $N(\mu, \sigma^2)$ random variable is
$$
f(x) =\frac{1}{\sqrt{2\pi\sigma ^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}
$$
and that the probability of a value falling between $a$ and $b$ is
$$
P(a< x< b) = \int^b_a f(y)dy
$$
Don't forget that the integral is the area under the curve between two points.
Don't forget that you can look up the cumulative probabilities for a normal distribution using tables.
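If you would like to check an area numerically rather than by eye, the following is a minimal sketch of the idea that the probability is the area under the pdf. It assumes a simple trapezium rule; the function names `normal_pdf` and `area_under_pdf` and the step count are illustrative choices, not part of the original problem.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of an N(mu, sigma^2) random variable at x."""
    coefficient = 1.0 / (sigma * math.sqrt(2 * math.pi))
    return coefficient * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def area_under_pdf(a, b, mu, sigma, steps=1000):
    """Trapezium-rule approximation to P(a < X < b).

    'steps' is an assumed default; more steps give a finer approximation.
    """
    h = (b - a) / steps
    # End points carry half weight, interior points full weight.
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

# Standard normal between -1 and 1; tables give roughly 0.6827.
print(round(area_under_pdf(-1, 1, 0, 1), 4))
```

The result should agree closely with the value you would find from standard normal tables for the same interval.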
We can estimate the probability that a random variable takes a negative value by evaluating the area under the curve in the region of negative $x$. Approximating this area using a triangle will give us an over-estimate of the actual probability.
Blue Curve: $\Pr(X< 0) \approx 0.5 \times 2 \times 0.25 = 0.25$
Red Curve: $\Pr(X< 0) \approx 0.5 \times 2 \times 0.1 = 0.1$
Black Curve: $\Pr(X< 0) \approx 0.5 \times 2 \times 0.05 = 0.05$
A normal distribution is symmetric about its mean. This allows us to estimate the mean of each distribution by inspection:
$\mu_{Blue}$ = 1
$\mu_{Red}$ = 2
$\mu_{Black}$ = 3
We know that f(x) = $\frac{1}{ \sigma \sqrt{2 \pi}} e^{\frac{-(x-\mu)^2}{2 \sigma ^2}}$
If we evaluate f(x) at x = $\mu$ the exponential will disappear ($e^0 = 1$)
We can then solve for $\sigma$
$f(\mu) = \frac{1}{ \sigma \sqrt{2 \pi}} $
$\sigma =\frac{1}{\sqrt{2 \pi} f(\mu)}$
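The rearrangement above can be sketched in a few lines of code. The peak height used here (0.4) is a hypothetical reading from a chart, chosen only to illustrate the calculation; the function name `sigma_from_peak` is likewise an assumption.

```python
import math

def sigma_from_peak(peak_height):
    """Standard deviation implied by the pdf's height at its mean,
    using sigma = 1 / (sqrt(2*pi) * f(mu))."""
    return 1.0 / (math.sqrt(2 * math.pi) * peak_height)

# Hypothetical reading of f(mu) from a chart (an assumed value).
peak_reading = 0.4

sigma = sigma_from_peak(peak_reading)
variance = sigma ** 2

# If the variance is known to be a whole number, round to the nearest one.
print(sigma, variance, round(variance))
```

Because the reading from a chart is only approximate, the computed variance will rarely be an exact integer; knowing that it must be a whole number is what lets us round with some confidence.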
Reading $f(\mu)$ off each curve and substituting it into the expression for $\sigma$, we find that:
$\mu_{Blue}$ = 1, $\sigma^2_{Blue} = 1$
$\mu_{Red}$ = 2, $\sigma^2_{Red} = 2$
$\mu_{Black}$ = 3, $\sigma^2_{Black} = 3$
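With these values we can check the triangle estimates against the probabilities that tables would give. The sketch below standardises each variable, $Z = (X - \mu)/\sigma$, and uses the error function from Python's `math` module in place of a printed table; the helper names `std_normal_cdf` and `prob_negative` are illustrative assumptions.

```python
import math

def std_normal_cdf(z):
    """Cumulative probability P(Z < z) for a standard normal variable."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def prob_negative(mu, variance):
    """P(X < 0) for X ~ N(mu, variance), found by standardising."""
    z = (0 - mu) / math.sqrt(variance)
    return std_normal_cdf(z)

# (mean, variance, triangle estimate) for each curve, as found above.
curves = {
    "Blue": (1, 1, 0.25),
    "Red": (2, 2, 0.1),
    "Black": (3, 3, 0.05),
}

for name, (mu, variance, estimate) in curves.items():
    exact = prob_negative(mu, variance)
    print(f"{name}: estimate {estimate}, table value {exact:.4f}")
```

In each case the table value comes out below the triangle estimate, which is consistent with the estimates being over-estimates.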
Why do this problem?
This problem is based around understanding the probability density function for the normal distribution. The aim is to draw the learner into an understanding of the properties of pdfs without requiring too many complicated calculations: it uses and will reinforce ideas about functions, integration and areas and the use of tables to calculate the probabilities for standardised normal distributions. It will also suit self-motivated independent learners.
Possible approach
This question could sensibly be used once students are starting to learn about the use of normal distribution tables and standardised normal distributions. There is a lot of scope for numerical estimation of probabilities and the first part could be used to reinforce the fact that a probability density function tells us quite a lot about a distribution even without the need for complicated calculation. It will tie in nicely with other parts of the syllabus on numerical integration.
Key questions
What do you know about the area under a pdf?
What does the area under a pdf between two points mean?
How might we write down our probabilistic statements in terms of standardised normal variables?
Possible extension
Can learners find any of the areas enclosed by the lines in the diagrams, using normal distribution tables?
Can they find the points of intersection on the graph?
Possible support
Encourage learners to rely on their intuitive understanding of integration in terms of area. Alternatively, focus on the last two parts of the question as a discussion. If they can't come up with their own suggestions for a calculation, perhaps they might initially check the estimates of others?