Chris Tynan
Posted on Saturday, 18 January, 2003 - 12:14 pm:

I've just finished S1 in class and my teacher was explaining the normal distribution last week.

Whilst most people were unable to follow the theory, I understood most of it, except when he casually remarked that the distance between μ and the point of inflexion on the distribution curve is approximately σ.
I asked him about this later and he provided no real explanation.

Is it coincidence or is there a mathematical explanation/proof?

Thanks

Chris

David Loeffler
Posted on Saturday, 18 January, 2003 - 01:49 pm:

Never mind 'approximately' - this is precisely true. Have a go at proving it - can you see why it suffices to consider the basic N(0,1) distribution?

But I can't think of any deep reason why it should be so. The higher derivatives of the density function have no particular significance in probability theory as far as I'm aware.

David

Andre Rzym
Posted on Saturday, 18 January, 2003 - 01:59 pm:

If I'm not mistaken, it's exactly σ. You can get this by differentiating the density function twice.

Andre

Paul Smith
Posted on Saturday, 18 January, 2003 - 02:07 pm:

This is assuming you know the pdf of the normal distribution:

\phi(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right], which simplifies to \phi(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right) for N(0,1).

Paul

Chris Tynan
Posted on Sunday, 19 January, 2003 - 12:33 pm:

I'm sorry, but I still don't understand why it should be so.

I differentiated the expression Paul gave twice, but I couldn't see anything significant about the result.

Also, I thought \phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt

Thanks,

Chris

Paul Smith
Posted on Sunday, 19 January, 2003 - 12:46 pm:

Chris, the integral you give is actually \Phi(x), the cumulative distribution function. We're only concerned with its derivative, the probability density function \phi(x). For a point of inflexion the second derivative (of the pdf) must be zero (and some other conditions must be met, which I shall casually ignore). When you set the second derivative to zero you should find that \left(\frac{x-\mu}{\sigma}\right)^2 = 1, which is equivalent to x - \mu = \pm\sigma.

Paul

Chris Tynan
Posted on Sunday, 19 January, 2003 - 12:51 pm:

In that case, I've obviously differentiated wrongly. Could you please go through the two derivatives?

Thanks Paul

Chris

Paul Smith
Posted on Sunday, 19 January, 2003 - 01:41 pm:

Certainly.

[See below]

Let us know if you need the derivatives broken down further.

Paul

Paul Smith
Posted on Sunday, 19 January, 2003 - 01:50 pm:

[Image: Derivative]
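
In case the image doesn't come through, here is a sketch of the same working in text. Starting from

\phi(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right],

the chain rule gives

\phi'(x) = -\frac{x-\mu}{\sigma^2}\,\phi(x),

and then the product rule gives

\phi''(x) = \left[\frac{(x-\mu)^2}{\sigma^4} - \frac{1}{\sigma^2}\right]\phi(x) = \frac{(x-\mu)^2 - \sigma^2}{\sigma^4}\,\phi(x).

Since \phi(x) > 0 everywhere, \phi''(x) = 0 exactly when (x-\mu)^2 = \sigma^2, i.e. x = \mu \pm \sigma; and \phi'' changes sign there, so these really are points of inflexion.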

Paul

Chris Tynan
Posted on Sunday, 19 January, 2003 - 01:57 pm:

Thanks Paul, I get it now.

Chris

David Loeffler
Posted on Sunday, 19 January, 2003 - 03:14 pm:

Just a note: as I hinted above, you can save some of this work by proving it just for the standard normal distribution, with μ=0 and σ=1. Since every normal distribution is obtained from this one by translation and scaling, the result then holds in all cases; and because the algebra is much simpler, you are far less likely to make errors in your working.
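
Concretely, writing \phi for the N(0,1) density (which has inflexions at \pm 1), the N(\mu,\sigma^2) density is

f(x) = \frac{1}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right), \quad\text{so}\quad f''(x) = \frac{1}{\sigma^3}\,\phi''\!\left(\frac{x-\mu}{\sigma}\right),

which vanishes exactly when (x-\mu)/\sigma = \pm 1, i.e. at x = \mu \pm \sigma.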

David

Paul Smith
Posted on Sunday, 19 January, 2003 - 04:56 pm:

Yes, you're quite right, David; sorry - I missed that!

Paul

Andre Rzym
Posted on Monday, 20 January, 2003 - 09:16 am:

If you are looking for 'interesting features' of the normal density function, here's another:

Suppose we have a probability density function p(x). Clearly we require

\int_{-\infty}^{\infty} p(x)\,dx = 1

Now impose the additional constraints that the mean and standard deviation of x are m and v.

Clearly there are infinitely many functions that meet these criteria. Now it turns out that for any probability distribution, it is possible to define a measure of its 'disorder' (also known as the entropy of the distribution) by

S = -\int p(x)\ln p(x)\,dx

The higher the entropy, the less 'structure' the distribution has.
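
For example, a direct calculation with the normal density of mean m and standard deviation v gives

S = -\int p(x)\ln p(x)\,dx = \frac{1}{2}\ln(2\pi e v^2),

which depends on v but not on m - shifting a distribution along the axis doesn't change its disorder.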

Now suppose we solve for the p(x) that maximises S, subject to the constraints of normalisation, mean m, and standard deviation v.

Guess what distribution pops out? It's a normal distribution!
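
In outline, the maximisation goes by Lagrange multipliers: introduce one multiplier per constraint and look for a stationary point of

-\int p\ln p\,dx - \lambda_0\!\left(\int p\,dx - 1\right) - \lambda_1\!\left(\int x\,p\,dx - m\right) - \lambda_2\!\left(\int x^2\,p\,dx - (v^2 + m^2)\right).

Varying with respect to p gives -\ln p(x) - 1 - \lambda_0 - \lambda_1 x - \lambda_2 x^2 = 0, so

p(x) = \exp(-1 - \lambda_0 - \lambda_1 x - \lambda_2 x^2),

which is a Gaussian whenever \lambda_2 > 0, and the three constraints then pin it down to exactly the N(m, v^2) density.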

What this is saying is that if someone told you the mean and variance of a distribution p (there is actually a small detail about translational symmetry as well, but let's ignore that) and nothing else, and you had to make an assumption about the shape of the distribution, then you ought to assume a normal distribution for p. Anything else imposes additional structure on the distribution for which you have no evidence.

Andre