Formula for a probability distribution


By Brad Rodgers (P1930) on Saturday, June 9, 2001 - 01:02 am :

What is the formula for the probability that, out of n flips of a coin, exactly x tails will be flipped, assuming equal probability of heads and tails?

Thanks,

Brad


By Dan Goodman (Dfmg2) on Saturday, June 9, 2001 - 01:28 am :

Brad, I'll give you a hint. Firstly, it's no more difficult to solve this problem with the probability of a head being p and the probability of a tail being q (with p+q=1). Suppose n=5 and x=2. What is the probability of HHTTH? What about HTHHT? How many strings of length 5 consisting of H and T with two T's in them are there?

Alternatively, think about expanding $(p+q)^n$.

Did you get it?
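
[If you want to check Dan's counting hint by brute force, here is a quick Python sketch; the code and variable names are ours, not part of the thread. - The Editor]

    from itertools import product

    # Enumerate all strings of length 5 over {H, T} and count those
    # with exactly two T's.
    strings = ["".join(s) for s in product("HT", repeat=5)]
    two_tails = [s for s in strings if s.count("T") == 2]
    print(len(two_tails))   # 10 strings, e.g. HHTTH, HTHHT, ...
    # Each such string has probability p^3 * q^2 (three heads, two tails),
    # so P(exactly two tails in five tosses) = 10 * p^3 * q^2.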


By Brad Rodgers (P1930) on Sunday, June 10, 2001 - 12:54 am :

Ok, thanks. I think that the formula is (number of ways of placing x T's in n spaces)$/2^n = n!/(x!\,(n-x)!\,2^n)$. That should have been evident to me from the start, but for some reason I had a mental block of some kind. Anyway, I recently read that one can find a curve (what one would assume is the Bell curve) and integrate under that curve from x to y to find the probability that there will be anywhere from x to y heads flipped. What is this curve? The best I've gotten thus far deals with Psi, and doesn't fall exactly under the criteria I gave above.

Thanks very much,

Brad
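
[A quick numerical check of Brad's formula in Python; the example values are our choice. - The Editor]

    from itertools import product
    from math import comb

    n, x = 5, 2
    # The formula for a fair coin: C(n, x) / 2^n
    print(comb(n, x) / 2**n)   # 0.3125

    # Brute-force check over all 2^n equally likely outcome strings:
    count = sum(1 for s in product("HT", repeat=n) if s.count("T") == x)
    print(count / 2**n)        # 0.3125 again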


By Dan Goodman (Dfmg2) on Sunday, June 10, 2001 - 01:43 am :
Yup, it's a very common probability distribution, called (unsurprisingly) the binomial distribution. If X is the number of heads where a head has probability p and there are n tosses, then we say X is binomial with parameters n, p, or in short $X \sim B(n,p)$. Then $P(X=r) = {}^nC_r\, p^r q^{n-r}$, where ${}^nC_r = n!/(r!\,(n-r)!)$ and $q = 1-p$.
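
[Dan's formula translated directly into a Python sketch; the function name is ours. - The Editor]

    from math import comb

    def binomial_pmf(n, p, r):
        """P(X = r) for X ~ B(n, p), straight from the formula above."""
        q = 1 - p
        return comb(n, r) * p**r * q**(n - r)

    n, p = 10, 0.3
    print(binomial_pmf(n, p, 3))                              # P(X = 3)
    print(sum(binomial_pmf(n, p, r) for r in range(n + 1)))   # sums to 1.0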

Do you know what the normal distribution is? X is a normal random variable with mean $\mu$ and variance $\sigma^2$ if
$$P(a < X < b) = \int_a^b \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}\, dx,$$
in which case we write $X \sim N(\mu,\sigma^2)$. It turns out that for large values of np (np > 30 is usually enough I think, so for p=0.5 you need to be doing 60 tosses or more to get good results), the binomial distribution B(n,p) is closely approximated by the normal distribution N(np, np(1-p)). So the probability of between a and b heads is roughly


$$\int_a^b \frac{1}{\sqrt{2\pi\, np(1-p)}}\, e^{-(x-np)^2/(2np(1-p))}\, dx.$$

The approximation gets better the larger np is.
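
[To see the approximation at work, here is a rough numerical comparison in Python; the example numbers and the crude midpoint rule are our choices. - The Editor]

    from math import comb, exp, pi, sqrt

    n, p = 100, 0.5
    a, b = 45, 55
    var = n * p * (1 - p)          # variance of B(n, p)

    # Exact probability of between a and b heads (inclusive):
    exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, b + 1))

    # Normal approximation: midpoint-rule integral of the N(np, var) density.
    def density(x):
        return exp(-(x - n * p)**2 / (2 * var)) / sqrt(2 * pi * var)

    strips = 1000
    h = (b - a) / strips
    approx = sum(density(a + (i + 0.5) * h) for i in range(strips)) * h

    print(exact, approx)   # about 0.729 vs 0.683; integrating from a-0.5 to
                           # b+0.5 (a "continuity correction") closes the gap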

There are tables which give
$$\Phi(x) := \int_{-\infty}^x \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt$$
for various values of x. So you can compute P(a < X < b) using $P(a < X < b) = \Phi\big((b-np)/\sqrt{np(1-p)}\big) - \Phi\big((a-np)/\sqrt{np(1-p)}\big)$ (just use a change of variables).
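
[Rather than tables, Python's math module exposes the error function, from which Φ can be built; the example numbers below are ours. - The Editor]

    from math import comb, erf, sqrt

    def Phi(x):
        # Standard normal CDF via the error function:
        # Phi(x) = (1 + erf(x / sqrt(2))) / 2
        return 0.5 * (1 + erf(x / sqrt(2)))

    n, p = 100, 0.5
    a, b = 45, 55
    sd = sqrt(n * p * (1 - p))     # standard deviation of B(n, p)

    approx = Phi((b - n * p) / sd) - Phi((a - n * p) / sd)
    exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, b + 1))
    print(approx, exact)           # rough agreement, without continuity correction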

That's quite tricky stuff, especially if you haven't come across the normal distribution before, so let me know if you didn't follow.


By Dave Sheridan (Dms22) on Monday, June 11, 2001 - 02:32 pm :

To add to this, firstly you should never use the nCr notation - it's evil and will be frowned upon by all good probabilists. But that's by the by.
[A debate ensued about this - Dave arguing that in complicated equations ${}^nC_r$ can easily be mis-read, and others arguing that $\binom{n}{r}$ is rather harder to typeset. - The Editor]

Secondly, it's true that all probabilities can be worked out by integrating between a and b, but only if you know some rather complicated measure theory. For the normal distribution we have a continuous density function, so integrating it is simple (at least in theory) and easy to explain.

The binomial distribution is discrete, which means that it takes at most countably many values (in this case, the integers between 0 and n inclusive). So $P(X < n)$ is not equal to $P(X \le n)$. Integrals are continuous in their upper limit, so a distribution with jumps like this cannot have a continuous density. However, this does not mean you can't integrate something to get the distribution; it just means that densities are not the most general way to define integrals. In fact, a density simply means we take Lebesgue measure (i.e. dx) and reweight it in a continuous way. Clearly this can't be done for a discrete distribution. The "density" for a binomial can be thought of as the sum of n+1 delta functions, with weights equal to the relevant probabilities. This turns out to be the wrong way to look at things though - instead of considering a "density" we consider a "measure", i.e. we write "dB" instead of "dx" and define what integration with respect to the binomial distribution means.
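
[As a concrete illustration of Dave's point, "integrating f with respect to the binomial measure" comes out as a plain weighted sum in code; this sketch is ours, not Dave's. - The Editor]

    from math import comb

    def integrate_dB(f, n, p):
        """Integral of f with respect to the B(n, p) measure: a weighted
        sum of f(k) against the point masses P(X = k), k = 0..n."""
        return sum(f(k) * comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range(n + 1))

    # Integrating f(x) = x against dB recovers the mean np:
    print(integrate_dB(lambda k: k, 10, 0.5))   # 5.0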

If you're interested in this, I can explain more. However, it's getting off the original point somewhat...

-Dave


By Brad Rodgers (P1930) on Wednesday, June 13, 2001 - 02:07 am :

Sorry to take so long to write back, but what do mean and variance stand for?

Brad


By Dan Goodman (Dfmg2) on Wednesday, June 13, 2001 - 03:01 am :
Hi Brad.

The mean of a distribution is one of the "averages". If X takes value $x_i$ with probability $p_i$ then the mean or expectation of X is
$$E[X] = \sum_i x_i p_i.$$
So, for example, if X takes values $x_1$ to $x_n$ with equal probability then $E[X] = (x_1 + \cdots + x_n)/n$, which is just the average of n numbers that you've already come across. If X is a continuous random variable (like the normal distribution above) then $E[X] = \int x f(x)\, dx$, where $f(x)$ is the probability density function, i.e. $P(a < X < b) = \int_a^b f(x)\, dx$.

The variance of a random variable X is defined to be $\mathrm{Var}[X] = E[(X - E[X])^2]$, which can be expanded out to give $E[X^2] - E[X]^2$. The variance gives you an idea of how spread out the distribution is. A random variable with small variance will be concentrated around the mean, whereas a random variable with large variance will be much more spread out.
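
[A quick check, in the same spirit, that the two expressions for the variance agree - the die example is ours. - The Editor]

    # A fair die again: compare the two expressions for the variance.
    values = [1, 2, 3, 4, 5, 6]
    probs = [1 / 6] * 6

    E = lambda g: sum(g(x) * p for x, p in zip(values, probs))
    mean = E(lambda x: x)

    var_def = E(lambda x: (x - mean)**2)     # E[(X - E[X])^2]
    var_alt = E(lambda x: x**2) - mean**2    # E[X^2] - E[X]^2
    print(var_def, var_alt)                  # both 2.9166...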

There are various basic properties of the mean and variance, such as $E[aX+b] = aE[X]+b$ and $\mathrm{Var}[aX+b] = a^2\,\mathrm{Var}[X]$, $E[X+Y] = E[X]+E[Y]$, and, if X and Y are independent, $E[XY] = E[X]E[Y]$.
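
[These properties are easy to spot-check by simulation; the following sketch uses parameters we picked, not anything from the thread. - The Editor]

    import random

    random.seed(0)
    N = 100_000
    xs = [random.random() for _ in range(N)]   # X ~ Uniform(0, 1)
    ys = [random.random() for _ in range(N)]   # Y independent of X

    mean = lambda zs: sum(zs) / len(zs)
    a, b = 3.0, 2.0

    print(mean([a * x + b for x in xs]), a * mean(xs) + b)   # E[aX+b] = aE[X]+b
    print(mean([x + y for x, y in zip(xs, ys)]),
          mean(xs) + mean(ys))                               # E[X+Y] = E[X]+E[Y]
    print(mean([x * y for x, y in zip(xs, ys)]),
          mean(xs) * mean(ys))                               # E[XY] ~ E[X]E[Y]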

Have you done any probability or stats? If you can give us an idea of what you know about probability and stats that would be helpful.


By Brad Rodgers (P1930) on Wednesday, June 13, 2001 - 05:20 am :

Actually, I have had very little probability thus far, the only exception being a flawed chapter in geometry on the probability of areas. I do have a book on the subject, but it's still quite advanced. The above post answers all of the questions I had, though.

Thanks,

Brad


By Dave Sheridan (Dms22) on Wednesday, June 13, 2001 - 10:59 am :

Something which you might find useful is the law of large numbers. This says that if the mean is finite, then the average of a large number of independent trials tends to the mean.

For example, if I toss a coin repeatedly and work out the proportion of heads obtained, it ends up pretty close to a half. I can't tell how long it'll take until I stay within 0.001 of a half, but I know that eventually the average won't stray outside the range 0.499-0.501.
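
[You can watch the law of large numbers take hold with a few lines of Python; the seed and checkpoints are our choices. - The Editor]

    import random

    random.seed(1)
    heads = 0
    for i in range(1, 100_001):
        heads += random.random() < 0.5      # one fair toss; True counts as 1
        if i in (10, 100, 1_000, 10_000, 100_000):
            print(i, heads / i)             # running proportion drifts toward 0.5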

This idea might help you to see intuitively what means represent. The other thing you should know is that although E is linear (as Dan pointed out), it does not commute with other transformations. For example, $E(X^2) \ge (E(X))^2$, with equality only when X is a constant. Thus variance is always nonnegative. In general, though, we can't compare $E(f(X))$ with $f(E(X))$ unless f is convex (Jensen's inequality) or there's some other neat property of f.

-Dave
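
[And a final simulation check of Dave's inequality E(X^2) >= (E(X))^2, with our own example distribution. - The Editor]

    import random

    random.seed(2)
    xs = [random.random() for _ in range(100_000)]   # X ~ Uniform(0, 1)

    m = sum(xs) / len(xs)
    m2 = sum(x * x for x in xs) / len(xs)
    print(m2, m * m)   # E[X^2] ~ 1/3 exceeds (E[X])^2 ~ 1/4, as claimed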