Bayes' Theorem


By Olof Sisask on Tuesday, June 19, 2001 - 04:19 pm :

Hi,

Could someone explain Bayes' theorem to me? I've seen it mentioned in a couple of STEP-like questions.

Thanks,
Olof


By Narendra Pathmanathan on Tuesday, June 19, 2001 - 06:33 pm :

Bayes' theorem is a direct consequence of the definition of conditional probability, which is:

P(A|B)=P(AnB)/P(B)
(AnB is A intersection B).

So from this it is clear that
P(AnB)=P(A|B) x P(B) and P(BnA)=P(B|A) x P(A).

Since P(AnB)=P(BnA),
P(A|B) x P(B)=P(B|A) x P(A), so dividing through by P(B) gives
P(A|B)=(P(B|A) x P(A))/P(B),
which is the statement of Bayes' theorem.
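
As a sanity check, here's a minimal Python sketch of the theorem in action, with made-up numbers for a hypothetical diagnostic test (A = "has the condition", B = "tests positive"; all the figures are assumptions for illustration):

```python
# Minimal sketch: Bayes' theorem with made-up numbers.
# Hypothetical scenario: A = "has the condition", B = "tests positive".
p_A = 0.01             # P(A), assumed prior
p_B_given_A = 0.95     # P(B|A), assumed
p_B_given_notA = 0.05  # P(B|A'), assumed false-positive rate

# Law of total probability: P(B) = P(B|A)P(A) + P(B|A')P(A')
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

# Bayes' theorem: P(A|B) = P(B|A) x P(A) / P(B)
p_A_given_B = p_B_given_A * p_A / p_B
print(p_A_given_B)  # roughly 0.161
```

Note that the answer, about 0.16, is much smaller than P(B|A) = 0.95; converting one of these into the other is exactly what Bayes' theorem does.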


By Olof Sisask on Tuesday, June 19, 2001 - 07:44 pm :

Sorry, what does P(A|B) mean?

Olof


By William Astle on Tuesday, June 19, 2001 - 08:21 pm :

Allowing for a little intuition in the definition of 'probability:'

- For a general set of events S, P(S) is 'the probability that an event in S will occur.' So P(AnB) is 'the probability that an event in both A and B will occur'.

- P(A|B) is the probability that A happens given that B happens. P(A|B)=P(AnB)/P(B) is the definition of the conditional probability of A given B. This is a sort of renormalisation. We have taken all the probability mass not due to B and ignored it (this is what the numerator in the definition 'does') and renormalised the result according to B (this is what the denominator 'does').

As you would expect, you then have:

P(B|B)=1.
P(A|B)=0 if A and B are disjoint.
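
To make the renormalisation picture concrete, here's a small Python sketch that computes conditional probabilities by counting (illustrative names; a two-dice sample space with all 36 outcomes assumed equally likely):

```python
from fractions import Fraction

# Sketch: conditional probability by counting, on a two-dice sample
# space with all 36 outcomes equally likely (illustrative names).
omega = {(i, j) for i in range(1, 7) for j in range(1, 7)}

def P(event):
    return Fraction(len(event), len(omega))

def P_given(A, B):
    # P(A|B) = P(AnB)/P(B): keep only the mass inside B, then renormalise.
    return P(A & B) / P(B)

A = {(i, j) for (i, j) in omega if i + j == 8}  # dice sum to 8
B = {(i, j) for (i, j) in omega if i == 6}      # first die shows 6
C = {(i, j) for (i, j) in omega if i + j == 2}  # sum is 2, disjoint from B

print(P_given(A, B))  # 1/6: given a first 6, we need the second die to be 2
print(P_given(B, B))  # 1
print(P_given(C, B))  # 0
```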


By Olof Sisask on Tuesday, June 19, 2001 - 09:33 pm :

I see. I haven't done much probability work in the past, as you may have noticed :-).
Just out of curiosity, how are the laws of probability rigorously proved? Or are they axioms?

Cheers,
Olof


By Arun Iyer on Wednesday, June 20, 2001 - 07:48 pm :

All of these proofs are based on the main definition of probability:

P(Event) = (number of favourable cases)/(total number of cases)


By Kerwin Hui on Wednesday, June 20, 2001 - 08:06 pm :

Not quite... There are three axioms of probability.

Given a sample space, Ω, an event ω is a subset of Ω. Then a probability P is a function from events to ℝ such that

Axiom I: 0 ≤ P(ω) for every event ω

Axiom II: P(Ω) = 1

Axiom III: If A1, A2, ... are mutually exclusive events, i.e. Ai ∩ Aj = ∅ for i ≠ j, then P(∪ Ai) = Σ P(Ai), where the union and sum run over i = 1, 2, ...
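
As an illustration (this checks one tiny case and proves nothing in general; the names are my own), here's a Python sketch of the axioms on a finite sample space with equally likely outcomes:

```python
from fractions import Fraction
from itertools import product

# Illustration (not a proof): checking the axioms on one small finite
# sample space, two fair coins, with all outcomes equally likely.
omega = set(product('HT', repeat=2))

def P(event):
    return Fraction(len(event), len(omega))

# Axiom I: probabilities are non-negative
assert all(P({w}) >= 0 for w in omega)

# Axiom II: the whole sample space has probability 1
assert P(omega) == 1

# Axiom III (finite special case): disjoint events add
A = {('H', 'H')}
B = {('T', 'T')}
assert A & B == set() and P(A | B) == P(A) + P(B)
```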

Kerwin


By Michael Doré on Wednesday, June 20, 2001 - 08:08 pm :

A sample space, incidentally, is the set of possible outcomes of an experiment.


By Olof Sisask on Wednesday, June 20, 2001 - 11:53 pm :

Thanks. With the third axiom - what does the very last bit represent? [ P(∪ Ai) = Σ P(Ai) ]. I take it that it is this axiom that allows you to say such things as "The probability of A occurring followed by B, where A and B are independent, is P(A) x P(B)"? Although this might just be based on logic? Come to think of it, I'm not quite sure how far axioms have to stretch.

Thanks,
Olof


By James Lingard on Thursday, June 21, 2001 - 12:22 am :

Olof,

The last bit just says that the probability of a countable union of mutually exclusive events is the sum of the probabilities of the individual events, which is what you said.

The axioms are supposed to cover even things which are 'intuitive'. Everything in probability should be a consequence of the axioms; you shouldn't have to rely on 'logic', as you put it. In fact, there's no reason at all why you should have to think of the axioms as being related to our intuitive feeling of what probability is -- in one (not particularly helpful) sense they just define abstract mathematical objects, and you should be able to reason about them in a purely abstract way.

James.


By Michael Doré on Thursday, June 21, 2001 - 12:58 am :

"The probability of A occuring followed by B, where A and B are independent, is P(A) x P(B)"

This is not one of the axioms. It is actually the definition of the term "independent events". Under this definition all events which are causally disconnected are independent, but the converse isn't true.

Take, for example, two fair coins. Let's say that you "win" if and only if the coins show different faces. OK, now consider the two events:

1) The second coin shows a head.
2) You win.

Now events 1 and 2 are clearly causally connected (in fact 1 determines whether or not 2 occurs) and yet 1 and 2 are independent since P(1) = P(2) = 1/2 and P(1 and 2) = 1/4.
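
A quick Python enumeration (illustrative code, not essential to the argument) confirms those numbers:

```python
from fractions import Fraction
from itertools import product

# Enumerating the two-coin example (illustrative names).
omega = set(product('HT', repeat=2))
P = lambda e: Fraction(len(e), len(omega))

event1 = {(a, b) for (a, b) in omega if b == 'H'}  # second coin is a head
event2 = {(a, b) for (a, b) in omega if a != b}    # you win: faces differ

print(P(event1), P(event2))  # 1/2 1/2
print(P(event1 & event2))    # 1/4 = P(event1) x P(event2), so independent
```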


By James Lingard on Thursday, June 21, 2001 - 02:02 am :

Yes, sorry, wasn't thinking.

James.


By Olof Sisask on Thursday, June 21, 2001 - 02:40 pm :

I'm not really up to date with all my probabilities :). I think I was thinking of independent, rather than mutually exclusive, as you said, Michael.
Basically, I'm trying to see how you deduce the rule P(A) x P(B) for P(A and B).

Thanks,
Olof


By Michael Doré on Thursday, June 21, 2001 - 03:51 pm :

Ah, but this is the point. You can't. It is not always true. For example, suppose you toss two coins. Let A be the event that the first coin is a head and B be the event that both coins are heads.

Then P(A and B) = P(both coins head) = 1/4.
But P(A) = 1/2 and P(B) = 1/4 so P(A) x P(B) = 1/8.

Hence P(A and B) = P(A) x P(B) is violated.

Events for which P(A and B) = P(A) x P(B) holds are defined as independent; so in the above example events A and B are not independent.
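
Again, a quick Python enumeration (illustrative only) confirms this:

```python
from fractions import Fraction
from itertools import product

# Enumerating the counterexample (illustrative names).
omega = set(product('HT', repeat=2))
P = lambda e: Fraction(len(e), len(omega))

A = {(a, b) for (a, b) in omega if a == 'H'}  # first coin is a head
B = {('H', 'H')}                              # both coins are heads

print(P(A & B))     # 1/4
print(P(A) * P(B))  # 1/8, not equal, so A and B are not independent
```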

What you may be wondering is how to show that two events which are causally disconnected are independent (i.e. show they satisfy the independence relation). This is quite a tricky topic since "causally disconnected" is difficult to define.


By Olof Sisask on Thursday, June 21, 2001 - 05:33 pm :

Yeah, sorry, I should have added 'where A and B are independent' at the end. But as this seems to be the definition of two independent events, it's a bit of an empty question really :).
I think I'll just leave the question for a later time, rather than going into it now during the exam period. The annoying thing is that it's always during exam periods that I start wondering how rigorously I can show something [which is why I started wondering about P(A and B) = P(A) x P(B) for independent events].

Thanks anyway!
Olof