Solution

154070

First name
Weida Liao
School
Churston Ferrers Grammar School
Age
17
Email address
u10liaow@cfgslive.com

What is the probability that someone else has the same number as you?

For someone to have a different number to me, they can have any of the 224 numbers other than the one I chose, regardless of whether anyone else (besides me) already has that number. Taking each of the 225 numbers to be equally likely, the probability that a given person has a different number to me is
P(person has different number to me) = 1 - P(person has same number as me) = 1 - 1/225 = 224/225

There are 29 other people. We raise the probability that one person has a different number to me to the power 29, because we need the first person (other than me) to have a number different to mine AND the second person to have a number different to mine AND the third, and so on, with each person's number independent of the others'. Therefore
P(someone else has same number as me) = 1 - P(everyone else has a different number to me)
= 1 - (224/225)^29
which is approximately 12.1%
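
As a check on the arithmetic, the expression above can be evaluated exactly. This is a minimal sketch, assuming each person's number is chosen uniformly and independently from the 225 possibilities; the variable names are my own and are not part of the problem.

```python
from fractions import Fraction

NUMBERS = 225  # size of the pool of possible numbers (from the problem)
OTHERS = 29    # people in the database other than me

# Probability that one particular person has a different number to me.
p_different = Fraction(NUMBERS - 1, NUMBERS)

# All 29 others must differ from me; independence lets us multiply.
p_all_different = p_different ** OTHERS

# Complement: at least one of the others shares my number.
p_someone_matches_me = 1 - p_all_different

print(float(p_someone_matches_me))  # roughly 0.12
```

Using `Fraction` keeps the value exact until the final conversion, so no rounding error builds up in the 29 multiplications.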

How likely do you think it is that there will be at least one match amongst the 30 people in the database?

Intuitively, it seems that because there is a fairly low chance that someone has the same number as me, the chance that there will be at least one match amongst 30 people is also low. However, this is not sound reasoning, as we will see.

What is this probability?

It is easier (although this is subjective) to find the probability that there are no matches in the database, and reason that
P(there is at least one match amongst the 30 people) = 1 - P(no matches amongst the 30 people)

When calculating the probability that there are no matches amongst the 30 people, notice that this time, if we consider the people in order, each person must avoid the numbers of all the people who came before. This is unlike the earlier calculation, where each other person only had to avoid my number. Thus
P(second person’s number does not match first person’s) = (225-1)/225 = 224/225,
P(third person’s number does not match first or second person’s) = (225-2)/225 = 223/225,
P(fourth person’s number does not match first, second or third person’s) = (225-3)/225 = 222/225,
and so on, up to
P(thirtieth person’s number does not match anyone else’s) = (225-29)/225 = 196/225

Therefore
P(no matches amongst the 30 people) = 224/225 * 223/225 * 222/225 * ... * 197/225 * 196/225
= (224*223*...*197*196)/225^29
= 224!/(225^29 * 195!)
(as there are 29 terms in the multiplication, and 224*223*...*196 = 224!/195!)

Therefore
P(at least one match amongst the 30 people) = 1 - 224!/(225^29 * 195!)
which is approximately 86.8%
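
The product and the factorial form can be checked against each other numerically. The following is a sketch under the same assumption of uniform, independent numbers; the helper `p_no_match` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction
from math import factorial

NUMBERS = 225  # possible numbers
PEOPLE = 30    # people in the database


def p_no_match(people: int, numbers: int) -> Fraction:
    """Probability that all `people` picks are distinct."""
    p = Fraction(1)
    # The k-th person after the first must avoid the k numbers already taken.
    for k in range(1, people):
        p *= Fraction(numbers - k, numbers)
    return p


p_product = p_no_match(PEOPLE, NUMBERS)

# Closed form from the working above: 224! / (225^29 * 195!)
p_closed = Fraction(factorial(224), 225**29 * factorial(195))
assert p_product == p_closed

p_at_least_one = 1 - p_product
print(round(float(p_at_least_one), 3))  # roughly 0.868
```

The loop runs over k = 1 to 29, which gives the 29 factors from 224/225 down to 196/225.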

Why is it so much more likely that two people will share the same number than someone sharing your number?

The probability that all 30 people have different numbers from one another is much smaller than the probability that everyone else has a different number to me. In the first case, each additional person removes one more number from the collection that the next person can have, so the factors in the product shrink from 224/225 down to 196/225. In the second case, each of the other 29 people can have any of the 224 numbers that I did not choose, so every factor stays at 224/225.
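
Another way to see the gap, which is not part of the working above, is to count pairs: only 29 pairs of people involve me, but there are 435 pairs in total amongst 30 people, and any one of them could be a match. The sketch below counts the pairs and estimates both probabilities by simulation. The seed, the number of trials and all the names are arbitrary choices for illustration, and the simulated frequencies are only estimates.

```python
import random
from math import comb

NUMBERS = 225    # possible numbers
PEOPLE = 30      # people in the database (person 0 plays the role of "me")
TRIALS = 20_000  # arbitrary number of simulated databases

pairs_with_me = PEOPLE - 1   # pairs that include me
all_pairs = comb(PEOPLE, 2)  # every possible pair of people

rng = random.Random(0)  # fixed seed so the run is repeatable
match_with_me = 0
any_match = 0
for _ in range(TRIALS):
    picks = [rng.randrange(NUMBERS) for _ in range(PEOPLE)]
    # Does anyone else share my number?
    if picks[0] in picks[1:]:
        match_with_me += 1
    # Is there any repeated number at all?
    if len(set(picks)) < PEOPLE:
        any_match += 1

freq_match_with_me = match_with_me / TRIALS
freq_any_match = any_match / TRIALS

print(pairs_with_me, all_pairs)            # 29 435
print(freq_match_with_me, freq_any_match)  # expected to be near 0.12 and 0.87
```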

Does this help to explain why so many pairs were found in the Arizona database?

Yes. While the chance that someone has the same DNA profile as a particular sample is very low, the chance that at least one pair of people in the database share a DNA profile is much higher. This second chance is 1 - P(everyone in the database has a different DNA profile).